Domain-Specific Voice Flows: Building the Guardrails

Param Harrison

Voice data is messy. Users mumble, interrupt, go off-topic, and answer questions you haven't asked yet.

If you are building a "chatty" companion, this is fine. But if you are building a high-stakes agent for Lead Qualification, Medical Triage, or 911 Dispatch, you cannot let the LLM just "chat." You need it to follow a strict business process.

This post is for engineers who need to constrain a voice agent to a rigid flowchart without making it sound like a 1990s IVR menu ("Press 1 for Sales"). We will explore how to build Domain-Specific Flows using State Machines and Slot Filling.

The problem: The "Wandering" agent

Let's look at a Sales Qualification bot. Its goal is to get the user's Name, Company, and Budget.

The Scenario:

  • Agent: "What is your budget for this project?"
  • User: "Well, honestly, it depends on the weather! I'd love to go sailing today."
  • Agent (The Failure): "Oh, I love sailing! The wind is perfect in San Francisco right now. Do you have a boat?"

Why this is bad: The agent followed the user off a cliff. It forgot its business goal (getting the budget) to pursue a "helpful" conversation. We need to lock it down.

The solution: The structured state machine

We stop treating the conversation as a "chat" and start treating it as a Form Filling exercise.

We define a State Machine where the agent cannot proceed to "Step B" until "Step A" is satisfied.

graph TD
    A[Start] --> B{Has Name?}
    B -- "No" --> C[State: ASK_NAME Prompt: Who am I speaking with?]
    B -- "Yes" --> D{Has Company?}
    D -- "No" --> E[State: ASK_COMPANY Prompt: What company are you with?]
    D -- "Yes" --> F{Has Budget?}
    F -- "No" --> G[State: ASK_BUDGET Prompt: What is your budget?]
    F -- "Yes" --> H[State: QUALIFIED Action: Transfer to Sales]
    
    style C fill:#e3f2fd,stroke:#0d47a1
    style E fill:#e8f5e9,stroke:#388e3c
    style G fill:#fff3e0,stroke:#e65100

The Engineering Insight: The LLM doesn't decide what to ask. The State Machine decides what is missing, and the LLM's only job is to phrase the question naturally.
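
The flowchart above can be sketched as a small, deterministic function. This is a minimal illustration rather than a full implementation: the `State` enum, `SLOT_ORDER`, and `next_state` are hypothetical names, and the slots are assumed to live in a plain dict.

```python
from enum import Enum


class State(Enum):
    ASK_NAME = "Who am I speaking with?"
    ASK_COMPANY = "What company are you with?"
    ASK_BUDGET = "What is your budget?"
    QUALIFIED = "Transfer to Sales"


# Slots in the order the flowchart checks them, paired with the state
# that asks for each one.
SLOT_ORDER = [
    ("name", State.ASK_NAME),
    ("company", State.ASK_COMPANY),
    ("budget", State.ASK_BUDGET),
]


def next_state(slots: dict) -> State:
    """Return the state for the first slot that is still missing."""
    for slot_name, state in SLOT_ORDER:
        if slots.get(slot_name) is None:
            return state
    return State.QUALIFIED


# The code, not the LLM, picks the next question.
state = next_state({"name": "Ada"})
print(state.name, "->", state.value)
```

Because the decision is plain code, it can be unit tested without calling a model. The LLM only receives the chosen state and rephrases its prompt.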

The "How": Pydantic slot filling

We don't need complex graph code for simple flows. We can use Pydantic to define the "slots" we need to fill, and a System Prompt that enforces the logic.

Step 1: Define the data structure

from pydantic import BaseModel, Field
from typing import Optional

class CustomerProfile(BaseModel):
    name: Optional[str] = Field(None, description="Customer's full name")
    company: Optional[str] = Field(None, description="Company name")
    # We use 'int' to force the LLM to extract a number, not text
    budget: Optional[int] = Field(None, description="Budget in USD")

Step 2: The "Update" tool

We give the LLM a tool to update this profile. This allows us to track progress.

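# Global state for this call
current_profile = CustomerProfile()

def update_profile(name: Optional[str] = None, company: Optional[str] = None, budget: Optional[int] = None) -> str:
    """Call this tool IMMEDIATELY when the user provides new information."""
    global current_profile
    # Compare against None so that falsy values such as budget=0 are still stored.
    # Rebuilding the model re-runs validation; plain attribute assignment on a
    # Pydantic model is not validated by default.
    current_profile = CustomerProfile(
        name=name if name is not None else current_profile.name,
        company=company if company is not None else current_profile.company,
        budget=budget if budget is not None else current_profile.budget,
    )
    return f"Profile updated. Current state: {current_profile}"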

Step 3: The guardrailed prompt

We dynamically inject the missing fields into the prompt every turn.

SYSTEM_PROMPT_TEMPLATE = """
You are a strict Sales Qualification Agent.
Your Goal: Fill the missing fields in the Customer Profile.

CURRENT PROFILE STATE:
{current_profile}

RULES:
1. Look at the 'None' fields above.
2. Ask for the *first* missing field. Do not skip ahead.
3. If the user goes off-topic (e.g., talks about sailing), politely acknowledge it
   but IMMEDIATELY pivot back to the missing field.
4. Do not end the call until all fields are filled.
"""

# Re-render at the start of every turn so the model always sees the latest state
system_prompt = SYSTEM_PROMPT_TEMPLATE.format(current_profile=current_profile)

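A variation on the prompt above is to compute the first missing field in code and name it explicitly, instead of asking the model to find the 'None' fields itself. The sketch below is one possible way to do that. `REQUIRED_SLOTS`, `PROMPT_TEMPLATE`, and `build_system_prompt` are hypothetical names, and the slots are assumed to arrive as a plain mapping.

```python
from typing import Mapping, Optional

# Order matters: the agent asks for slots in this order.
REQUIRED_SLOTS = ("name", "company", "budget")

PROMPT_TEMPLATE = """You are a strict Sales Qualification Agent.

CURRENT PROFILE STATE:
{state}

NEXT ACTION:
{instruction}
"""


def build_system_prompt(slots: Mapping[str, Optional[object]]) -> str:
    """Render the system prompt for one turn from the slots collected so far."""
    missing = [s for s in REQUIRED_SLOTS if slots.get(s) is None]
    state = "\n".join(f"- {s}: {slots.get(s)}" for s in REQUIRED_SLOTS)
    if missing:
        instruction = f"Ask for '{missing[0]}' now. Do not ask about anything else."
    else:
        instruction = "All fields are filled. Confirm the details and transfer to Sales."
    return PROMPT_TEMPLATE.format(state=state, instruction=instruction)


# Call this once per turn, right before sending the request to the LLM.
print(build_system_prompt({"name": "Ada", "company": None, "budget": None}))
```

Naming the next field in code leaves less room for the model to skip ahead, at the cost of a slightly more rigid prompt.
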
Handling "Soft" errors (Voice UX)

In text forms, if you type "enough" into a Number field, it turns red.

In voice, the user says: "My budget is... well, we have enough money."

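This isn't an integer, so the update_profile tool should reject it. That only happens if the tool actually runs Pydantic validation: constructing a CustomerProfile validates its fields, while plain attribute assignment does not by default.

We must handle this gracefully.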

The Pattern:

  1. LLM: Calls update_profile(budget="enough").
  2. Code: Pydantic raises ValidationError.
  3. System: Catches the error and returns a string: "Error: Budget must be a valid integer."
  4. LLM (Reacting): Sees the error and self-corrects.
    • Agent: "I understand you have budget, but I need a specific number for our records. Could you give me a rough estimate in dollars?"

This creates a robust loop where the "Code" enforces business logic (integers only), but the "LLM" handles the social awkwardness of asking again.
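
The catch-and-return step in the pattern above could look roughly like this. It is a sketch under assumptions, not the definitive tool: `safe_update_profile` and the `collected` dict are hypothetical names, and the exact wording of Pydantic's error messages differs between versions.

```python
from typing import Any, Dict, Optional

from pydantic import BaseModel, ValidationError


class CustomerProfile(BaseModel):
    name: Optional[str] = None
    company: Optional[str] = None
    budget: Optional[int] = None


# Slots collected so far in this call (hypothetical per-call state).
collected: Dict[str, Any] = {}


def safe_update_profile(name: Any = None, company: Any = None, budget: Any = None) -> str:
    """Validate the merged slots and return a message the LLM can react to."""
    new_values = {"name": name, "company": company, "budget": budget}
    candidate = {**collected, **{k: v for k, v in new_values.items() if v is not None}}
    try:
        profile = CustomerProfile(**candidate)  # raises ValidationError on bad input
    except ValidationError as exc:
        # Report which field failed; the stored state is left untouched.
        details = "; ".join(f"{err['loc'][0]}: {err['msg']}" for err in exc.errors())
        return f"Error: {details}. Ask the user again."
    # Store the validated values, not the raw arguments.
    collected.update({k: getattr(profile, k) for k in candidate})
    return f"Success. Current state: {collected}"


print(safe_update_profile(budget="enough"))  # returns an error string, no exception
print(safe_update_profile(budget=50000))     # stores the budget
```

The tool never raises into the agent loop. A failed update comes back as an ordinary tool result, which is what lets the model apologize and re-ask in its own words.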

sequenceDiagram
    participant User
    participant LLM
    participant Tool_Runtime
    
    User->>LLM: "We have enough money."
    LLM->>Tool_Runtime: Call update_profile(budget="enough")
    
    Note right of Tool_Runtime: Validation Error!<br/>'enough' is not an int.
    
    Tool_Runtime-->>LLM: Result: "Error: Budget must be a valid integer."
    
    LLM->>User: "I need a specific number. Can you estimate it?"
    User->>LLM: "About 50k."
    LLM->>Tool_Runtime: Call update_profile(budget=50000)
    Tool_Runtime-->>LLM: Result: "Success."

Summary: Why guardrails matter

| Approach | Flexibility | Safety | Best For |
|---|---|---|---|
| Free-Form Chat | High | Low | Companions, General Q&A |
| State Machine | Low | High | High-Stakes (911, Medical, Legal) |
| Hybrid (Guided Chat) | Medium | Medium | Support, Sales Qualification |

Challenge for you

Scenario: You are building a 911 Dispatch Voice Bot.

The Rules:

  1. Safety Critical: You MUST get the Address before asking anything else (even before asking what the emergency is).
  2. Context Switching: If the user screams "My house is on fire!", the bot must acknowledge the fire but still pivot back to "Where is the house?" immediately.

Your Task:

  1. How do you structure the System Prompt to enforce this "Address First" rule?
  2. What happens if you use a standard "Chat" prompt? (Hint: The bot will likely say "Oh no, get out of the house!" and forget to ask for the address).
  3. Write the "State Check" logic that injects <INSTRUCTION>ASK FOR ADDRESS NOW</INSTRUCTION> into the prompt if the address slot is empty.

Key takeaways

  • State machines enforce business logic: By defining required fields and their order, you prevent agents from wandering off-topic
  • Pydantic provides validation: Using structured models ensures data quality while allowing graceful error handling
  • Slot filling tracks progress: Tools that update a shared state object let you track which information has been collected
  • Dynamic prompts guide behavior: Injecting missing fields into the system prompt every turn keeps the agent focused
  • Error handling enables self-correction: When validation fails, returning clear error messages lets the LLM ask again naturally
  • Guardrails prevent dangerous failures: In high-stakes scenarios, strict control flow is more important than conversational flexibility
  • Voice UX requires graceful recovery: Unlike text forms, voice agents must handle invalid inputs with natural language, not red error messages

For more on voice AI systems, see our voice AI fundamentals guide, our multi-agent voice systems guide, and our workflow orchestration guide.


For more on building production AI systems, check out our AI Bootcamp for Software Engineers.
