Domain-Specific Voice Flows: Building the Guardrails
Voice data is messy. Users mumble, interrupt, go off-topic, and answer questions you haven't asked yet.
If you are building a "chatty" companion, this is fine. But if you are building a high-stakes agent for Lead Qualification, Medical Triage, or 911 Dispatch, you cannot let the LLM just "chat." You need it to follow a strict business process.
This post is for engineers who need to constrain a voice agent to a rigid flowchart without making it sound like a 1990s IVR menu ("Press 1 for Sales"). We will explore how to build Domain-Specific Flows using State Machines and Slot Filling.
The problem: The "Wandering" agent
Let's look at a Sales Qualification bot. Its goal is to get the user's Name, Company, and Budget.
The Scenario:
- Agent: "What is your budget for this project?"
- User: "Well, honestly, it depends on the weather! I'd love to go sailing today."
- Agent (The Failure): "Oh, I love sailing! The wind is perfect in San Francisco right now. Do you have a boat?"
Why this is bad: The agent followed the user off a cliff. It forgot its business goal (getting the budget) to pursue a "helpful" conversation. We need to lock it down.
The solution: The structured state machine
We stop treating the conversation as a "chat" and start treating it as a Form Filling exercise.
We define a State Machine where the agent cannot proceed to "Step B" until "Step A" is satisfied.
```mermaid
graph TD
    A[Start] --> B{Has Name?}
    B -- "No" --> C["State: ASK_NAME<br/>Prompt: 'Who am I speaking with?'"]
    B -- "Yes" --> D{Has Company?}
    D -- "No" --> E["State: ASK_COMPANY<br/>Prompt: 'What company are you with?'"]
    D -- "Yes" --> F{Has Budget?}
    F -- "No" --> G["State: ASK_BUDGET<br/>Prompt: 'What is your budget?'"]
    F -- "Yes" --> H["State: QUALIFIED<br/>Action: Transfer to Sales"]
    style C fill:#e3f2fd,stroke:#0d47a1
    style E fill:#e8f5e9,stroke:#388e3c
    style G fill:#fff3e0,stroke:#e65100
```
The Engineering Insight: The LLM doesn't decide what to ask. The State Machine decides what is missing, and the LLM's only job is to phrase the question naturally.
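To make that split concrete, here is a minimal sketch of the "code decides, LLM phrases" pattern. The slot names and hint strings are illustrative, not a fixed API: the function returns an instruction for the first unfilled slot, and the LLM's only job is to turn that instruction into a natural-sounding question.

```python
from typing import Optional

# Order matters: this IS the state machine's priority.
REQUIRED_SLOTS = ["name", "company", "budget"]

SLOT_HINTS = {
    "name": "Ask who you are speaking with.",
    "company": "Ask what company they are with.",
    "budget": "Ask for their budget in USD.",
}

def next_instruction(profile: dict) -> Optional[str]:
    """Return the hint for the first unfilled slot, or None when qualified."""
    for slot in REQUIRED_SLOTS:
        if profile.get(slot) is None:
            return SLOT_HINTS[slot]
    return None  # All slots filled: transfer to sales.

# The user gave their name but nothing else, so company comes next.
print(next_instruction({"name": "Ada", "company": None, "budget": None}))
# -> Ask what company they are with.
```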
The "How": Pydantic slot filling
We don't need complex graph code for simple flows. We can use Pydantic to define the "slots" we need to fill, and a System Prompt that enforces the logic.
Step 1: Define the data structure
```python
from pydantic import BaseModel, ConfigDict, Field
from typing import Optional

class CustomerProfile(BaseModel):
    # Pydantic v2: without validate_assignment, values assigned after
    # construction are NOT validated, and the error-handling loop below
    # would never fire.
    model_config = ConfigDict(validate_assignment=True)

    name: Optional[str] = Field(None, description="Customer's full name")
    company: Optional[str] = Field(None, description="Company name")
    # We use 'int' to force the LLM to extract a number, not text
    budget: Optional[int] = Field(None, description="Budget in USD")
```
Step 2: The "Update" tool
We give the LLM a tool to update this profile. This allows us to track progress.
```python
# Global state for this call
current_profile = CustomerProfile()

def update_profile(
    name: Optional[str] = None,
    company: Optional[str] = None,
    budget: Optional[int] = None,
) -> str:
    """Call this tool IMMEDIATELY when the user provides new information."""
    if name:
        current_profile.name = name
    if company:
        current_profile.company = company
    if budget:
        current_profile.budget = budget
    return f"Profile updated. Current state: {current_profile}"
```
Step 3: The guardrailed prompt
We dynamically inject the missing fields into the prompt every turn.
```python
system_prompt = """
You are a strict Sales Qualification Agent.
Your Goal: Fill the missing fields in the Customer Profile.

CURRENT PROFILE STATE:
{current_profile}

RULES:
1. Look at the 'None' fields above.
2. Ask for the *first* missing field. Do not skip ahead.
3. If the user goes off-topic (e.g., talks about sailing), politely
   acknowledge it but IMMEDIATELY pivot back to the missing field.
4. Do not end the call until all fields are filled.
"""
```
Handling "Soft" errors (Voice UX)
In text forms, if you type "enough" into a Number field, it turns red.
In voice, the user says: "My budget is... well, we have enough money."
This isn't an integer. The update_profile tool will fail validation.
We must handle this gracefully.
The Pattern:
- LLM: Calls `update_profile(budget="enough")`.
- Code: Pydantic raises `ValidationError`.
- System: Catches the error and returns the string `Error: Budget must be a number.`
- LLM (Reacting): Sees the error and self-corrects.
- Agent: "I understand you have budget, but I need a specific number for our records. Could you give me a rough estimate in dollars?"
This creates a robust loop where the "Code" enforces business logic (integers only), but the "LLM" handles the social awkwardness of asking again.
```mermaid
sequenceDiagram
    participant User
    participant LLM
    participant Tool_Runtime
    User->>LLM: "We have enough money."
    LLM->>Tool_Runtime: Call update_profile(budget="enough")
    Note right of Tool_Runtime: Validation Error!<br/>'enough' is not an int.
    Tool_Runtime-->>LLM: Result: "Error: Budget must be a valid integer."
    LLM->>User: "I need a specific number. Can you estimate it?"
    User->>LLM: "About 50k."
    LLM->>Tool_Runtime: Call update_profile(budget=50000)
    Tool_Runtime-->>LLM: Result: "Success."
```
Summary: Why guardrails matter
| Approach | Flexibility | Safety | Best For |
|---|---|---|---|
| Free-Form Chat | High | Low | Companions, General Q&A |
| State Machine | Low | High | High-Stakes (911, Medical, Legal) |
| Hybrid (Guided Chat) | Medium | Medium | Support, Sales Qualification |
Challenge for you
Scenario: You are building a 911 Dispatch Voice Bot.
The Rules:
- Safety Critical: You MUST get the Address before asking anything else (even before asking what the emergency is).
- Context Switching: If the user screams "My house is on fire!", the bot must acknowledge the fire but still pivot back to "Where is the house?" immediately.
Your Task:
- How do you structure the System Prompt to enforce this "Address First" rule?
- What happens if you use a standard "Chat" prompt? (Hint: The bot will likely say "Oh no, get out of the house!" and forget to ask for the address).
- Write the "State Check" logic that injects `<INSTRUCTION>ASK FOR ADDRESS NOW</INSTRUCTION>` into the prompt if the address slot is empty.
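If you want to check your answer against a baseline, here is one hedged sketch of the state check. The function name and wording are illustrative; the only non-negotiable idea is that the override instruction is re-injected on every turn until the address slot is filled.

```python
from typing import Optional

def build_system_prompt(address: Optional[str]) -> str:
    """Re-run this before every LLM turn; the check is code, not vibes."""
    base = "You are a 911 dispatch agent. Be calm and brief."
    if address is None:
        # Safety-critical override: acknowledge the emergency in one
        # short clause, then pivot straight back to the location.
        return (
            base
            + "\n<INSTRUCTION>ASK FOR ADDRESS NOW. Acknowledge the "
            "emergency in one short clause, then ask where they are."
            "</INSTRUCTION>"
        )
    return base + "\nAddress confirmed. Now ask what the emergency is."

print(build_system_prompt(None))
print(build_system_prompt("123 Main St"))
```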
Key takeaways
- State machines enforce business logic: By defining required fields and their order, you prevent agents from wandering off-topic
- Pydantic provides validation: Using structured models ensures data quality while allowing graceful error handling
- Slot filling tracks progress: Tools that update a shared state object let you track which information has been collected
- Dynamic prompts guide behavior: Injecting missing fields into the system prompt every turn keeps the agent focused
- Error handling enables self-correction: When validation fails, returning clear error messages lets the LLM ask again naturally
- Guardrails prevent dangerous failures: In high-stakes scenarios, strict control flow is more important than conversational flexibility
- Voice UX requires graceful recovery: Unlike text forms, voice agents must handle invalid inputs with natural language, not red error messages
For more on voice AI systems, see our voice AI fundamentals guide, our multi-agent voice systems guide, and our workflow orchestration guide.
For more on building production AI systems, check out our AI Bootcamp for Software Engineers.