Structured Output: Making LLMs Application-Ready

Param Harrison
7 min read

We've seen that LLMs are brilliant. We can ask them a question, and they'll give us a well-written paragraph.

But what if you asked an intern to compile a list of 100 customer contacts into a spreadsheet, and instead, they handed you a 5-page narrative describing each customer?

The information is all there, but it's in the wrong format. You can't sort it, you can't filter it, and you can't import it into your app. It's useless.

This is the key insight: For an LLM's output to be programmatically useful, it must be structured, predictable, and machine-readable. We need to move beyond asking for prose and start demanding data.

The bad way: brittle prompting

The simplest way to ask for JSON is to just... ask for it in the prompt. Let's see why this is a terrible idea for real applications.

text = "John Doe is a 32-year-old software engineer from New York."

brittle_prompt = [
    {"role": "system", "content": "Extract user information into a JSON object."},
    {"role": "user", "content": f"Extract the name, age, city, and profession from: '{text}'"}
]

# --- LLM's (Unreliable) Response ---
# Here is the JSON you requested:
# ```json
# {
#   "name": "John Doe",
#   "age": 32,
#   "city": "New York",
#   "profession": "software engineer"
# }
# ```

This output is brittle. A json.loads() call will choke on the conversational preamble ("Here is the JSON you requested:") and on the markdown ``` fences wrapped around the payload. It's not reliable.
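
To make the failure concrete, try parsing a reply shaped like the one above. A minimal sketch (raw_reply is a hypothetical stand-in for the model's output):

import json

# A hypothetical reply in the shape shown above
raw_reply = 'Here is the JSON you requested:\n```json\n{"name": "John Doe", "age": 32}\n```'

try:
    user = json.loads(raw_reply)
except json.JSONDecodeError as e:
    print(f"Parsing failed: {e}")
    # Parsing failed: Expecting value: line 1 column 1 (char 0)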

The better way: JSON mode

Modern LLMs have a built-in JSON mode. When you enable it, the model guarantees that its output string is syntactically valid JSON. No extra chat, no parsing errors. (One caveat: OpenAI's implementation requires the word "JSON" to appear somewhere in your messages, which is why the system prompt below asks for "a valid JSON object.")

json_mode_prompt = [
    {"role": "system", "content": "You are a data extraction assistant. Output a valid JSON object."},
    {"role": "user", "content": f"Extract the name, age, city, and profession from: '{text}'"}
]

# We pass a special parameter to the API call
response = call_llm(
    json_mode_prompt,
    response_format={"type": "json_object"}
)

# --- LLM's (Reliable) Response ---
# {
#   "name": "John Doe",
#   "age": 32,
#   "city": "New York",
#   "profession": "software engineer"
# }

This is much better! Our code can now reliably parse the string.
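
A quick aside on plumbing: the call_llm helper used in these snippets isn't part of any SDK. Here is one possible sketch of it on top of the OpenAI Python client (the model name is a placeholder; any recent chat model with tool support will do):

import json
from openai import OpenAI

client = OpenAI()  # Assumes OPENAI_API_KEY is set in the environment

def call_llm(messages, **kwargs):
    # Forwards extras like response_format, tools, and tool_choice
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # Placeholder; swap in your model of choice
        messages=messages,
        **kwargs,
    )
    return response.choices[0].message

# With JSON mode on, the message content parses cleanly
message = call_llm(json_mode_prompt, response_format={"type": "json_object"})
data = json.loads(message.content)
print(data["profession"])  # software engineer

With this wrapper, the response variable in the snippet above is a chat message object, and the JSON string lives in its .content attribute.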

But what if the LLM decides to use different keys? What if it returns "job" instead of "profession", or "age": "32" (a string) instead of 32 (an integer)? The syntax is correct, but the schema (the structure) is still not guaranteed.

The best way: tool calling with Pydantic

This is the most powerful, modern, and reliable technique. We give the LLM a set of "tools" it can use. To do this, we first define a strict "data contract" using a Python library called Pydantic.

Step 1: Define your data contract with Pydantic

Pydantic lets us define a Python class that acts as our schema. We specify the exact field names, data types, and even descriptions.

from pydantic import BaseModel, Field
from typing import Optional

# This class IS our data contract.
class UserInfo(BaseModel):
    """A model to hold detailed information about a user."""
    
    # The '...' means this field is REQUIRED
    name: str = Field(..., description="The user's full name.")
    age: int = Field(..., description="The user's age in years.")
    
    # This field is OPTIONAL
    profession: Optional[str] = Field(None, description="The user's job title.")
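
Before handing this contract to an LLM, it's worth watching it enforce itself. A quick sanity check (both payloads here are hypothetical):

from pydantic import ValidationError

# "32" as a string is coerced to the int 32 by Pydantic's default (lax) mode
user = UserInfo.model_validate_json('{"name": "John Doe", "age": "32"}')
print(user.age, type(user.age))  # 32 <class 'int'>

# The required 'age' field is missing, so validation fails loudly
try:
    UserInfo.model_validate_json('{"name": "John Doe"}')
except ValidationError as e:
    print(e)  # Reports that 'age' is a required field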

Step 2: Give the contract to the LLM as a tool

We convert this Pydantic class into a JSON schema and give it to the LLM as a "tool" it's allowed to "call."

# Create the tool definition for the OpenAI API
user_tool = {
    "type": "function",
    "function": {
        "name": "extract_user_info",
        "description": "Extracts structured user information from text.",
        
        # Pydantic automatically generates the exact JSON schema!
        "parameters": UserInfo.model_json_schema() 
    }
}
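
If you're curious what the LLM actually sees, print the generated schema. The output below is from Pydantic v2, trimmed for readability:

import json

print(json.dumps(UserInfo.model_json_schema(), indent=2))

# Trimmed output:
# {
#   "description": "A model to hold detailed information about a user.",
#   "properties": {
#     "name": {"description": "The user's full name.", "title": "Name", "type": "string"},
#     "age": {"description": "The user's age in years.", "title": "Age", "type": "integer"},
#     "profession": {"anyOf": [{"type": "string"}, {"type": "null"}], "default": null, ...}
#   },
#   "required": ["name", "age"],
#   "title": "UserInfo",
#   "type": "object"
# }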

Step 3: Call the LLM and get a validated object

Now, we make the LLM call. We give it the user's text and our list of tools, and force it to use our new tool.

text = "Jane Smith is a 45-year-old product manager."

prompt = [{"role": "user", "content": f"Please process this user: {text}"}]

# Force the LLM to use our tool
response_message = call_llm(
    prompt,
    tools=[user_tool],
    tool_choice={"type": "function", "function": {"name": "extract_user_info"}}
)

# The LLM doesn't "chat" back. It returns the arguments for the tool.
# --- LLM's Response (Tool Call) ---
# {
#   "name": "Jane Smith",
#   "age": 45,
#   "profession": "product manager"
# }

# Now, the magic: we validate this JSON string against our Pydantic class
arguments_str = response_message.tool_calls[0].function.arguments
user_object = UserInfo.model_validate_json(arguments_str)

# We now have a clean, validated Python object!
print(user_object.name)     # Output: Jane Smith
print(user_object.age)      # Output: 45
print(user_object.profession) # Output: product manager

This is the key to building real applications. The LLM acts as a natural language interpreter that directly populates our application's data models.
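
One production note: even a forced tool call can occasionally return arguments that fail validation, and the ValidationError itself makes a good corrective prompt. A hedged sketch of that retry loop (extract_user is a hypothetical helper, not a library function):

from pydantic import ValidationError

def extract_user(text: str, max_attempts: int = 3) -> UserInfo:
    messages = [{"role": "user", "content": f"Please process this user: {text}"}]
    for _ in range(max_attempts):
        message = call_llm(
            messages,
            tools=[user_tool],
            tool_choice={"type": "function", "function": {"name": "extract_user_info"}},
        )
        arguments_str = message.tool_calls[0].function.arguments
        try:
            return UserInfo.model_validate_json(arguments_str)
        except ValidationError as e:
            # Feed the error back so the model can correct its next attempt
            messages.append({
                "role": "user",
                "content": f"Your last extraction was invalid: {e}. Please try again."
            })
    raise ValueError(f"No valid user info after {max_attempts} attempts")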

graph TD
    A["User Text: '...45-year-old product manager...'"] --> B{LLM}
    C["Pydantic Schema <br/> (name, age, profession)"] --> B
    B --> D["Guaranteed JSON <br/> {'name': 'Jane Smith', 'age': 45, ...}"]
    D --> E["Validated Python Object <br/> user_object.name"]

The final step: the LLM as a router

This "Tool Calling" concept has another, even more powerful use. What if we give the LLM multiple tools?

Imagine an e-commerce chatbot. We can define three tools:

  1. SearchProduct(query: str)
  2. AddToCart(product_id: str, quantity: int)
  3. CheckOrderStatus(order_id: str)

Now, we don't force the LLM to use a specific tool. We just give it the list of all three and let it choose.

# 1. Define all the tools the LLM can use
tools = [
    {"type": "function", "function": {"name": "search_product", ...}},
    {"type": "function", "function": {"name": "add_to_cart", ...}},
    {"type": "function", "function": {"name": "check_order_status", ...}}
]

# 2. A user sends a natural language command
user_query = "I need to know the status of order #A-12345"

# 3. We ask the LLM to respond, giving it the tool list
response_message = call_llm(
    [{"role": "user", "content": user_query}],
    tools=tools
)

# 4. The LLM intelligently chooses the right tool and extracts the arguments
tool_call = response_message.tool_calls[0]
print(f"LLM chose tool: {tool_call.function.name}")
print(f"Arguments: {tool_call.function.arguments}")

# Output:
# LLM chose tool: check_order_status
# Arguments: {"order_id": "A-12345"}

The LLM has just acted as an intelligent router, converting the user's plain English into a precise, structured function call that our application can execute.
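
To make the router executable, each tool needs a full parameter schema and a matching handler in your application. A sketch for check_order_status (the handler and its return value are hypothetical stand-ins for real business logic):

import json

# Full schema for one of the three tools; the others follow the same shape
check_order_status_tool = {
    "type": "function",
    "function": {
        "name": "check_order_status",
        "description": "Looks up the current status of a customer order.",
        "parameters": {
            "type": "object",
            "properties": {
                "order_id": {"type": "string", "description": "The order ID, e.g. 'A-12345'."}
            },
            "required": ["order_id"],
        },
    },
}

# Hypothetical application handler, keyed by tool name for dispatch
def check_order_status(order_id: str) -> str:
    return f"Order {order_id} is out for delivery."  # Stand-in for a real lookup

handlers = {"check_order_status": check_order_status}

# Route the LLM's chosen tool call to the matching handler
tool_call = response_message.tool_calls[0]
arguments = json.loads(tool_call.function.arguments)
result = handlers[tool_call.function.name](**arguments)
print(result)  # Order A-12345 is out for delivery.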

graph TD
    A["User Query: 'Check order #A-12345'"] --> B{LLM Router}
    B -- Chooses --> C["Tool: check_order_status(order_id='A-12345')"]
    B -.-> D["Tool: search_product(...)"]
    B -.-> E["Tool: add_to_cart(...)"]

Key takeaways

  • Prompting for JSON is unreliable: Never trust an LLM to format JSON just by asking. It works in a demo, then fails intermittently in production.
  • JSON mode guarantees syntax, not schema: It's a good step up, but it doesn't enforce key names or data types.
  • Pydantic is your data contract: It's the clearest, most robust way to define the exact data structure you need.
  • Tool calling is the key: This is the technology that turns LLMs from text generators into true application components. It's the "engine" of all modern AI agents.
  • LLMs can be routers: By giving an LLM multiple tools, it can act as an intelligent dispatcher, choosing the right action based on the user's intent.

For more on building production AI systems, check out our AI Bootcamp for Software Engineers.
