Assembling and Running Your Thinking RAG Agent

Param Harrison
7 min read

In our last post, we built all the "thinking" nodes for our agent: the Router, the Grader, and the Generator. We have all the "bricks" for our advanced RAG system.

This post is where we put it all together. We will use LangGraph to "wire up" these nodes and build the final, autonomous, self-correcting agent. Then, we'll take it for a test drive and see it in action.

The problem: Connecting the "Bricks"

We have a collection of Python functions (route_query, retrieve, grade_documents, etc.), but they don't know about each other. We need a way to define the flow and logic from our blueprint.
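As a quick reminder of what these bricks share: every node reads and updates the same state object. A minimal sketch of what a `GraphState` along these lines might look like (the exact fields come from the previous post; the stub node and field names here are illustrative):

```python
from typing import List, TypedDict

class GraphState(TypedDict):
    """Shared 'memory' that every node reads and updates."""
    question: str         # the user's query
    documents: List[str]  # retrieved or web-searched context
    generation: str       # the final answer

# Each node takes the state and returns a partial update
def retrieve_stub(state: GraphState) -> dict:
    return {"documents": ["internal spec sheet for Model-V"]}

state: GraphState = {"question": "What is Model-V?", "documents": [], "generation": ""}
state.update(retrieve_stub(state))
print(state["documents"])
```

This "take the state, return an update" contract is what lets LangGraph merge each node's output back into the shared memory.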

Our Agent's Blueprint:

graph TD
    A[Start] --> B(Route Query)
    B -- "vectorstore" --> C[Retrieve]
    B -- "web_search" --> D[Web Search]
    C --> E(Grade Documents)
    D --> E
    E -- "yes" --> F[Generate Answer]
    E -- "no" --> D
    F --> G[End]
    
    style B fill:#e3f2fd,stroke:#0d47a1
    style E fill:#e3f2fd,stroke:#0d47a1

The "How": Assembling the graph with LangGraph

Let's translate this flowchart directly into LangGraph code.

from langgraph.graph import StateGraph, END

# We also need all our node functions from the previous posts:
# GraphState, route_query, retrieve, web_search, grade_documents, generate

# --- 1. Initialize the Graph ---
# We tell the graph what our "memory" (state) looks like
workflow = StateGraph(GraphState)

# --- 2. Add all our "Nodes" (the "bricks") ---
# Each node is just a name and the Python function it runs
workflow.add_node("retrieve", retrieve)
workflow.add_node("web_search", web_search)
workflow.add_node("generate", generate)

# Note: The "router" and "grader" are not added as nodes — they run as
# conditional *edges*, because their job is to *decide* where the flow goes next

# --- 3. Define the "Edges" (the "wires") ---
# 3a. The Entry Point (The Router)
# This is the *first* decision.
workflow.set_conditional_entry_point(
    route_query,  # The 'route_query' function will run first
    {
        # The return value of 'route_query' maps to a node name
        "vectorstore": "retrieve",
        "web_search": "web_search"
    }
)

# 3b. The Self-Correction Loop (The Grader)
# This is the *second* decision.
workflow.add_conditional_edges(
    "retrieve",        # After the 'retrieve' node runs...
    grade_documents,   # ...run the 'grade_documents' function to decide...
    {
        # The return value of 'grade_documents' maps to a node name
        "yes": "generate",   # If docs are good, go to 'generate'
        "no": "web_search"   # If docs are bad, go to 'web_search'
    }
)

# 3c. The Final Steps
# After 'web_search', we don't need to grade again, just generate
workflow.add_edge("web_search", "generate")

# After 'generate', we are done
workflow.add_edge("generate", END)

# --- 4. Compile the Graph ---
# This turns our blueprint into a runnable application
app = workflow.compile()

print("✅ Agentic graph compiled!")

Observation: We have just "programmed" our agent's logic. We've defined its memory, its tools, and its decision-making process. The code mirrors our flowchart directly, with the Router and Grader expressed as decision edges rather than nodes.
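If the conditional-edge API feels magical, it isn't: under the hood the pattern is "call a decider function, then look its return value up in a map." A dependency-free sketch of that idea (the toy grader below is a hypothetical stand-in for the real LLM-backed one):

```python
def make_conditional_edge(decider, path_map):
    """Return a router that maps the decider's output to a node name."""
    def route(state):
        return path_map[decider(state)]
    return route

# Toy grader: flags queries our internal docs can't answer
def toy_grader(state):
    return "no" if "pricing" in state["question"] else "yes"

route = make_conditional_edge(toy_grader, {"yes": "generate", "no": "web_search"})

print(route({"question": "What is the pricing for Model-V?"}))   # web_search
print(route({"question": "What is the Model-V architecture?"}))  # generate
```

The dictionary you pass to add_conditional_edges plays exactly the role of path_map here.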

The test drive: Running our agent

Now for the moment of truth. Let's run the exact same query that failed our simple RAG in Post 1 and watch our new agent "think."

Test 1: The "Hard" query (The self-correction path)

The "How": We call our compiled app with the comparative query.

from pprint import pprint

query = "How does the Model-V compare to the new Model-Z from our competitor?"

inputs = {"question": query}

# We'll stream the output to see every step
for output in app.stream(inputs):
    for key, value in output.items():
        print(f"--- Finished Node: {key} ---")

# After the loop, 'value' holds the last node's state update,
# which contains the final answer
print("\n--- FINAL ANSWER ---")
print(value["generation"])

The "Log" (What we see in the terminal):

--- Finished Node: __start__ ---
---NODE: ROUTE_QUERY---
Routing decision: web_search
---NODE: WEB_SEARCH---
--- Finished Node: web_search ---
---NODE: GENERATE---
--- Finished Node: generate ---
--- Finished Node: __end__ ---
--- FINAL ANSWER ---
The Model-V, with its 5-trillion parameter architecture, is focused on
technical domains. In contrast, web sources indicate that the Model-Z
is a smaller, faster model from a competitor, aimed at creative tasks.

Observation: It worked! The Router correctly identified this as an external query and sent it directly to web_search. The agent completely bypassed the irrelevant internal documents.
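One caveat with the streaming loop above: it reads `value` after the loop ends, which only works because the last update happens to contain the answer. A more explicit pattern is to merge every update into a final state as it arrives, sketched here with a stand-in generator (real code would iterate `app.stream(inputs)` instead):

```python
# Stand-in for app.stream(): yields {node_name: state_update} dicts
def fake_stream():
    yield {"web_search": {"documents": ["web article on Model-Z"]}}
    yield {"generate": {"generation": "Model-Z is a smaller, faster model."}}

final_state = {}
for output in fake_stream():
    for key, value in output.items():
        print(f"--- Finished Node: {key} ---")
        final_state.update(value)  # merge each node's update as it arrives

print(final_state["generation"])
```

This way the final answer is available even if a later change to the graph makes some other node run last.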

But what about the self-correction loop? Let's try a query that tricks the router.

Test 2: The "Tricky" query (The correction loop)

The "How": We'll ask a question that sounds internal but can't be answered.

Query: "What is the pricing for Model-V?" (Remember, our docs only have tech specs).

query = "What is the pricing for Model-V?"

inputs = {"question": query}

for output in app.stream(inputs):
    for key, value in output.items():
        print(f"--- Finished Node: {key} ---")

The "Log" (The full loop in action):

graph TD
    A["Query: 'Pricing?'"] --> B(1. route_query)
    B -- "vectorstore" --> C(2. retrieve)
    C --> D(3. grade_documents)
    D -- "no" --> E(4. web_search)
    E --> F(5. generate)
    F --> G[Final Answer]
    
    style B fill:#e3f2fd,stroke:#0d47a1
    style C fill:#fff8e1,stroke:#f57f17
    style D fill:#e3f2fd,stroke:#0d47a1
    style E fill:#fff8e1,stroke:#f57f17
    style F fill:#e8f5e9,stroke:#388e3c

Terminal Output:

--- Finished Node: __start__ ---
---NODE: ROUTE_QUERY---
Routing decision: vectorstore
---NODE: RETRIEVE---
(Retrieves docs on 'architecture' and 'processing core')
--- Finished Node: retrieve ---
---NODE: GRADE_DOCUMENTS---
Grader decision: no
---NODE: WEB_SEARCH---
(Finds a tech article: "Model-V pricing is unannounced...")
--- Finished Node: web_search ---
---NODE: GENERATE---
--- Finished Node: generate ---
--- Finished Node: __end__ ---
--- FINAL ANSWER ---
Based on web search results, the official pricing for Model-V
has not yet been announced.

Observation: This is the magic.

  1. Our Router made a mistake! It saw "Model-V" and sent the query to retrieve.
  2. Our Grader caught the mistake. It saw the retrieved docs had no pricing info and returned "no".
  3. The graph then self-corrected, re-routing the query to web_search.
  4. The final Generator got the correct context (from the web) and gave a perfect answer.

Final takeaways

  1. Graphs > Chains: Linear chains are simple but brittle. Agentic graphs (using tools like LangGraph) are how you build robust, production-ready systems.
  2. Routing is Efficiency: A Router node (Post 3) is the first step. It saves you from wasting time and money by sending the query to the right tool first.
  3. Grading is Reliability: A Grader node (Post 3) is the most critical part. It enables self-correction, allowing the agent to recover from its own mistakes.
  4. Agents are Systems: This approach shifts our thinking from just "prompt engineering" to "systems engineering." We built a logical, stateful system where the LLM is just one (very smart) component.

Challenge for you

  1. Use Case: Our web_search is good, but it's not graded. What if the web search also returns junk?
  2. The Goal: Make the agent even more robust.
  3. Your Task: How would you modify the graph's "wires" (the edges) to fix this? (Hint: Where could the web_search node connect to? What if it created a loop?)
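One hint on the loop question: if web_search feeds back into grading, you create a cycle, and cycles need an exit. A common guard is a retry counter in the state, so the grader edge gives up and generates anyway once the budget is spent. A minimal sketch, without LangGraph and with hypothetical field names:

```python
MAX_RETRIES = 2

def grade_with_guard(state):
    """Decide where to go after grading, but cap the number of retries."""
    if state["docs_relevant"]:
        return "generate"
    if state["retries"] >= MAX_RETRIES:
        return "generate"  # give up gracefully instead of looping forever
    state["retries"] += 1
    return "web_search"

state = {"docs_relevant": False, "retries": 0}
print(grade_with_guard(state))  # web_search (retry 1)
print(grade_with_guard(state))  # web_search (retry 2)
print(grade_with_guard(state))  # generate (retry budget exhausted)
```

In a real graph, the counter would live in GraphState and this function would be the conditional edge after both retrieve and web_search.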

Key takeaways

  • LangGraph wires nodes together: Using StateGraph, we define nodes, conditional edges, and cycles to create a complete agent workflow
  • Conditional edges enable decision-making: The Router and Grader use conditional edges to dynamically route the flow based on LLM decisions
  • Self-correction loops are powerful: When the Grader detects bad results, it triggers a fallback path, making the agent robust to failures
  • Streaming shows the agent thinking: By streaming the graph execution, we can see exactly which nodes run and how the agent makes decisions
  • The complete system is more than the sum of parts: Individually, each node is simple, but together they create an intelligent, self-correcting agent

For more on LangGraph and agent frameworks, see our agent framework comparison and our advanced RAG guide.


For more on building production AI systems, check out our AI Bootcamp for Software Engineers.
