Assembling and Running Your Thinking RAG Agent
In our last post, we built all the "thinking" nodes for our agent: the Router, the Grader, and the Generator. We have all the "bricks" for our advanced RAG system.
This post is where we put it all together. We will use LangGraph to "wire up" these nodes and build the final, autonomous, self-correcting agent. Then, we'll take it for a test drive and see it in action.
The problem: Connecting the "Bricks"
We have a collection of Python functions (route_query, retrieve, grade_documents, etc.), but they don't know about each other. We need a way to define the flow and logic from our blueprint.
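Before wiring anything up, it helps to see exactly what we are automating. Below is the same blueprint hand-wired in plain Python, with stub functions standing in for the real nodes. The stubs, their fake data, and the keyword-based router/grader are all our invention for illustration; the real versions from the previous posts call an LLM. Notice how the branching logic ends up buried in `if` statements, which is exactly what LangGraph will make explicit:

```python
# Hand-wired control flow for the blueprint, using stub nodes that mimic
# the *contracts* of the real LLM-backed functions (same inputs/outputs,
# none of the intelligence).

def route_query(state):
    # Stub router: internal product names go to the vectorstore
    return "vectorstore" if "Model-V" in state["question"] else "web_search"

def retrieve(state):
    return {**state, "documents": ["Model-V architecture and processing core specs"]}

def web_search(state):
    return {**state, "documents": ["web result relevant to: " + state["question"]]}

def grade_documents(state):
    # Stub grader: a doc is relevant if it shares a (non-trivial) word with the question
    words = [w.lower() for w in state["question"].split() if len(w) > 3]
    relevant = any(w in doc.lower() for doc in state["documents"] for w in words)
    return "yes" if relevant else "no"

def generate(state):
    return {**state, "generation": "Answer based on: " + state["documents"][0]}

def run_agent(question):
    state = {"question": question}
    if route_query(state) == "vectorstore":
        state = retrieve(state)
        if grade_documents(state) == "no":  # the self-correction edge
            state = web_search(state)
    else:
        state = web_search(state)
    return generate(state)

print(run_agent("What is the pricing for Model-V?")["generation"])
```

This works for one small flow, but every new node or decision means more nested `if`s. A graph library lets us declare the same routing as data instead.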
Our Agent's Blueprint:
graph TD
A[Start] --> B(Route Query)
B -- "vectorstore" --> C[Retrieve]
B -- "web_search" --> D[Web Search]
C --> E(Grade Documents)
D --> E
E -- "yes" --> F[Generate Answer]
E -- "no" --> D
F --> G[End]
style B fill:#e3f2fd,stroke:#0d47a1
style E fill:#e3f2fd,stroke:#0d47a1
The "How": Assembling the graph with LangGraph
Let's translate this flowchart directly into LangGraph code.
from langgraph.graph import StateGraph, END
# We also need all our node functions from the previous posts:
# GraphState, route_query, retrieve, web_search, grade_documents, generate
# --- 1. Initialize the Graph ---
# We tell the graph what our "memory" (state) looks like
workflow = StateGraph(GraphState)
# --- 2. Add all our "Nodes" (the "bricks") ---
# Each node is just a name and the Python function it runs
workflow.add_node("retrieve", retrieve)
workflow.add_node("web_search", web_search)
workflow.add_node("generate", generate)
# Note: The "router" and "grader" are not added as nodes. They become
# conditional *edges* below, because their job is to *decide* where the flow goes next.
# --- 3. Define the "Edges" (the "wires") ---
# 3a. The Entry Point (The Router)
# This is the *first* decision.
workflow.set_conditional_entry_point(
    route_query,  # The 'route_query' function will run first
    {
        # The return value of 'route_query' maps to a node name
        "vectorstore": "retrieve",
        "web_search": "web_search"
    }
)
# 3b. The Self-Correction Loop (The Grader)
# This is the *second* decision.
workflow.add_conditional_edges(
    "retrieve",        # After the 'retrieve' node runs...
    grade_documents,   # ...run the 'grade_documents' function to decide...
    {
        # The return value of 'grade_documents' maps to a node name
        "yes": "generate",   # If docs are good, go to 'generate'
        "no": "web_search"   # If docs are bad, go to 'web_search'
    }
)
# 3c. The Final Steps
# After 'web_search', we don't need to grade again, just generate
workflow.add_edge("web_search", "generate")
# After 'generate', we are done
workflow.add_edge("generate", END)
# --- 4. Compile the Graph ---
# This turns our blueprint into a runnable application
app = workflow.compile()
print("✅ Agentic graph compiled!")
Observation: We have just "programmed" our agent's logic. We've defined its memory, its tools, and its decision-making process. The code is a perfect mirror of our flowchart.
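A quick aside on the "memory": each node returns only the keys it wants to change, and LangGraph merges that partial update into the shared state. For a plain dict-style state with no custom reducers, the merge is effectively "last write wins", which we can sketch in plain Python (the sample state and stub node are our invention):

```python
# Rough sketch of how a dict-style state absorbs a node's partial update.
# (LangGraph's actual merge depends on any reducers declared on GraphState;
# this is the simple "last write wins" case.)

state = {"question": "What is the pricing for Model-V?", "documents": []}

def web_search(state):
    # A node returns ONLY the keys it changes, not the whole state
    return {"documents": ["Model-V pricing is unannounced..."]}

update = web_search(state)
state = {**state, **update}  # last-write-wins merge

print(state["question"])   # unchanged keys survive the merge
print(state["documents"])  # updated keys are overwritten
```

This is why our node functions can stay small: they never have to copy or thread the whole state around.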
The test drive: Running our agent
Now for the moment of truth. Let's run the exact same query that failed our simple RAG in Post 1 and watch our new agent "think."
Test 1: The "Hard" query (The self-correction path)
The "How": We call our compiled app with the comparative query.
from pprint import pprint
query = "How does the Model-V compare to the new Model-Z from our competitor?"
inputs = {"question": query}
# We'll stream the output to see every step
# We'll stream the output to see every step
for output in app.stream(inputs):
    for key, value in output.items():
        pprint(f"--- Finished Node: {key} ---")

# Print the final answer from the state
print("\n--- FINAL ANSWER ---")
print(value["generation"])
The "Log" (What we see in the terminal):
--- Finished Node: __start__ ---
---NODE: ROUTE_QUERY---
Routing decision: web_search
--- Finished Node: web_search ---
---NODE: WEB_SEARCH---
--- Finished Node: generate ---
---NODE: GENERATE---
--- Finished Node: __end__ ---
--- FINAL ANSWER ---
The Model-V, with its 5-trillion parameter architecture, is focused on
technical domains. In contrast, web sources indicate that the Model-Z
is a smaller, faster model from a competitor, aimed at creative tasks.
Observation: It worked! The Router correctly identified this as an external query and sent it directly to web_search. The agent completely bypassed the irrelevant internal documents.
But what about the self-correction loop? Let's try a query that tricks the router.
Test 2: The "Tricky" query (The correction loop)
The "How": We'll ask a question that sounds internal but can't be answered.
Query: "What is the pricing for Model-V?" (Remember, our docs only have tech specs).
query = "What is the pricing for Model-V?"
inputs = {"question": query}
for output in app.stream(inputs):
    for key, value in output.items():
        pprint(f"--- Finished Node: {key} ---")
The "Log" (The full loop in action):
graph TD
A["Query: 'Pricing?'"] --> B(1. route_query)
B -- "vectorstore" --> C(2. retrieve)
C --> D(3. grade_documents)
D -- "no" --> E(4. web_search)
E --> F(5. generate)
F --> G[Final Answer]
style B fill:#e3f2fd,stroke:#0d47a1
style C fill:#fff8e1,stroke:#f57f17
style D fill:#e3f2fd,stroke:#0d47a1
style E fill:#fff8e1,stroke:#f57f17
style F fill:#e8f5e9,stroke:#388e3c
Terminal Output:
--- Finished Node: __start__ ---
---NODE: ROUTE_QUERY---
Routing decision: vectorstore
--- Finished Node: retrieve ---
---NODE: RETRIEVE---
(Retrieves docs on 'architecture' and 'processing core')
--- Finished Node: grade_documents ---
---NODE: GRADE_DOCUMENTS---
Grader decision: no
--- Finished Node: web_search ---
---NODE: WEB_SEARCH---
(Finds a tech article: "Model-V pricing is unannounced...")
--- Finished Node: generate ---
---NODE: GENERATE---
--- Finished Node: __end__ ---
--- FINAL ANSWER ---
Based on web search results, the official pricing for Model-V
has not yet been announced.
Observation: This is the magic.
- Our Router made a mistake! It saw "Model-V" and sent the query to retrieve.
- Our Grader caught the mistake. It saw the retrieved docs had no pricing info and returned "no".
- The graph then self-corrected, re-routing the query to web_search.
- The final Generator got the correct context (from the web) and gave a perfect answer.
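One thing worth noting: as wired, the correction path can fire only once (retrieve → web_search → generate), so this graph cannot loop forever. If you later add true cycles, a common safeguard is a retry counter carried in the state. Here is a minimal plain-Python sketch of that conditional-edge logic (the `retries` key, the `MAX_RETRIES` cap, and the always-failing grader are our invention for demonstration):

```python
MAX_RETRIES = 2

def grade_documents(state):
    # Toy grader: nothing ever passes, to force the loop
    return "no"

def decide_next(state):
    # Conditional-edge logic with a loop guard: give up after MAX_RETRIES
    if grade_documents(state) == "no" and state["retries"] < MAX_RETRIES:
        return "web_search"
    return "generate"

state = {"question": "pricing?", "retries": 0}
steps = []
while True:
    nxt = decide_next(state)
    steps.append(nxt)
    if nxt == "generate":
        break
    state = {**state, "retries": state["retries"] + 1}

print(steps)  # two retries, then we generate with whatever context we have
```

In LangGraph, the same idea would live inside the function you pass to add_conditional_edges, with the counter incremented by the retrying node.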
Final takeaways
- Graphs > Chains: Linear chains are simple but brittle. Agentic graphs (using tools like LangGraph) are how you build robust, production-ready systems.
- Routing is Efficiency: A Router node (Post 3) is the first step. It saves you from wasting time and money by sending the query to the right tool first.
- Grading is Reliability: A Grader node (Post 3) is the most critical part. It enables self-correction, allowing the agent to recover from its own mistakes.
- Agents are Systems: This approach shifts our thinking from just "prompt engineering" to "systems engineering." We built a logical, stateful system where the LLM is just one (very smart) component.
Challenge for you
- Use Case: Our web_search is good, but it's not graded. What if the web search also returns junk?
- The Goal: Make the agent even more robust.
- Your Task: How would you modify the graph's "wires" (the edges) to fix this? (Hint: Where could the web_search node connect to? What if it created a loop?)
Key takeaways
- LangGraph wires nodes together: Using StateGraph, we define nodes, conditional edges, and cycles to create a complete agent workflow
- Conditional edges enable decision-making: The Router and Grader use conditional edges to dynamically route the flow based on LLM decisions
- Self-correction loops are powerful: When the Grader detects bad results, it triggers a fallback path, making the agent robust to failures
- Streaming shows the agent thinking: By streaming the graph execution, we can see exactly which nodes run and how the agent makes decisions
- The complete system is more than the sum of parts: Individually, each node is simple, but together they create an intelligent, self-correcting agent
For more on LangGraph and agent frameworks, see our agent framework comparison and our advanced RAG guide.
For more on building production AI systems, check out our AI Bootcamp for Software Engineers.