Building a Self-Correcting RAG Agent
In our last post, we built the foundation for our advanced RAG agent. We defined its "memory" (GraphState) and its "tools" (retrieve, web_search).
But our agent is still just a collection of parts. It has no "brain" to connect them. This post is for you if you're ready to build the logic that makes an agent "smart."
Today, we'll build the three "thinking" nodes of our agent's brain:
- The Router: The initial decision-maker.
- The Grader: The "self-correction" loop.
- The Generator: The final "voice."
The problem: A "Dumb" agent
Without logic, our agent doesn't know what to do. If we ask it the "competitor" question, it doesn't know to use web_search instead of retrieve. If retrieve finds junk, the agent doesn't know it failed.
graph TD
A[User Query] --> B[Agent Without Logic]
B --> C[Doesn't know which tool to use]
B --> D[Doesn't know if results are good]
C --> E[Random or wrong tool choice]
D --> F[Accepts bad results]
E --> G[Poor Answer]
F --> G
style B fill:#ffebee,stroke:#b71c1c
style G fill:#ffebee,stroke:#b71c1c
We need to build nodes that can make decisions.
The "How": Building the "Thinking" nodes
These nodes are also simple Python functions, but instead of just fetching data, they use an LLM to reason.
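Throughout this post, the nodes read from and write to the GraphState we defined last time. As a quick recap, here's a minimal sketch of the shape these nodes assume (if your version from the last post differs slightly, use yours):
from typing import List, TypedDict

class GraphState(TypedDict):
    """Shared memory passed between nodes (defined in the previous post)."""
    question: str         # The user's original query
    documents: List[str]  # Whatever retrieve or web_search found
    generation: str       # The final answer, written by generate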
Brick 3: The Router node (The first decision)
This is our agent's "triage" step. Its only job is to look at the user's question and decide where to send it first: to our internal vectorstore or to the public web_search.
from openai import OpenAI

llm_client = OpenAI()  # Assumes OPENAI_API_KEY is set

# --- Node 3: The Router ---
def route_query(state):
    print("---NODE: ROUTE_QUERY---")
    question = state["question"]

    # We ask an LLM to act as the router
    prompt = [
        {"role": "system", "content": """You are an expert at routing a user question.
Use 'vectorstore' for specific questions about 'Model-V' (its features, architecture, or training data).
Use 'web_search' for all other questions, especially comparisons, competitors, pricing, or recent events."""},
        {"role": "user", "content": f"Given the user question, which datasource should I use? Question: {question}"}
    ]

    response = llm_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=prompt,
        temperature=0
    )
    source = response.choices[0].message.content
    print(f"Routing decision: {source}")

    # The return value of this node will be the *name* of the *next* node to run
    if "vectorstore" in source.lower():
        return "retrieve"
    else:
        return "web_search"
Observation: We've built a "smart" router. By using an LLM, we don't need to write complex if/else rules. We just tell the LLM (in plain English) how to make the decision.
graph TD
A[User Query] --> B(Router Node)
B --> C{Decision}
C -- "vectorstore" --> D[Retrieve from Internal Docs]
C -- "web_search" --> E[Search the Web]
style B fill:#e3f2fd,stroke:#0d47a1
style C fill:#fff8e1,stroke:#f57f17
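Before wiring the router into a graph, it's worth smoke-testing it in isolation with a hand-built state dict. A quick sketch (the questions are just examples):
# A Model-V-specific question should route to our internal docs
print(route_query({"question": "What training data was used for Model-V?"}))
# Expected output: "retrieve"

# A competitor/comparison question should route to the web
print(route_query({"question": "How does Model-V compare to its competitors?"}))
# Expected output: "web_search"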
Brick 4: The Grader node (The self-correction loop)
This is the most important node in our graph. The Grader's job is to check the output of our tools. After the retrieve node runs, this node looks at the retrieved documents and grades them: "Are these documents actually relevant?"
This allows our agent to "realize" it failed.
# --- Node 4: The Grader ---
def grade_documents(state):
    print("---NODE: GRADE_DOCUMENTS---")
    question = state["question"]
    documents = state["documents"]

    # We ask an LLM to be the grader
    prompt = [
        {"role": "system", "content": """You are a grader. Your task is to determine if the retrieved documents
are relevant to the user question and contain enough information to answer it completely.
Respond with a single word: 'yes' or 'no'."""},
        {"role": "user", "content": f"Retrieved Documents:\n{documents}\n\nUser Question: {question}"}
    ]

    response = llm_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=prompt,
        temperature=0
    )
    decision = response.choices[0].message.content
    print(f"Grader decision: {decision}")

    # This 'yes' or 'no' will control the next conditional edge
    if "yes" in decision.lower():
        return "yes"
    else:
        return "no"
Observation: This node is our "self-correcting" mechanism. If the retrieve node (our internal search) fails, this node will catch it and output "no," which will allow us to trigger the web_search as a fallback.
graph TD
A[Retrieved Documents] --> B(Grader Node)
B --> C{Are docs relevant?}
C -- "yes" --> D[Proceed to Generate]
C -- "no" --> E[Try Alternative Tool]
E --> F[Web Search]
F --> D
style B fill:#e3f2fd,stroke:#0d47a1
style C fill:#fff8e1,stroke:#f57f17
style E fill:#ffebee,stroke:#b71c1c
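You can smoke-test the grader the same way, with one relevant and one junk document set. The document strings below are invented purely for illustration:
question = "What architecture does Model-V use?"

# Relevant documents -> the grader should return "yes"
print(grade_documents({
    "question": question,
    "documents": ["Model-V is built on a transformer architecture with a custom attention mechanism."],
}))

# Junk documents -> "no", which is what lets us fall back to web_search
print(grade_documents({
    "question": question,
    "documents": ["This week's cafeteria menu features pasta on Tuesday."],
}))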
Brick 5: The Generator node (The final answer)
Finally, once we have good documents (either from retrieve or web_search), we pass them to our Generator node. This node's only job is to synthesize the final answer.
# --- Node 5: The Generator ---
def generate(state):
    print("---NODE: GENERATE---")
    question = state["question"]
    documents = state["documents"]
    context = "\n".join(documents)

    # This is our familiar RAG prompt
    prompt = [
        {"role": "system", "content": "You are a helpful assistant. Answer the user's question based on the provided context. Be comprehensive and synthesize information from all provided documents."},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion:\n{question}"}
    ]

    response = llm_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=prompt,
        temperature=0
    )
    generation = response.choices[0].message.content

    # We update the "generation" field in our state
    return {"generation": generation}
We've built all the pieces! We have:
- Memory: GraphState
- Tools: retrieve, web_search
- Logic: route_query, grade_documents
- Voice: generate
In our next and final post, we'll "wire up" all these nodes in LangGraph and run our fully autonomous, self-correcting agent.
Challenge for you
- Use Case: Our grade_documents node is good, but it's a bit simple.
- The Problem: What if the documents are relevant but not sufficient (e.g., they only answer half the question)? A "yes/no" grade isn't enough.
- Your Task: How would you re-design the grade_documents node? What should it output instead of just "yes" or "no"? (Hint: What if it output "complete" vs. "partial" vs. "irrelevant"? See the sketch after this list.)
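One possible direction, deliberately left as a sketch so you can take it further: swap the binary grade for a three-way label and branch on it.
# One possible re-design: a three-way grade instead of "yes"/"no"
def grade_documents_v2(state):
    print("---NODE: GRADE_DOCUMENTS_V2---")
    prompt = [
        {"role": "system", "content": """You are a grader. Classify the retrieved documents
against the user question. Respond with a single word:
'complete' (they fully answer it), 'partial' (relevant, but only answer part of it),
or 'irrelevant' (they do not help at all)."""},
        {"role": "user", "content": f"Retrieved Documents:\n{state['documents']}\n\nUser Question: {state['question']}"}
    ]
    response = llm_client.chat.completions.create(
        model="gpt-4o-mini", messages=prompt, temperature=0
    )
    grade = response.choices[0].message.content.lower()
    # 'partial' could route to web_search to *supplement* the documents rather than replace them
    for label in ("complete", "partial", "irrelevant"):
        if label in grade:
            return label
    return "irrelevant"  # conservative default if the LLM strays off-format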
Key takeaways
- Routing is the first decision: The Router node uses an LLM to intelligently choose which tool to use first, saving time and tokens
- Grading enables self-correction: The Grader node is the most critical—it allows the agent to recognize when it has failed and try alternative approaches
- LLMs make great decision-makers: Instead of complex if/else rules, we use natural language prompts to guide LLM-based routing and grading
- Generator synthesizes the final answer: Once we have good documents, the Generator node creates a comprehensive, natural-language response
- All nodes share state: The GraphState allows each node to see what previous nodes found and make decisions accordingly
For more on self-correcting systems, see our advanced RAG guide.
For more on building production AI systems, check out our AI Bootcamp for Software Engineers.