Building an Agent's Brain with LangGraph

Param Harrison
5 min read

In our last post, we proved that a simple, linear RAG pipeline is "brittle." It fails when a user's question requires information from outside its static knowledge base.

To fix this, we need to build a "smarter" system—an agent that can make decisions. Instead of a simple checklist, we'll build a "state machine" or a "graph" that can:

  1. Look at a question.
  2. Decide which tool to use (our vector store OR a web search).
  3. Check if the tool's output was any good.
  4. Loop back and try a different tool if it failed.

Today, we'll build the "blueprint" for this agent using a powerful library called LangGraph.

The problem: A linear chain isn't enough

Our old pipeline was a simple chain.

graph TD
    A[Retrieve] --> B[Generate] --> C[Answer]

This is inflexible. If the Retrieve step fails, the whole chain fails.

The solution: A "Cyclic" graph

We need a graph that can loop, branch, and make decisions. This is our new blueprint:

graph TD
    A[User Query] --> B(Route to Tool)
    B -- "Internal Query" --> C[Retrieve from Vector Store]
    B -- "External Query" --> D[Search the Web]
    C --> E(Grade Documents)
    D --> E
    E -- "Good Docs" --> F[Generate Answer]
    E -- "Bad Docs" --> D
    F --> G[Final Answer]
    
    style B fill:#e3f2fd,stroke:#0d47a1
    style E fill:#e3f2fd,stroke:#0d47a1

This is a state machine. Route to Tool and Grade Documents are "conditional edges" (decision points), and the "Bad Docs" edge looping back to Search the Web is a "cycle" (our self-correcting loop).

To build this, we'll use LangGraph.
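Before wiring up real nodes, it helps to see the control flow this graph encodes. Here's a minimal, LangGraph-free sketch in plain Python with stubbed nodes and hypothetical `route_question` / `grade_documents` helpers (names we're inventing for illustration). LangGraph replaces this hand-written loop with declared nodes and conditional edges:

```python
# Plain-Python sketch of the blueprint's control flow.
# The node bodies are stubs; the point is the branch and the cycle.

def route_question(question: str) -> str:
    # Hypothetical router: product questions go to the vector store,
    # everything else to the web.
    return "vectorstore" if "product" in question.lower() else "web"

def grade_documents(documents: list[str]) -> str:
    # Hypothetical grader: docs are "good" if any are non-empty.
    return "good" if any(doc.strip() for doc in documents) else "bad"

def run_agent(question: str) -> str:
    source = route_question(question)            # conditional edge #1
    for _ in range(2):                           # bounded retries, not an infinite cycle
        if source == "vectorstore":
            documents = ["internal doc about our product"]  # stub retrieve
        else:
            documents = ["web search result"]                # stub web_search
        if grade_documents(documents) == "good": # conditional edge #2
            return f"Answer based on: {documents[0]}"
        source = "web"  # Bad docs -> fall back to web search (the cycle)
    return "Sorry, I couldn't find a good answer."

print(run_agent("How do I reset my product password?"))
```

Notice how quickly the hand-written version accumulates routing and retry logic. LangGraph lets us declare the same structure as nodes and edges instead of burying it in `if`/`for` statements.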

The "How": Building the agent's components

LangGraph works by defining "nodes" (the steps) and "edges" (the connections). First, let's build our "nodes."

Brick 1: The "Memory" (The GraphState)

Before we build the "nodes," we need to define our agent's "memory." A GraphState is a simple Python object (a TypedDict) that gets passed from node to node. Every node can read from and write to this "memory."

from typing import List, TypedDict

# This is the "memory" of our agent.
# Every node will have access to this.
class GraphState(TypedDict):
    question: str       # The user's query
    documents: List[str]  # The retrieved documents
    generation: str     # The final answer

Observation: This is the most important concept. By defining a shared "state," our generate node can see what the retrieve node found.
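To make "passed from node to node" concrete, here's a tiny simulation of how each node's returned dictionary gets merged into the shared state. The `fake_retrieve` and `fake_generate` functions are stand-ins, not the real nodes; LangGraph performs the merge step for you:

```python
from typing import List, TypedDict

class GraphState(TypedDict, total=False):
    question: str
    documents: List[str]
    generation: str

def fake_retrieve(state: GraphState) -> dict:
    # A node only returns the keys it wants to update.
    return {"documents": [f"doc about: {state['question']}"]}

def fake_generate(state: GraphState) -> dict:
    # Because state is shared, generate can read what retrieve wrote.
    return {"generation": f"Answer drawn from {len(state['documents'])} doc(s)"}

state: GraphState = {"question": "What is LangGraph?"}
state.update(fake_retrieve(state))   # LangGraph does this merge for you
state.update(fake_generate(state))
print(state["generation"])
```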

Brick 2: The "Tools" (Our nodes)

Now, we define our "tools" as plain Python functions. Each function takes the current state as input and returns a dictionary to update that state.

We need two tools to fetch information:

  1. retrieve: Searches our internal ChromaDB (from Post 1).
  2. web_search: Searches the public internet.

The "How":

from langchain_community.tools import DuckDuckGoSearchRun
import chromadb

# Initialize our tools
search_tool = DuckDuckGoSearchRun()
chroma_client = chromadb.PersistentClient(path="./chroma_db") # Must point at the same path used in Post 1
collection = chroma_client.get_collection(name="product_docs") # Get our DB from Post 1

# --- Node 1: The Internal Retriever ---
def retrieve(state):
    print("---NODE: RETRIEVE---")
    question = state["question"]
    
    # Retrieve from our internal vector store
    documents = collection.query(
        query_texts=[question],
        n_results=3
    )['documents'][0]
    
    return {"documents": documents, "question": question}

# --- Node 2: The Web Searcher ---
def web_search(state):
    print("---NODE: WEB_SEARCH---")
    question = state["question"]
    
    # Call the DuckDuckGo tool
    search_result = search_tool.run(question)
    
    # We wrap the single string in a list to match our state's type
    documents = [search_result] 
    return {"documents": documents, "question": question}

Observation: We've built the "hands" of our agent. It now has two different ways to find information. But how does it know which one to use? And how does it generate the final answer?

Think About It: Our web_search node is a bit "dumb"—it just returns one long string from DuckDuckGo. How could we make this node "smarter"? (Hint: What if it retrieved multiple search results and "chunked" them?)
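One possible upgrade, sketched below: fetch several results and split each into chunks, so a downstream grader can judge passages individually instead of one long blob. The `chunk_text` helper is ours (hypothetical, not a LangChain API), and `search_results` stands in for the output of a multi-result search tool such as `DuckDuckGoSearchResults`:

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    # Naive fixed-size chunker with overlap; a real pipeline might use
    # LangChain's RecursiveCharacterTextSplitter instead.
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

def smarter_web_search(state):
    # Hypothetical drop-in replacement for our web_search node.
    # In practice, search_results would come from a multi-result search tool.
    search_results = ["first long result " * 20, "second long result " * 20]
    documents = []
    for result in search_results:
        documents.extend(chunk_text(result))
    return {"documents": documents, "question": state["question"]}

out = smarter_web_search({"question": "latest LangGraph release"})
print(len(out["documents"]), "chunks")
```

With the results chunked, the grader we build in the next post can keep the useful passages and discard the rest, rather than accepting or rejecting the entire search output at once.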

Next step

We've built our agent's "memory" (GraphState) and its "tools" (retrieve, web_search).

In our next post, we'll build the "brain" itself:

  1. The Router: The decision node that chooses which tool to use first.
  2. The Grader: The "self-correction" node that grades the results.
  3. The Generator: The node that writes the final answer.

Key takeaways

  • Linear chains are brittle: When one step fails, the entire pipeline fails—we need graphs that can branch and loop
  • State management is critical: GraphState allows nodes to share information and make decisions based on previous steps
  • Tools are just functions: Each tool is a simple Python function that takes state and returns updated state
  • LangGraph enables dynamic flow: Unlike linear chains, graphs can have conditional edges and cycles for self-correction
  • Foundation before logic: We build the tools first, then add the decision-making nodes that connect them

For more on LangGraph and agent frameworks, see our agent framework comparison.


For more on building production AI systems, check out our AI Bootcamp for Software Engineers.
