I built a multi-agent AI system from scratch using LangGraph to automate complex development tasks through specialized agents that collaborate. This project demonstrates how to orchestrate multiple AI agents to handle research, coding, and review in a structured pipeline, moving beyond single-prompt interactions. You can explore the code and my other projects at suhailroushan.com.
Architecture Overview
The core idea is a supervisor agent that delegates tasks to specialized sub-agents: a researcher, a coder, and a reviewer. The system uses a directed graph to manage the workflow, where each node is an agent and edges define the transition paths based on the output. State management is centralized, allowing each agent to read from and write to a shared context.
graph TD
A[Supervisor Agent] -->|Research Task| B[Researcher Agent];
A -->|Coding Task| C[Coder Agent];
A -->|Review Task| D[Reviewer Agent];
B -->|Findings| E[Shared State];
C -->|Code| E;
D -->|Feedback| E;
E -->|Next Decision| A;
This cycle continues until the supervisor determines the task is complete. I used Redis for persistent state storage, which is crucial for debugging and resuming long-running operations.
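The persistence idea is what LangGraph calls a checkpointer: after each step, a snapshot of the shared state is saved under a thread ID so a crashed or paused run can resume from the last snapshot. Here is a minimal plain-TypeScript sketch of that concept (the `CheckpointStore` class and snapshot shape are illustrative, not LangGraph's actual checkpointer API; in production the map would be backed by Redis):

```typescript
// Minimal sketch of checkpoint-style persistence: each graph step saves a
// snapshot keyed by thread ID, so an interrupted run can resume from the
// most recent one.
type Snapshot = { step: number; state: Record<string, unknown> };

class CheckpointStore {
  // In production this map would be backed by Redis (e.g. one hash per thread).
  private store = new Map<string, Snapshot[]>();

  save(threadId: string, snapshot: Snapshot): void {
    const history = this.store.get(threadId) ?? [];
    history.push(snapshot);
    this.store.set(threadId, history);
  }

  // Resume from the most recent snapshot, or undefined for a fresh run.
  latest(threadId: string): Snapshot | undefined {
    const history = this.store.get(threadId);
    return history?.[history.length - 1];
  }
}

const store = new CheckpointStore();
store.save("thread-1", { step: 1, state: { nextAgent: "researcher" } });
store.save("thread-1", { step: 2, state: { nextAgent: "coder" } });
console.log(store.latest("thread-1")?.step); // 2
```

Keeping the full snapshot history, rather than only the latest state, is what makes post-mortem debugging of a bad run possible.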
Key Technical Decisions
Using LangGraph’s StateGraph was the foundational choice. It forces you to define a clear schema for the shared state, which becomes the single source of truth for all agents. Here’s how I defined the state and built the graph in TypeScript.
import { StateGraph, Annotation, START, END } from "@langchain/langgraph";
import { BaseMessage } from "@langchain/core/messages";
// Define the shared state schema
const StateSchema = Annotation.Root({
messages: Annotation<BaseMessage[]>({
reducer: (current, update) => current.concat(update),
}),
researchFindings: Annotation<string>({
reducer: (current, update) => update || current,
}),
codeArtifact: Annotation<string>({
reducer: (current, update) => update || current,
}),
reviewFeedback: Annotation<string>({
reducer: (current, update) => update || current,
}),
nextAgent: Annotation<string>({
reducer: (current, update) => update || current,
}),
});
// Initialize the graph
const workflow = new StateGraph(StateSchema)
.addNode("supervisor", supervisorNode)
.addNode("researcher", researcherNode)
.addNode("coder", coderNode)
.addNode("reviewer", reviewerNode);
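The reducer semantics are the important part of this schema: `messages` accumulates across agents, while the other fields are last-write-wins with empty updates ignored. The two merge strategies can be illustrated in plain TypeScript, independent of LangGraph:

```typescript
// Append-reducer: every update is concatenated, so no agent's messages are lost.
const appendReducer = <T>(current: T[], update: T[]): T[] => current.concat(update);

// Last-write-wins reducer: a new value replaces the old one; an empty
// update leaves the current value untouched.
const replaceReducer = (current: string, update: string): string => update || current;

let messages: string[] = ["user: build a login form"];
messages = appendReducer(messages, ["supervisor: RESEARCH"]);
messages = appendReducer(messages, ["researcher: findings..."]);
console.log(messages.length); // 3

let findings = "";
findings = replaceReducer(findings, "React form patterns");
findings = replaceReducer(findings, ""); // empty update: value preserved
console.log(findings); // "React form patterns"
```

This is what prevents one agent's partial update from clobbering another agent's work.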
Another key decision was implementing a conditional edge function for the supervisor. Instead of hardcoding transitions, the supervisor's output determines the next node.
// Conditional edge logic: the supervisor's last message decides the next node.
// START and END come from "@langchain/langgraph".
function routeAssistant(state: typeof StateSchema.State): string {
const { messages } = state;
const lastMessage = messages[messages.length - 1];
const content = lastMessage.content as string;
if (content.includes("RESEARCH")) {
return "researcher";
} else if (content.includes("CODE")) {
return "coder";
} else if (content.includes("REVIEW")) {
return "reviewer";
}
return END; // Terminate the graph
}
// Wire the graph: the conditional edge owns all transitions out of the
// supervisor (including termination), and every sub-agent reports back to it.
workflow
.addEdge(START, "supervisor")
.addConditionalEdges("supervisor", routeAssistant)
.addEdge("researcher", "supervisor")
.addEdge("coder", "supervisor")
.addEdge("reviewer", "supervisor");
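Because the router is a pure function of the last message, it is easy to unit-test without running the graph at all. A simplified standalone version (keyword matching only; `"__end__"` is the string value behind LangGraph's END sentinel):

```typescript
// Standalone keyword router, mirroring the conditional edge logic above.
function route(lastMessageContent: string): string {
  if (lastMessageContent.includes("RESEARCH")) return "researcher";
  if (lastMessageContent.includes("CODE")) return "coder";
  if (lastMessageContent.includes("REVIEW")) return "reviewer";
  return "__end__"; // LangGraph's terminal node name
}

console.log(route("Next step: RESEARCH the auth flow")); // "researcher"
console.log(route("Task complete."));                    // "__end__"
```

One caveat with keyword routing: the checks run in order, so a supervisor message mentioning both "RESEARCH" and "CODE" always routes to the researcher.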
What Broke and How I Fixed It
The first major issue was state corruption. Initially, I let each agent directly modify the global state object, which led to race conditions and overwritten data during asynchronous operations. The fix was to use LangGraph's annotation reducers properly, as shown above. Each field's reducer function defines how updates are merged, preventing destructive writes.
The second problem was agent loops. The supervisor would sometimes get stuck delegating between the coder and reviewer indefinitely. I added a cycle-detection mechanism by including an iterationCount in the state and setting a hard limit.
const StateSchemaWithLimit = Annotation.Root({
// ... other fields
iterationCount: Annotation<number>({
reducer: (current) => (current || 0) + 1,
}),
});
// In the conditional edge function
function routeAssistantWithLimit(state: typeof StateSchemaWithLimit.State): string {
if (state.iterationCount > 10) {
console.warn("Iteration limit reached, terminating.");
return "end";
}
// ... original routing logic
}
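The guard works because the reducer fires once per update regardless of the value an agent returns, so the field behaves as a step counter. That behavior can be checked in isolation (plain TypeScript, no LangGraph dependency):

```typescript
// Counter-style reducer: increments on every update, ignoring the update value.
const countReducer = (current: number | undefined): number => (current ?? 0) + 1;

let iterationCount: number | undefined;
for (let i = 0; i < 12; i++) {
  iterationCount = countReducer(iterationCount);
}
console.log(iterationCount);      // 12
console.log(iterationCount! > 10); // true: the router would terminate here
```

A hard limit like this is a blunt instrument, but it guarantees the graph halts even when the supervisor's reasoning loops.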
How to Build Something Similar
Start by defining your state schema meticulously—every piece of data your agents need should be in there. Then, build and test each agent node in isolation before composing the graph. Use a simple in-memory store first, then add Redis for persistence.
Here’s a minimal researcher agent node to illustrate the pattern.
import { ChatOpenAI } from "@langchain/openai";
const researchModel = new ChatOpenAI({ modelName: "gpt-4-turbo" });
async function researcherNode(state: typeof StateSchema.State) {
const { messages } = state;
// Message content can be multimodal; this node assumes plain text.
const lastUserMessage = messages[messages.length - 1].content as string;
const prompt = `You are a research assistant. Based on the query: "${lastUserMessage}", find relevant information and summarize key findings.`;
const response = await researchModel.invoke([{ role: "user", content: prompt }]);
return {
messages: [response],
researchFindings: response.content as string,
nextAgent: "supervisor",
};
}
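Testing a node like this in isolation is easiest with a stubbed model, so no API key or network call is needed. A sketch of that pattern (the `Model` interface, stub, and `researcherNodeWith` helper are illustrative, not LangChain APIs):

```typescript
// Stub-model pattern for testing an agent node in isolation.
type Message = { role: string; content: string };

interface Model {
  invoke(messages: Message[]): Promise<Message>;
}

// Canned response standing in for a real LLM call.
const stubModel: Model = {
  invoke: async () => ({
    role: "assistant",
    content: "Key findings: use controlled inputs.",
  }),
};

// Same shape as the researcher node, with the model injected for testability.
async function researcherNodeWith(model: Model, state: { messages: Message[] }) {
  const lastUserMessage = state.messages[state.messages.length - 1].content;
  const prompt = `You are a research assistant. Summarize findings for: "${lastUserMessage}"`;
  const response = await model.invoke([{ role: "user", content: prompt }]);
  return {
    messages: [response],
    researchFindings: response.content,
    nextAgent: "supervisor",
  };
}

researcherNodeWith(stubModel, {
  messages: [{ role: "user", content: "Login form with React" }],
}).then((update) => console.log(update.nextAgent)); // "supervisor"
```

Injecting the model also makes it trivial to swap in a cheaper model per node later.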
Compile your graph and run it with an entry point.
const app = workflow.compile();
const finalState = await app.invoke({
messages: [{ role: "user", content: "Build a login form with React and Tailwind CSS." }],
nextAgent: "supervisor",
});
Would I Build It the Same Way Again?
For a production system, I would still use LangGraph for orchestration because its stateful graph model is ideal for multi-agent workflows. However, I'd likely replace the OpenAI call used for agent routing with a simpler, rule-based classifier to reduce latency and cost. The supervisor prompt was sometimes overkill for deciding between "coder" and "reviewer."
I would also consider breaking the monolithic graph into smaller, nested subgraphs for complex tasks. This improves modularity and makes debugging easier. LangGraph supports this, but I didn't implement it initially.
The one thing you should know before starting is that your graph is only as robust as your state schema—spend time designing it upfront.