If you've used Claude Code, you've seen what an AI agent can actually do: read files, run commands, edit code, and figure out the steps to accomplish a task.
It doesn't just help you write code. It takes ownership of problems and works through them the way a thoughtful engineer would.
The Claude Agent SDK is the same engine, yours to point at whatever problem you want, so you can build that same capability into your own applications.
It's the infrastructure behind Claude Code, exposed as a library. You get the agent loop, the built-in tools, and the context management: all the stuff you'd otherwise have to build yourself.
This guide walks through building a code review agent from scratch. By the end, you'll have something that can analyze a codebase, find bugs and security issues, and return structured feedback. More importantly, you'll understand how the SDK works so you can build whatever you actually need.
Our code review agent will:
- Analyze a codebase for bugs and security issues
- Read files and search through code autonomously
- Provide structured, actionable feedback
- Track its progress as it works
• Runtime - Claude Code CLI
• SDK - @anthropic-ai/claude-agent-sdk
• Language - TypeScript
• Model - Claude Opus 4.5
If you've built agents with the raw API, you know the pattern: call the model, check if it wants to use a tool, execute the tool, feed the result back, repeat until done. It's tedious.
The SDK handles that loop:
// Without SDK: You manage the loop
let response = await client.messages.create({...});
while (response.stop_reason === "tool_use") {
const result = yourToolExecutor(response.tool_use);
response = await client.messages.create({ tool_result: result, ... });
}
// With SDK: Claude manages it
for await (const message of query({ prompt: "Fix the bug in auth.py" })) {
console.log(message); // Claude reads files, finds bugs, edits code
}

You also get working tools out of the box:
| Tool | What it does |
|---|---|
| Read | Read any file in the working directory |
| Write | Create new files |
| Edit | Make precise edits to existing files |
| Bash | Run terminal commands |
| Glob | Find files by pattern |
| Grep | Search file contents with regex |
| WebSearch | Search the web |
| WebFetch | Fetch and parse web pages |
You don't have to implement any of this yourself.
- Node.js 18+ installed
- An Anthropic API key (get one here)
The Agent SDK uses Claude Code as its runtime:
npm install -g @anthropic-ai/claude-code

After installing, run claude in your terminal and follow the prompts to authenticate.
mkdir code-review-agent && cd code-review-agent
npm init -y
npm install @anthropic-ai/claude-agent-sdk
npm install -D typescript @types/node tsx

export ANTHROPIC_API_KEY=your-api-key

Create agent.ts:
import { query } from "@anthropic-ai/claude-agent-sdk";
async function main() {
for await (const message of query({
prompt: "What files are in this directory?",
options: {
model: "opus",
allowedTools: ["Glob", "Read"],
maxTurns: 50
}
})) {
if (message.type === "assistant") {
for (const block of message.message.content) {
if ("text" in block) {
console.log(block.text);
}
}
}
if (message.type === "result") {
console.log("\nDone:", message.subtype);
}
}
}
main();

Run it:

npx tsx agent.ts

Claude will use the Glob tool to list files and tell you what it found.
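The text-filtering logic in agent.ts can be pulled into a small helper and exercised without calling the API. This is a minimal sketch: the `ContentBlock` and `AgentMessage` types are simplified stand-ins I am assuming from the fields agent.ts reads, not the SDK's real type definitions, and `extractText` is a hypothetical helper name.

```typescript
// Simplified stand-ins for the SDK's message types (assumed shapes, based on
// the fields agent.ts reads; the real types carry more fields).
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "tool_use"; name: string; input: unknown };

type AgentMessage =
  | { type: "assistant"; message: { content: ContentBlock[] } }
  | { type: "result"; subtype: string };

// Collect the text blocks of an assistant message; other messages yield [].
function extractText(message: AgentMessage): string[] {
  if (message.type !== "assistant") return [];
  const texts: string[] = [];
  for (const block of message.message.content) {
    if ("text" in block) texts.push(block.text);
  }
  return texts;
}

// A hand-written message that mixes a tool call with a text reply.
const sample: AgentMessage = {
  type: "assistant",
  message: {
    content: [
      { type: "tool_use", name: "Glob", input: { pattern: "*" } },
      { type: "text", text: "Found agent.ts and package.json." },
    ],
  },
};

console.log(extractText(sample)); // [ 'Found agent.ts and package.json.' ]
```

Keeping this logic in a pure function makes it easy to unit-test your output handling separately from the agent run itself.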
The query() function returns an async generator that streams messages as Claude works. Here are the key message types:
for await (const message of query({ prompt: "..." })) {
switch (message.type) {
case "system":
// Session initialization info
if (message.subtype === "init") {
console.log("Session ID:", message.session_id);
console.log("Available tools:", message.tools);
}
break;
case "assistant":
// Claude's responses and tool calls
for (const block of message.message.content) {
if ("text" in block) {
console.log("Claude:", block.text);
} else if ("name" in block) {
console.log("Tool call:", block.name);
}
}
break;
case "result":
// Final result
console.log("Status:", message.subtype); // "success" or error type
console.log("Cost:", message.total_cost_usd);
break;
}
}

Now let's build something useful. Create review-agent.ts:
import { query } from "@anthropic-ai/claude-agent-sdk";
async function reviewCode(directory: string) {
console.log(`\nStarting code review for: ${directory}\n`);
for await (const message of query({
prompt: `Review the code in ${directory} for:
1. Bugs and potential crashes
2. Security vulnerabilities
3. Performance issues
4. Code quality improvements
Be specific about file names and line numbers.`,
options: {
model: "opus",
allowedTools: ["Read", "Glob", "Grep"],
permissionMode: "bypassPermissions", // Skips all permission prompts; acceptable here because only read-only tools are allowed
maxTurns: 50
}
})) {
// Show Claude's analysis as it happens
if (message.type === "assistant") {
for (const block of message.message.content) {
if ("text" in block) {
console.log(block.text);
} else if ("name" in block) {
console.log(`\nUsing ${block.name}...`);
}
}
}
// Show completion status
if (message.type === "result") {
if (message.subtype === "success") {
console.log(`\n✅ Review complete! Cost: $${message.total_cost_usd.toFixed(4)}`);
} else {
console.log(`\n❌ Review failed: ${message.subtype}`);
}
}
}
}
// Review the current directory
reviewCode(".");

Create a file with some intentional issues. Create example.ts:
function processUsers(users: any) {
for (let i = 0; i <= users.length; i++) { // Off-by-one error
console.log(users[i].name.toUpperCase()); // No null check
}
}
function connectToDb(password: string) {
const connectionString = `postgres://admin:${password}@localhost/db`;
console.log("Connecting with:", connectionString); // Logging sensitive data
}
async function fetchData(url) { // Missing type annotation
const response = await fetch(url);
return response.json(); // No error handling
}

Run the review:

npx tsx review-agent.ts

Claude will identify the bugs and security issues and suggest fixes.
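For comparison, here is one way the issues planted in example.ts might be fixed. This is a sketch of plausible fixes, not the agent's actual output; the `User` interface and the choice to return the processed names (so the function is easy to test) are my assumptions.

```typescript
// Assumed shape of a user record; the original used `any`.
interface User {
  name?: string | null;
}

// Fix 1: iterate with for...of (no off-by-one) and skip users without a name.
function processUsers(users: User[]): string[] {
  const names: string[] = [];
  for (const user of users) {
    if (user?.name) {
      const upper = user.name.toUpperCase();
      console.log(upper);
      names.push(upper);
    }
  }
  return names;
}

// Fix 2: keep the password out of the logs by printing a redacted string.
function connectToDb(password: string): void {
  const connectionString = `postgres://admin:${password}@localhost/db`;
  console.log("Connecting with: postgres://admin:****@localhost/db");
  // ... pass connectionString to your database client here
  void connectionString;
}

// Fix 3: type the parameter and surface HTTP errors instead of parsing blindly.
async function fetchData(url: string): Promise<unknown> {
  const response = await fetch(url);
  if (!response.ok) {
    throw new Error(`Request failed: ${response.status} ${response.statusText}`);
  }
  return response.json();
}

processUsers([{ name: "ada" }, {}, { name: null }]); // prints "ADA"
```

Comparing the agent's suggestions against a reference like this is a quick sanity check that the review caught every planted issue.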
For programmatic use, you'll want structured data. The SDK supports JSON Schema output:
import { query } from "@anthropic-ai/claude-agent-sdk";
const reviewSchema = {
type: "object",
properties: {
issues: {
type: "array",
items: {
type: "object",
properties: {
severity: { type: "string", enum: ["low", "medium", "high", "critical"] },
category: { type: "string", enum: ["bug", "security", "performance", "style"] },
file: { type: "string" },
line: { type: "number" },
description: { type: "string" },
suggestion: { type: "string" }
},
required: ["severity", "category", "file", "description"]
}
},
summary: { type: "string" },
overallScore: { type: "number" }
},
required: ["issues", "summary", "overallScore"]
};
async function reviewCodeStructured(directory: string) {
for await (const message of query({
prompt: `Review the code in ${directory}. Identify all issues.`,
options: {
model: "opus",
allowedTools: ["Read", "Glob", "Grep"],
permissionMode: "bypassPermissions",
maxTurns: 50,
outputFormat: {
type: "json_schema",
schema: reviewSchema
}
}
})) {
if (message.type === "result" && message.subtype === "success") {
const review = message.structured_output as {
issues: Array<{
severity: string;
category: string;
file: string;
line?: number;
description: string;
suggestion?: string;
}>;
summary: string;
overallScore: number;
};
console.log(`\nCode Review Results\n`);
console.log(`Score: ${review.overallScore}/100`);
console.log(`Summary: ${review.summary}\n`);
for (const issue of review.issues) {
const icon = issue.severity === "critical" ? "🔴" :
issue.severity === "high" ? "🟠" :
issue.severity === "medium" ? "🟡" : "🟢";
console.log(`${icon} [${issue.category.toUpperCase()}] ${issue.file}${issue.line ? `:${issue.line}` : ""}`);
console.log(` ${issue.description}`);
if (issue.suggestion) {
console.log(` 💡 ${issue.suggestion}`);
}
console.log();
}
}
}
}
reviewCodeStructured(".");

By default, the SDK asks for approval before executing tools. You can customize this:
options: {
// Pick one mode (an object literal cannot repeat the permissionMode key):
// "default" - standard mode, prompts for approval
// "acceptEdits" - auto-approve file edits
// "bypassPermissions" - no prompts (use with caution)
permissionMode: "default"
}

For fine-grained control, use canUseTool:
options: {
canUseTool: async (toolName, input) => {
// Allow all read operations
if (["Read", "Glob", "Grep"].includes(toolName)) {
return { behavior: "allow", updatedInput: input };
}
// Block writes to certain files
if (toolName === "Write" && input.file_path?.includes(".env")) {
return { behavior: "deny", message: "Cannot modify .env files" };
}
// Allow everything else
return { behavior: "allow", updatedInput: input };
}
}

For complex tasks, you can create specialized subagents:
import { query, AgentDefinition } from "@anthropic-ai/claude-agent-sdk";
async function comprehensiveReview(directory: string) {
for await (const message of query({
prompt: `Perform a comprehensive code review of ${directory}.
Use the security-reviewer for security issues and test-analyzer for test coverage.`,
options: {
model: "opus",
allowedTools: ["Read", "Glob", "Grep", "Task"], // Task enables subagents
permissionMode: "bypassPermissions",
maxTurns: 50,
agents: {
"security-reviewer": {
description: "Security specialist for vulnerability detection",
prompt: `You are a security expert. Focus on:
- SQL injection, XSS, CSRF vulnerabilities
- Exposed credentials and secrets
- Insecure data handling
- Authentication/authorization issues`,
tools: ["Read", "Grep", "Glob"],
model: "sonnet"
} as AgentDefinition,
"test-analyzer": {
description: "Test coverage and quality analyzer",
prompt: `You are a testing expert. Analyze:
- Test coverage gaps
- Missing edge cases
- Test quality and reliability
- Suggestions for additional tests`,
tools: ["Read", "Grep", "Glob"],
model: "haiku" // Use faster model for simpler analysis
} as AgentDefinition
}
}
})) {
if (message.type === "assistant") {
for (const block of message.message.content) {
if ("text" in block) {
console.log(block.text);
} else if ("name" in block && block.name === "Task") {
console.log(`\n🤖 Delegating to: ${(block.input as any).subagent_type}`);
}
}
}
}
}
comprehensiveReview(".");

For multi-turn conversations, capture and resume sessions:
import { query } from "@anthropic-ai/claude-agent-sdk";
async function interactiveReview() {
let sessionId: string | undefined;
// Initial review
for await (const message of query({
prompt: "Review this codebase and identify the top 3 issues",
options: {
model: "opus",
allowedTools: ["Read", "Glob", "Grep"],
permissionMode: "bypassPermissions",
maxTurns: 50
}
})) {
if (message.type === "system" && message.subtype === "init") {
sessionId = message.session_id;
}
// ... handle messages
}
// Follow-up question using same session
if (sessionId) {
for await (const message of query({
prompt: "Now show me how to fix the most critical issue",
options: {
resume: sessionId, // Continue the conversation
allowedTools: ["Read", "Glob", "Grep"],
maxTurns: 50
}
})) {
// Claude remembers the previous context
}
}
}

Hooks let you intercept and customize agent behavior:
import { query, HookCallback, PreToolUseHookInput } from "@anthropic-ai/claude-agent-sdk";
// Hook callbacks receive three arguments:
// 1. input - details about the event (tool name, arguments, etc.)
// 2. toolUseId - correlates PreToolUse and PostToolUse events for the same call
// 3. context - contains AbortSignal for cancellation
const auditLogger: HookCallback = async (input, toolUseId, { signal }) => {
if (input.hook_event_name === "PreToolUse") {
const preInput = input as PreToolUseHookInput;
console.log(`[AUDIT] ${new Date().toISOString()} - ${preInput.tool_name}`);
}
return {}; // Return empty object to allow the operation
};
const blockDangerousCommands: HookCallback = async (input, toolUseId, { signal }) => {
if (input.hook_event_name === "PreToolUse") {
const preInput = input as PreToolUseHookInput;
if (preInput.tool_name === "Bash") {
const command = (preInput.tool_input as any).command || "";
if (command.includes("rm -rf") || command.includes("sudo")) {
return {
hookSpecificOutput: {
hookEventName: "PreToolUse",
permissionDecision: "deny", // Block the tool from executing
permissionDecisionReason: "Dangerous command blocked"
}
};
}
}
}
return {};
};
for await (const message of query({
prompt: "Clean up temporary files",
options: {
model: "opus",
allowedTools: ["Bash", "Glob"],
maxTurns: 50,
hooks: {
// PreToolUse fires before each tool executes
// Other hooks: PostToolUse, Stop, SessionStart, SessionEnd, etc.
PreToolUse: [
// Each entry has an optional matcher (regex) and an array of callbacks
// No matcher = runs for ALL tool calls
{ hooks: [auditLogger] },
// matcher: 'Bash' = only runs when tool_name matches 'Bash'
// Use regex for multiple tools: 'Bash|Write|Edit'
{ matcher: "Bash", hooks: [blockDangerousCommands] }
]
}
}
})) {
if (message.type === "assistant") {
for (const block of message.message.content) {
if ("text" in block) {
console.log(block.text);
}
}
}
}
Tools are how agents interact with the world: reading files, calling APIs, querying databases, running code. The SDK includes built-in tools for common operations (filesystem, shell, web), but most agents will need custom tools to access your own systems.
The raw API pattern
Without the SDK, you manage the tool loop yourself:
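import Anthropic from "@anthropic-ai/sdk";
const client = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment
// 1. Define tools with their schemas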
const tools = [{
name: "get_weather",
description: "Get current weather for a city",
input_schema: {
type: "object",
properties: {
city: { type: "string", description: "City name" }
},
required: ["city"]
}
}];
// 2. Write an executor for each tool
function executeTool(name: string, input: any): string {
if (name === "get_weather") {
return fetchWeatherAPI(input.city); // fetchWeatherAPI is a placeholder for your own weather lookup
}
throw new Error(`Unknown tool: ${name}`);
}
// 3. Run the agent loop
const model = "claude-opus-4-5-20251101";
const messages: any[] = [{ role: "user", content: "What's the weather in Tokyo?" }];
let response = await client.messages.create({
model,
max_tokens: 1024, // required by the Messages API
tools,
messages
});
while (response.stop_reason === "tool_use") {
// Echo the assistant turn back, then answer every tool_use block it contains
messages.push({ role: "assistant", content: response.content });
const toolResults = response.content
.filter(block => block.type === "tool_use")
.map(toolUse => ({
type: "tool_result",
tool_use_id: toolUse.id,
content: executeTool(toolUse.name, toolUse.input)
}));
messages.push({ role: "user", content: toolResults });
response = await client.messages.create({ model, max_tokens: 1024, tools, messages });
}
const textBlock = response.content.find(block => block.type === "text");
if (textBlock && textBlock.type === "text") {
console.log("Final response:", textBlock.text);
}

Key points:
- Claude decides when to use tools based on the user's request and tool descriptions
- You execute the tools and return results
- The loop continues until Claude has enough information (stop_reason: "end_turn")
- Message history grows with each iteration: the API is stateless, so every request needs the full conversation
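To see the history growth concretely, the loop can be run against a stand-in for the model. This is a sketch under stated assumptions: `fakeModel` is a hypothetical synchronous function that makes no network calls, `executeTool` returns a canned answer, and the types are simplified versions of the message shapes used above.

```typescript
type Block =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string; input: { city: string } }
  | { type: "tool_result"; tool_use_id: string; content: string };

interface Message { role: "user" | "assistant"; content: string | Block[] }
interface ModelResponse { stop_reason: "tool_use" | "end_turn"; content: Block[] }

// Hypothetical stand-in for the model: requests the weather once, then answers
// as soon as the last message in the history contains a tool_result.
function fakeModel(messages: Message[]): ModelResponse {
  const last = messages[messages.length - 1];
  const sawResult = Array.isArray(last.content) &&
    last.content.some(b => b.type === "tool_result");
  if (!sawResult) {
    return {
      stop_reason: "tool_use",
      content: [{ type: "tool_use", id: "call_1", name: "get_weather", input: { city: "Tokyo" } }],
    };
  }
  return { stop_reason: "end_turn", content: [{ type: "text", text: "It is sunny in Tokyo." }] };
}

// Hypothetical tool executor with a canned answer.
function executeTool(name: string, input: { city: string }): string {
  if (name === "get_weather") return `Sunny in ${input.city}`;
  throw new Error(`Unknown tool: ${name}`);
}

function runLoop(prompt: string): { messages: Message[]; answer: string } {
  const messages: Message[] = [{ role: "user", content: prompt }];
  let response = fakeModel(messages);
  while (response.stop_reason === "tool_use") {
    messages.push({ role: "assistant", content: response.content });
    const results: Block[] = [];
    for (const block of response.content) {
      if (block.type === "tool_use") {
        results.push({
          type: "tool_result",
          tool_use_id: block.id,
          content: executeTool(block.name, block.input),
        });
      }
    }
    messages.push({ role: "user", content: results });
    response = fakeModel(messages); // the full history is passed every time
  }
  const text = response.content.find(b => b.type === "text");
  return { messages, answer: text && text.type === "text" ? text.text : "" };
}

const run = runLoop("What's the weather in Tokyo?");
console.log(run.messages.length); // 3
console.log(run.answer); // It is sunny in Tokyo.
```

A single tool call already turns one user message into a three-message history, and each further call adds two more.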
What the SDK handles
When you use built-in tools like allowedTools: ["Read", "Glob"], the SDK manages all of this automatically - definitions, execution, and the loop.
For custom tools, you need a way to define them so the SDK can do the same. That's what MCP provides.
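Two conventions come up when wiring custom tools into the SDK: tool names in allowedTools follow the pattern mcp__<server-name>__<tool-name>, and a tool handler returns its result as an array of content blocks. The helpers below are hypothetical conveniences for those two conventions, not part of the SDK.

```typescript
// Hypothetical helpers, not part of the SDK.

// Build the name under which an MCP tool is listed in allowedTools.
function mcpToolName(serverName: string, toolName: string): string {
  return `mcp__${serverName}__${toolName}`;
}

// Wrap a string in the content-block array that a tool handler returns.
function textResult(text: string): { content: Array<{ type: "text"; text: string }> } {
  return { content: [{ type: "text", text }] };
}

console.log(mcpToolName("code-metrics", "analyze_complexity"));
// mcp__code-metrics__analyze_complexity
```

Building the name from the server key and tool name in one place avoids typos between the mcpServers entry and the allowedTools list.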
Extend Claude with custom tools using Model Context Protocol:
import { query, tool, createSdkMcpServer } from "@anthropic-ai/claude-agent-sdk";
import { z } from "zod";
// Create your custom MCP server
const customServer = createSdkMcpServer({
name: "code-metrics",
version: "1.0.0",
tools: [
// Define a custom tool using the `tool` helper
// Arguments: name, description, input schema, handler function
tool(
"analyze_complexity",
"Calculate cyclomatic complexity for a file",
{
// Zod schema defines what inputs the tool accepts
filePath: z.string().describe("Path to the file to analyze")
},
// Handler function - runs when Claude calls the tool
async (args) => {
// In real implementation, calculate actual complexity
const complexity = Math.floor(Math.random() * 20) + 1;
// Return format required by MCP - array of content blocks
return {
content: [{
type: "text",
text: `Cyclomatic complexity for ${args.filePath}: ${complexity}`
}]
};
}
)
]
});
async function analyzeCode(filePath: string) {
for await (const message of query({
prompt: `Analyze the complexity of ${filePath}`,
options: {
model: "opus",
// Register the custom MCP server
// The key ("code-metrics") becomes part of the tool name
mcpServers: {
"code-metrics": customServer
},
// Specify which tools Claude can use
// MCP tools follow the pattern: mcp__<server-name>__<tool-name>
allowedTools: ["Read", "mcp__code-metrics__analyze_complexity"],
// Maximum number of back-and-forth turns before stopping
maxTurns: 50
}
})) {
// Handle assistant messages (Claude's responses and tool calls)
if (message.type === "assistant") {
for (const block of message.message.content) {
// Text blocks contain Claude's written responses
if ("text" in block) {
console.log(block.text);
}
}
}
// Handle the final result when the agent loop completes
if (message.type === "result") {
console.log("Done:", message.subtype); // "success" or an error type
}
}
}
analyzeCode("main.ts");

Track API costs for billing:
for await (const message of query({ prompt: "..." })) {
if (message.type === "result" && message.subtype === "success") {
console.log("Total cost:", message.total_cost_usd);
console.log("Token usage:", message.usage);
// Per-model breakdown (useful with subagents)
for (const [model, usage] of Object.entries(message.modelUsage)) {
console.log(`${model}: $${usage.costUSD.toFixed(4)}`);
}
}
}

Here's a production-ready agent that ties everything together:
import { query, AgentDefinition } from "@anthropic-ai/claude-agent-sdk";
interface ReviewResult {
issues: Array<{
severity: "low" | "medium" | "high" | "critical";
category: "bug" | "security" | "performance" | "style";
file: string;
line?: number;
description: string;
suggestion?: string;
}>;
summary: string;
overallScore: number;
}
const reviewSchema = {
type: "object",
properties: {
issues: {
type: "array",
items: {
type: "object",
properties: {
severity: { type: "string", enum: ["low", "medium", "high", "critical"] },
category: { type: "string", enum: ["bug", "security", "performance", "style"] },
file: { type: "string" },
line: { type: "number" },
description: { type: "string" },
suggestion: { type: "string" }
},
required: ["severity", "category", "file", "description"]
}
},
summary: { type: "string" },
overallScore: { type: "number" }
},
required: ["issues", "summary", "overallScore"]
};
async function runCodeReview(directory: string): Promise<ReviewResult | null> {
console.log(`\n${"=".repeat(50)}`);
console.log(`Code Review Agent`);
console.log(`Directory: ${directory}`);
console.log(`${"=".repeat(50)}\n`);
let result: ReviewResult | null = null;
for await (const message of query({
prompt: `Perform a thorough code review of ${directory}.
Analyze all source files for:
1. Bugs and potential runtime errors
2. Security vulnerabilities
3. Performance issues
4. Code quality and maintainability
Be specific with file paths and line numbers where possible.`,
options: {
model: "opus",
allowedTools: ["Read", "Glob", "Grep", "Task"],
permissionMode: "bypassPermissions",
maxTurns: 50,
outputFormat: {
type: "json_schema",
schema: reviewSchema
},
agents: {
"security-scanner": {
description: "Deep security analysis for vulnerabilities",
prompt: `You are a security expert. Scan for:
- Injection vulnerabilities (SQL, XSS, command injection)
- Authentication and authorization flaws
- Sensitive data exposure
- Insecure dependencies`,
tools: ["Read", "Grep", "Glob"],
model: "sonnet"
} as AgentDefinition
}
}
})) {
// Progress updates
if (message.type === "assistant") {
for (const block of message.message.content) {
if ("name" in block) {
if (block.name === "Task") {
console.log(`🤖 Delegating to: ${(block.input as any).subagent_type}`);
} else {
console.log(`${block.name}: ${getToolSummary(block)}`);
}
}
}
}
// Final result
if (message.type === "result") {
if (message.subtype === "success" && message.structured_output) {
result = message.structured_output as ReviewResult;
console.log(`\n✅ Review complete! Cost: $${message.total_cost_usd.toFixed(4)}`);
} else {
console.log(`\n❌ Review failed: ${message.subtype}`);
}
}
}
return result;
}
function getToolSummary(block: any): string {
const input = block.input || {};
switch (block.name) {
case "Read": return input.file_path || "file";
case "Glob": return input.pattern || "pattern";
case "Grep": return `"${input.pattern}" in ${input.path || "."}`;
default: return "";
}
}
function printResults(result: ReviewResult) {
console.log(`\n${"=".repeat(50)}`);
console.log(`REVIEW RESULTS`);
console.log(`${"=".repeat(50)}\n`);
console.log(`Score: ${result.overallScore}/100`);
console.log(`Issues Found: ${result.issues.length}\n`);
console.log(`Summary: ${result.summary}\n`);
// Group issues by severity so the most urgent ones print first
const bySeverity = {
critical: result.issues.filter(i => i.severity === "critical"),
high: result.issues.filter(i => i.severity === "high"),
medium: result.issues.filter(i => i.severity === "medium"),
low: result.issues.filter(i => i.severity === "low")
};
for (const [severity, issues] of Object.entries(bySeverity)) {
if (issues.length === 0) continue;
const icon = severity === "critical" ? "🔴" :
severity === "high" ? "🟠" :
severity === "medium" ? "🟡" : "🟢";
console.log(`\n${icon} ${severity.toUpperCase()} (${issues.length})`);
console.log("-".repeat(30));
for (const issue of issues) {
const location = issue.line ? `${issue.file}:${issue.line}` : issue.file;
console.log(`\n[${issue.category}] ${location}`);
console.log(` ${issue.description}`);
if (issue.suggestion) {
console.log(` 💡 ${issue.suggestion}`);
}
}
}
}
// Run the review
async function main() {
const directory = process.argv[2] || ".";
const result = await runCodeReview(directory);
if (result) {
printResults(result);
}
}
main().catch(console.error);

Run it:

npx tsx review-agent.ts ./src

The code review agent covers the essentials: query(), allowedTools, structured output, subagents, and permissions. Here's where to go deeper:
- File checkpointing - Track and revert file changes
- Skills - Package reusable capabilities
- Hosting - Deploy in containers and CI/CD
- Secure deployment - Sandboxing and credential management
This guide covers V1 of the SDK. V2 is currently in development. I will update this guide with V2 once it's released and stable.
If you're interested in building verifiable agents, check out the work we're doing at EigenCloud here.