Claude Agent SDK Guide: Embed Claude Code in Your Apps Safely
Current Claude Agent SDK setup, permissions, MCP tools, runnable examples, and production pitfalls.
If you want to move Claude Code from an interactive terminal assistant into your own internal tools, CI jobs, support consoles, or review bots, the current target is the Claude Agent SDK.
Older examples often mention @anthropic-ai/claude-code, claude-code-sdk, or the phrase “Claude Code SDK.” As of June 2026, the official documentation points developers to the TypeScript package @anthropic-ai/claude-agent-sdk and the Python package claude-agent-sdk. Use the official Agent SDK overview and Migration Guide as the source of truth before copying code from search results.
This article replaces the earlier version from the ground up. The goal is not to show another toy chatbot. The goal is to build agents that can inspect files, search a codebase, call safe tools, optionally make a small edit, run a constrained test command, and report what happened in a way a human can review.
What Changed
Claude Agent SDK lets you use the agent loop that powers Claude Code from your own Python or TypeScript program. An agent loop means Claude can decide the next useful action, call a tool, read the result, and continue for several turns. That is very different from a single request to a chat completion API.
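To make the loop idea concrete, here is a toy sketch of the control flow. It is not the SDK's implementation: `Step`, `runLoop`, `stubDecide`, and `stubTools` are hypothetical names invented for illustration, and the "model" is a stub function.

```typescript
// Toy agent loop: NOT the SDK's implementation, only the control flow described above.
// All names here (Step, runLoop, stubDecide, stubTools) are hypothetical.
type Step =
  | { kind: "tool"; tool: string; input: string }
  | { kind: "final"; answer: string };

type Decide = (history: string[]) => Step;
type Tools = Record<string, (input: string) => string>;

function runLoop(decide: Decide, tools: Tools, maxTurns: number): string {
  const history: string[] = [];
  for (let turn = 0; turn < maxTurns; turn++) {
    const step = decide(history); // the model picks the next action
    if (step.kind === "final") return step.answer;
    const run = tools[step.tool];
    // Unknown tools are reported back to the model instead of crashing the loop.
    const observation = run ? run(step.input) : `unknown tool: ${step.tool}`;
    history.push(`${step.tool}(${step.input}) -> ${observation}`);
  }
  return "stopped: maxTurns reached";
}

// Stub "model": search once, then answer using what it observed.
const stubDecide: Decide = (history) =>
  history.length === 0
    ? { kind: "tool", tool: "grep", input: "TODO" }
    : { kind: "final", answer: `done after ${history.length} tool call(s)` };

const stubTools: Tools = { grep: (pattern) => `2 matches for ${pattern}` };

console.log(runLoop(stubDecide, stubTools, 4));
```

The turn cap matters: with a budget of one turn, the same stub never reaches its final answer, which is roughly what a too-small maxTurns looks like in a real run.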
Here is the practical map I use today.
| Topic | Current recommendation |
|---|---|
| TypeScript package | Use @anthropic-ai/claude-agent-sdk |
| Python package | Use claude-agent-sdk |
| CLI vs SDK | CLI is for humans; SDK is for apps, CI, and automation |
| Client SDK vs Agent SDK | Client SDK makes you implement tool loops; Agent SDK uses Claude Code’s built-in loop |
| Main risk | Wider permissions create more useful agents and more dangerous failures |
The TypeScript SDK normally bundles a native Claude Code binary as an optional dependency, so you do not need to install the Claude Code CLI separately. One real failure mode is a package manager configured to skip optional dependencies (for example, npm run with --omit=optional). In that case the SDK can fail with a missing native binary message. Either allow optional dependencies for this package, or point the pathToClaudeCodeExecutable option at a separately installed claude binary.
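One way to keep that fallback out of application code is a small helper that only sets pathToClaudeCodeExecutable when an override is configured. This is a minimal sketch: `CLAUDE_BIN` is a hypothetical environment variable name chosen for this example, not an SDK convention.

```typescript
// Sketch: build the executable-related part of the query() options.
// CLAUDE_BIN is a hypothetical environment variable name, not an SDK convention.
type Env = Record<string, string | undefined>;

function executableOptions(env: Env): { pathToClaudeCodeExecutable?: string } {
  const override = env.CLAUDE_BIN?.trim();
  // No override: return nothing and let the SDK use its bundled binary.
  return override ? { pathToClaudeCodeExecutable: override } : {};
}

// Usage idea: query({ prompt, options: { ...executableOptions(process.env), cwd } })
console.log(executableOptions({ CLAUDE_BIN: "/usr/local/bin/claude" }));
console.log(executableOptions({}));
```

Spreading the result into the options object means the default install path stays untouched unless someone deliberately sets the override.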
Copy-Paste Setup
The following setup uses TypeScript with tsx, so you can run the examples without creating a build pipeline first.
mkdir claude-agent-sdk-demo
cd claude-agent-sdk-demo
npm init -y
npm install @anthropic-ai/claude-agent-sdk zod
npm install -D typescript tsx @types/node
Replace the generated package.json with this minimal shape.
{
  "type": "module",
  "scripts": {
    "audit": "tsx src/read-only-audit.ts",
    "runbook": "tsx src/runbook-agent.ts",
    "fix": "tsx src/safe-fix.ts"
  },
  "dependencies": {
    "@anthropic-ai/claude-agent-sdk": "latest",
    "zod": "latest"
  },
  "devDependencies": {
    "@types/node": "latest",
    "tsx": "latest",
    "typescript": "latest"
  }
}
Set the API key through your environment. Do not hard-code it in the article, repository, CI logs, or sample output.
export ANTHROPIC_API_KEY="sk-ant-..."
On Windows PowerShell:
$env:ANTHROPIC_API_KEY = "sk-ant-..."
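A script can also fail fast when the key is missing and mask it if it ever needs to appear in diagnostics. The sketch below is illustrative only: `requireApiKey` and `maskSecret` are hypothetical helpers, and the sample key is a placeholder, not a real credential.

```typescript
// Sketch: fail fast when the key is missing, and never log the full value.
// requireApiKey and maskSecret are hypothetical helper names.
type Env = Record<string, string | undefined>;

function requireApiKey(env: Env): string {
  const key = env.ANTHROPIC_API_KEY?.trim();
  if (!key) {
    throw new Error("ANTHROPIC_API_KEY is not set; export it before running the agent.");
  }
  return key;
}

// Keep only a short prefix so logs can confirm a key was present without leaking it.
function maskSecret(secret: string): string {
  return secret.length <= 8 ? "***" : `${secret.slice(0, 6)}...(${secret.length} chars)`;
}

// In a real script you would pass process.env; this placeholder is not a real key.
const demoKey = requireApiKey({ ANTHROPIC_API_KEY: "sk-ant-example-not-a-real-key" });
console.log(maskSecret(demoKey));
```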
Create src/read-only-audit.ts. This first agent is deliberately read-only: it can inspect files and search text, but it cannot edit files or run shell commands.
import { query } from "@anthropic-ai/claude-agent-sdk";

const prompt = [
  "Inspect this repository in read-only mode.",
  "Find TODO comments, stale dependencies, and areas with weak tests.",
  "Return a prioritized list with concrete file references.",
].join("\n");

for await (const message of query({
  prompt,
  options: {
    cwd: process.cwd(),
    // Read-only tools only: no Edit, Write, or Bash.
    allowedTools: ["Read", "Glob", "Grep"],
    maxTurns: 4,
  },
})) {
  if (message.type !== "result") continue;
  if (message.subtype === "success") {
    console.log(message.result);
  } else {
    // For example, the run hit maxTurns before producing a final answer.
    console.error(`Run ended without a result: ${message.subtype}`);
  }
}
Run it:
npm run audit
Starting with read-only tools is not a formality. It gives you a safe baseline before you allow Edit, Write, or Bash.
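If several scripts share this baseline, a small guard can stop a write-capable tool from slipping into a run that is supposed to be read-only. This is a sketch under one assumption: treating exactly Read, Glob, and Grep as the read-only set is a choice made here, not a classification the SDK provides, and `assertReadOnly` is a hypothetical helper.

```typescript
// Sketch: refuse to start a "read-only" run if a write-capable tool slipped in.
// The tool names come from this article; treating exactly these as read-only
// is an assumption of this sketch, not an SDK-provided classification.
const READ_ONLY_TOOLS = new Set(["Read", "Glob", "Grep"]);

function assertReadOnly(allowedTools: string[]): string[] {
  const offenders = allowedTools.filter((tool) => !READ_ONLY_TOOLS.has(tool));
  if (offenders.length > 0) {
    throw new Error(`Not read-only: ${offenders.join(", ")}`);
  }
  return allowedTools;
}

// Usage idea: allowedTools: assertReadOnly(["Read", "Glob", "Grep"])
console.log(assertReadOnly(["Read", "Glob", "Grep"]));
```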
Use Cases That Are Actually Worth Automating
Claude Agent SDK is strongest when a task requires observation, tool use, and a final decision. It is weaker when you only need one short text response. In Masa’s content and engineering workflows, these four cases are the most realistic:
| Use case | Useful tools | Output | Guardrail |
|---|---|---|---|
| Pre-review pull request audit | Read, Glob, Grep | Risk list, missing tests, review focus | Let a human post the final review |
| Small safe code fix | Read, Edit, Bash(npm test) | Minimal diff and test output | Block push, deploy, and destructive commands |
| Incident triage assistant | MCP runbook and log search | Hypotheses and next checks | Keep production tools read-only |
| Multilingual article QA | Read, Grep | Mojibake, missing CTA, stale links | Humans still review natural language |
For adjacent ClaudeCodeLab material, read the permissions guide, the MCP server guide, and Claude Code productivity tips. Those articles cover the operating habits that keep SDK agents from becoming invisible automation.
Add a Runbook Tool with MCP
MCP means Model Context Protocol. In plain language, it is a standard way to expose tools and data sources to an agent. The official MCP guide shows how Agent SDK applications can connect MCP servers. You can also define a small in-process MCP server directly in your app.
This example gives Claude a read-only runbook lookup tool. A production version might query an internal wiki or incident database, but the local object keeps the sample copy-pasteable.
import {
  createSdkMcpServer,
  query,
  tool,
} from "@anthropic-ai/claude-agent-sdk";
import { z } from "zod";

// Local stand-in for an internal wiki or incident database.
const runbooks: Record<string, string> = {
  billing: "Check failed payments, Stripe webhooks, and latest deploy.",
  search: "Check index time, Algolia task status, and API limits.",
  content: "Check CMS sync, locale slugs, and hero image presence.",
};

const lookupRunbook = tool(
  "lookup_runbook",
  "Return a read-only operations runbook for a service name",
  { service: z.string().min(1) },
  async ({ service }) => {
    const text = runbooks[service] ?? "No runbook found.";
    return { content: [{ type: "text", text }] };
  },
  { annotations: { readOnlyHint: true, openWorldHint: false } },
);

// In-process MCP server that exposes the single lookup tool.
const runbookServer = createSdkMcpServer({
  name: "runbook",
  version: "1.0.0",
  tools: [lookupRunbook],
});

for await (const message of query({
  prompt: "Suggest the first checks for a content publishing incident.",
  options: {
    mcpServers: { runbook: runbookServer },
    // MCP tools are approved by their fully qualified name.
    allowedTools: ["mcp__runbook__lookup_runbook"],
    maxTurns: 3,
  },
})) {
  if (message.type === "result" && message.subtype === "success") {
    console.log(message.result);
  }
}
The common mistake is allowing the wrong tool name. MCP tools are approved as mcp__serverName__toolName. Allowing only lookup_runbook will not approve the SDK tool call.
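A tiny helper removes the chance of a typo in that name. This is a minimal sketch: `mcpToolName` is a hypothetical helper, and it simply applies the mcp__serverName__toolName pattern described above, where the server name is the key used in mcpServers.

```typescript
// Sketch: derive the fully qualified name that allowedTools expects for an MCP tool.
// mcpToolName is a hypothetical helper; the pattern itself is the one described above.
function mcpToolName(serverName: string, toolName: string): string {
  return `mcp__${serverName}__${toolName}`;
}

const allowedMcpTools = [mcpToolName("runbook", "lookup_runbook")];
console.log(allowedMcpTools);
```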
Allow Edits Without Giving Away the Repo
Once the read-only audit is useful, you can allow small edits. Do it narrowly. The official permissions guide is worth reading before you ship any agent that can write files or run commands.
Create src/safe-fix.ts.
import { query } from "@anthropic-ai/claude-agent-sdk";

const prompt = [
  "Fix one small TypeScript error under src.",
  "After the change, run npm test and report the diff summary.",
  "Do not refactor broadly and do not add dependencies.",
].join("\n");

for await (const message of query({
  prompt,
  options: {
    cwd: process.cwd(),
    allowedTools: [
      "Read",
      "Glob",
      "Grep",
      "Edit",
      "Bash(npm test)",
    ],
    // Prefix rules, so "git push origin main" is blocked as well as a bare "git push".
    // Confirm the rule syntax against the permissions guide for your SDK version.
    disallowedTools: [
      "Bash(git push:*)",
      "Bash(git commit:*)",
      "Bash(rm:*)",
    ],
    permissionMode: "default",
    maxTurns: 6,
  },
})) {
  if (message.type === "result") {
    // Only successful results carry a final text; error subtypes do not.
    console.log(message.subtype, message.subtype === "success" ? message.result : "");
  }
}
This is not a “let the agent do anything” pattern. It is a pattern for producing a reviewable diff with test evidence. For real teams, I would keep deploy, push, release, billing, and customer-data tools outside this agent until the audit trail is mature.
Pitfalls and Failure Modes
The first pitfall is stale naming. If a guide imports from @anthropic-ai/claude-code or uses ClaudeCodeOptions, it may be describing the older SDK surface. For a refreshed article or production template, check the official migration guide and use the Agent SDK names.
The second pitfall is unrestricted Bash. If you allow Bash broadly, the agent may consider commands you never intended: cleanup, deploy, git operations, package installs, or local scripts with side effects. Start with command-specific approval such as Bash(npm test).
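If you add your own gate in front of shell execution, an exact allow list is easy to reason about. The sketch below is an illustration, not the SDK's permission mechanism: `Decision` and `checkBashCommand` are hypothetical, and `npm run lint` is an example entry. Whether and how you wire a function like this into a permission callback depends on your SDK version, so check the permissions guide first.

```typescript
// Sketch: decide whether a proposed Bash command is on an exact allow list.
// Decision is a hypothetical local type, not the SDK's permission result type.
type Decision = { allow: true } | { allow: false; reason: string };

// "npm run lint" is an example entry; list only commands you have reviewed.
const ALLOWED_COMMANDS = new Set(["npm test", "npm run lint"]);

function checkBashCommand(command: string): Decision {
  // Reject chaining, pipes, redirects, and substitution before comparing.
  if (/[;&|<>`$\n]/.test(command)) {
    return { allow: false, reason: "shell metacharacters are not allowed" };
  }
  const normalized = command.trim().replace(/\s+/g, " ");
  if (!ALLOWED_COMMANDS.has(normalized)) {
    return { allow: false, reason: `not on the allow list: ${normalized}` };
  }
  return { allow: true };
}

console.log(checkBashCommand("npm test"));
console.log(checkBashCommand("npm test && git push"));
```

Checking for metacharacters first is deliberate: a command that starts with an approved prefix but chains a second command should never reach the allow-list comparison.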
The third pitfall is an implicit working directory. If cwd depends on the parent process, your local script, CI job, and background worker may inspect different folders. Always set the project directory explicitly in production code.
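A minimal sketch of that rule: resolve the directory from explicit configuration and refuse to fall back silently. `AGENT_PROJECT_DIR` and `resolveProjectDir` are hypothetical names chosen for this example.

```typescript
import { isAbsolute, resolve } from "node:path";

// Sketch: resolve the agent's working directory from explicit configuration.
// AGENT_PROJECT_DIR is a hypothetical variable name chosen for this example.
function resolveProjectDir(env: Record<string, string | undefined>): string {
  const configured = env.AGENT_PROJECT_DIR?.trim();
  if (!configured) {
    throw new Error("AGENT_PROJECT_DIR is required; refusing to fall back to the parent cwd.");
  }
  if (!isAbsolute(configured)) {
    throw new Error(`AGENT_PROJECT_DIR must be an absolute path, got: ${configured}`);
  }
  return resolve(configured); // normalizes segments such as ".."
}

// Usage idea: options: { cwd: resolveProjectDir(process.env), ... }
console.log(resolveProjectDir({ AGENT_PROJECT_DIR: "/srv/app/../repo" }));
```

Requiring an absolute path means the local script, the CI job, and the background worker all fail loudly on a misconfiguration instead of quietly inspecting the wrong folder.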
The fourth pitfall is treating Agent SDK like a normal chat SDK. Agent runs can use multiple tool turns, read many files, and consume more time and tokens. Read the official cost tracking guide, log usage, and set operational limits.
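Operational limits can start as a simple budget check over finished runs. In this sketch, `RunResult`, `Budget`, `overBudget`, and `totalCost` are hypothetical, the field names are assumed to mirror the SDK's result message and should be confirmed against the TypeScript reference, and the sample numbers are made up.

```typescript
// Sketch: summarize finished runs and flag the ones that exceed a budget.
// The field names are assumed to mirror the SDK's result message; confirm them
// against the TypeScript reference before relying on this shape.
type RunResult = { num_turns: number; total_cost_usd: number; duration_ms: number };
type Budget = { maxCostUsd: number; maxDurationMs: number };

function overBudget(run: RunResult, budget: Budget): string[] {
  const problems: string[] = [];
  if (run.total_cost_usd > budget.maxCostUsd) {
    problems.push(`cost ${run.total_cost_usd.toFixed(4)} USD > ${budget.maxCostUsd}`);
  }
  if (run.duration_ms > budget.maxDurationMs) {
    problems.push(`duration ${run.duration_ms} ms > ${budget.maxDurationMs}`);
  }
  return problems;
}

function totalCost(runs: RunResult[]): number {
  return runs.reduce((sum, run) => sum + run.total_cost_usd, 0);
}

// Made-up sample data, for illustration only.
const sampleRuns: RunResult[] = [
  { num_turns: 3, total_cost_usd: 0.25, duration_ms: 12_000 },
  { num_turns: 6, total_cost_usd: 0.5, duration_ms: 95_000 },
];
const sampleBudget: Budget = { maxCostUsd: 0.4, maxDurationMs: 60_000 };

for (const run of sampleRuns) console.log(overBudget(run, sampleBudget));
console.log(totalCost(sampleRuns));
```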
The fifth pitfall is vague custom tools. If the tool name, description, schema, and annotations do not explain when the tool should be used, Claude has less signal. Mark read-only tools as read-only, keep destructive tools separate, and avoid overloading one tool with many unrelated actions.
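Some of that advice can be checked mechanically before a tool is registered. The lint below is a rough sketch: `ToolSpec`, `lintTool`, the naming rule, and the thresholds are all hypothetical and should be tuned to your own conventions.

```typescript
// Sketch: a small lint for custom tool definitions before registering them.
// ToolSpec, lintTool, and the thresholds are hypothetical; tune them for your team.
type ToolSpec = { name: string; description: string; readOnly: boolean; actions: string[] };

function lintTool(spec: ToolSpec): string[] {
  const warnings: string[] = [];
  if (!/^[a-z][a-z0-9_]*$/.test(spec.name)) {
    warnings.push("name should be snake_case");
  }
  if (spec.description.trim().length < 20) {
    warnings.push("description is too short to guide tool choice");
  }
  if (spec.actions.length > 1) {
    warnings.push("one tool bundles several actions; consider splitting");
  }
  if (spec.readOnly && spec.actions.some((action) => /delete|write|update/i.test(action))) {
    warnings.push("marked read-only but lists a mutating action");
  }
  return warnings;
}

console.log(
  lintTool({
    name: "lookup_runbook",
    description: "Return a read-only operations runbook for a service name",
    readOnly: true,
    actions: ["lookup"],
  }),
);
```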
Production Checklist
- API keys and third-party tokens are never printed in logs.
- The first run is read-only with Read, Glob, and Grep.
- Write-enabled agents have a clear cwd, limited tools, and maxTurns.
- MCP tools separate read-only lookups from destructive operations.
- Test commands are specific, not broad shell access.
- Humans review diffs and test output before merge.
- Official docs and changelogs are checked before publishing templates.
Claude Agent SDK is powerful, but the useful question is not “how much can I automate?” It is “which boundary makes the automation reviewable?” That matters for engineering and for monetized content operations. Article updates, CTAs, product links, analytics events, and consulting forms can all change in one automation run if you are careless.
For personal workflows, start with the free Claude Code cheatsheet and keep the read-only audit pattern nearby. For reusable prompts, setup material, and review templates, use the products page. For team rollout, permissions, MCP design, CI review gates, and incident-safe automation, use Claude Code training and consultation.
Hands-On Verification Note
For this refresh, I checked the official Agent SDK overview, TypeScript reference, MCP guide, permissions guide, migration guide, and cost tracking docs, then replaced the old Claude Code SDK style examples with current @anthropic-ai/claude-agent-sdk code. The most useful practical change was making the first runnable example read-only. It gives readers a low-risk npm run audit starting point before they move to MCP tools, edits, and test execution. That same staged approach is what I would use before adding any Agent SDK workflow to a real repository.
About the Author
Masa
Engineer focused on practical Claude Code workflows. Runs claudecode-lab.com, a 10-language technical media site.