Claude Code Token Optimization: Use /usage to Cut Cost Without Losing Quality
Cut Claude Code token use with /usage, CLAUDE.md, hooks, subagents, scoped inputs, and OpenTelemetry.
Token optimization in Claude Code is not just a billing topic. When the conversation gets too heavy, responses slow down, old assumptions leak into the current task, and you pay for Claude to reread logs, diffs, and decisions that no longer matter. The real goal is not the smallest possible token count; it is the same engineering quality with less irrelevant context.
As of June 2026, the practical entry point is /usage. The official command reference describes /usage as the command for session cost, plan usage limits, and activity stats. /cost and /stats exist as aliases, but /usage is clearer for documentation and team habits. For Pro, Max, Team, and Enterprise users, the screen can also attribute usage to skills, subagents, plugins, and MCP servers.
This guide uses four layers: observe usage, reduce base context, split noisy work into isolated contexts, and monitor repeatable team workflows. A few terms first: a hook is a shell command or endpoint that runs at a fixed Claude Code lifecycle event; a subagent is a separate agent context for a focused task; a harness is the scaffolding that makes agent work repeatable and safe.
flowchart LR
A["Observe: /usage and /context"] --> B["Reduce: CLAUDE.md and scoped inputs"]
B --> C["Split: hooks / skills / subagents"]
C --> D["Measure: OpenTelemetry and team rules"]
Start with /usage
Do not run /usage after every prompt. Check it at useful boundaries: when a session starts feeling slow, after a large tool run, before a handoff, and after a workflow you expect to repeat. Pair it with /context so you can see whether the cost is coming from the current task, stale conversation history, memory files, or tool context.
The billing nuance matters. The Session block in /usage is a local estimate based on token counts. API users should confirm authoritative billing in the Claude Console. Subscription users should not treat the session dollar figure as their bill; plan usage bars and activity attribution are more relevant for day-to-day limits.
# Run these inside Claude Code
/usage
/context
# When the conversation is long but the work should continue
/compact Preserve changed files, failing tests, decisions, and unresolved questions.
# When switching to unrelated work
/clear
/compact and /clear solve different problems. Compact keeps the session moving by summarizing important state. Clear starts a new context. If you clear too early, Claude loses decisions. If you never clear, every unrelated task pays for stale context.
Keep Always-Loaded Memory Short
Tokens are not only the text you just typed. They also come from conversation history, CLAUDE.md, auto memory, logs, tool outputs, MCP servers, and previous investigation notes. Separate information into three buckets:
| Bucket | Best home | Example |
|---|---|---|
| Needed every session | Short CLAUDE.md | Build commands, test commands, hard constraints |
| Needed for this task | Conversation and /compact | Changed files, failing tests, unresolved decisions |
| Disposable after inspection | Filtered command output | Long logs, generated diffs, broad search results |
The official memory docs say CLAUDE.md files are loaded into context, and long files create recurring overhead. Imports can organize instructions, but they do not help if the imported content is still loaded at startup. Keep CLAUDE.md for the small set of rules that truly apply every day.
# CLAUDE.md
## Project commands
- Build: npm run build
- Test: npm run test
- Type check: npm run typecheck
## Fast navigation
- API code: src/api/
- UI components: src/components/
- Tests: tests/
## Avoid by default
- Do not scan node_modules/, dist/, coverage/, or generated clients.
- Do not paste full logs. Ask for the failing command and relevant lines.
## Compact instructions
When compacting, preserve changed files, failing tests, decisions, credentials policy, and next actions.
Detailed PR review checklists, translation rules, migration runbooks, or release playbooks should usually become skills or separate docs. A skill loads when you invoke it; CLAUDE.md is paid for from the start of the session.
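To make the recurring overhead visible, the sketch below estimates how many tokens each section of a CLAUDE.md file might cost. It is only a rough illustration: the four-characters-per-token ratio is an assumed heuristic, not Claude's tokenizer, and `estimateTokens`, `sectionBudget`, and the sample file are hypothetical names invented for this example.

```javascript
// ASSUMPTION: ~4 characters per token is a rough heuristic, not Claude's real tokenizer.
const CHARS_PER_TOKEN = 4;

function estimateTokens(text) {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Split a markdown file on "## " headings and estimate each section,
// heaviest first, so oversized always-loaded sections are easy to spot.
function sectionBudget(markdown) {
  const sections = [];
  let current = { title: "(preamble)", body: "" };
  for (const line of markdown.split(/\r?\n/)) {
    if (line.startsWith("## ")) {
      sections.push(current);
      current = { title: line.slice(3).trim(), body: "" };
    } else {
      current.body += line + "\n";
    }
  }
  sections.push(current);
  return sections
    .filter((s) => s.body.trim().length > 0)
    .map((s) => ({ title: s.title, tokens: estimateTokens(s.body) }))
    .sort((a, b) => b.tokens - a.tokens);
}

// Hypothetical sample: a short command list next to a long release playbook.
const sampleClaudeMd = [
  "# CLAUDE.md",
  "## Project commands",
  "- Build: npm run build",
  "- Test: npm run test",
  "## Release playbook",
  "Step-by-step release notes. ".repeat(40),
].join("\n");

for (const { title, tokens } of sectionBudget(sampleClaudeMd)) {
  console.log(`${title}: ~${tokens} tokens`);
}
```

A section that dominates this list and is only needed occasionally is a likely candidate for a skill or a separate doc.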
Filter Inputs Before Claude Reads Them
The fastest way to waste tokens is to paste an entire log or diff. Claude needs evidence, not a dump. Keep the full artifact on disk, then pass only the lines that can change the next decision.
# Give Claude the request ID and nearby errors, not the whole production log
tail -n 800 logs/app.log | grep -E -n -C 4 "request_id=abc123|ERROR|WARN"
# Inspect PR size before reading the whole diff
git diff --stat
git diff -- src/auth.ts tests/auth.test.ts
# Keep the full test output locally, but show Claude the failure neighborhood
npm test 2>&1 | tee test.log
grep -E -n -C 6 "FAIL|ERROR|Error|failed|Assertion" test.log | head -160
This pattern works because it does not hide data. It changes the first slice Claude sees. If the slice is insufficient, ask for the next file or the next section of the log.
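The same "first slice" idea can be sketched in Node.js for cases where the log is already in memory. This is a minimal illustration that mimics `grep -n -C` followed by `head`; `sliceWithContext` and the sample log are hypothetical, and the defaults simply mirror the numbers used in the shell commands above.

```javascript
// Return matching lines plus `context` lines on each side (like `grep -n -C`),
// capped at `maxLines` (like `head`) so the first slice stays small.
function sliceWithContext(text, pattern, context = 4, maxLines = 160) {
  // Drop the g/y flags so RegExp.test() does not carry state between lines.
  const re = new RegExp(pattern.source, pattern.flags.replace(/[gy]/g, ""));
  const lines = text.split(/\r?\n/);
  const keep = new Set();
  lines.forEach((line, i) => {
    if (re.test(line)) {
      const start = Math.max(0, i - context);
      const end = Math.min(lines.length - 1, i + context);
      for (let j = start; j <= end; j++) keep.add(j);
    }
  });
  return [...keep]
    .sort((a, b) => a - b)
    .slice(0, maxLines)
    .map((i) => `${i + 1}: ${lines[i]}`);
}

// Hypothetical log used only for illustration.
const sampleLog = [
  "boot ok",
  "request_id=abc123 start",
  "cache warm",
  "ERROR db timeout",
  "retrying",
  "done",
  "idle",
  "idle",
].join("\n");

console.log(sliceWithContext(sampleLog, /request_id=abc123|ERROR/, 1).join("\n"));
```

If the slice turns out to be too narrow, widening `context` or `maxLines` is cheaper than pasting the whole file.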
Use Hooks Carefully
A hook can automate that filtering. It should not hide failures or silently approve risky commands. Start with permissionDecision set to "ask" so that every rewritten command is shown to you before it runs, inspect it, and only tighten the workflow after you have tested it locally.
{
"hooks": {
"PreToolUse": [
{
"matcher": "Bash",
"hooks": [
{
"type": "command",
"command": "~/.claude/hooks/filter-test-output.sh"
}
]
}
]
}
}
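#!/usr/bin/env bash
# PreToolUse hook: rewrite test commands so Claude sees only the failure neighborhood.
set -euo pipefail
# The hook payload arrives as JSON on stdin.
input=$(cat)
cmd=$(printf '%s' "$input" | jq -r '.tool_input.command // ""')
case "$cmd" in
npm\ test*|pnpm\ test*|pytest*|go\ test*)
# Caveat: the rewritten pipeline reports the exit status of head/grep, not of the test command.
filtered="$cmd 2>&1 | grep -E -n -C 6 '(FAIL|ERROR|Error|failed|Assertion)' | head -160"
jq -n --arg command "$filtered" '{
hookSpecificOutput: {
hookEventName: "PreToolUse",
permissionDecision: "ask",
permissionDecisionReason: "Run test command with filtered output",
updatedInput: { command: $command }
}
}'
;;
*)
# Any other command passes through unchanged.
echo '{}'
;;
esac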
This example requires jq. For a production team hook, store the complete test log in a file and preserve the original exit code. The hook earns its keep only when it reduces noise without reducing evidence.
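One way to keep the complete log and the original exit code is sketched below in Node.js. It is an illustration under stated assumptions rather than a reference implementation: `buildFilteredCommand`, `buildHookResponse`, and the `test.log` path are hypothetical, and the response shape simply copies the fields used in the hook example above.

```javascript
// Same failure pattern as the grep command in the shell hook above.
const FAILURE_PATTERN = "(FAIL|ERROR|Error|failed|Assertion)";
const TEST_COMMAND = /^(npm test|pnpm test|pytest|go test)/;

// Hypothetical helper: run the original command in a subshell, write the full
// output to a log file, print only the failure neighborhood, and then exit
// with the test command's own status.
function buildFilteredCommand(cmd, logPath = "test.log") {
  return [
    `( { ${cmd}; } > ${logPath} 2>&1; status=$?;`,
    `grep -E -n -C 6 '${FAILURE_PATTERN}' ${logPath} | head -160;`,
    "exit $status )",
  ].join(" ");
}

// Build the JSON a PreToolUse hook would print, reusing the field names from
// the hook example above. Non-test commands pass through with an empty object.
function buildHookResponse(input) {
  const cmd = input?.tool_input?.command ?? "";
  if (!TEST_COMMAND.test(cmd)) return {};
  return {
    hookSpecificOutput: {
      hookEventName: "PreToolUse",
      permissionDecision: "ask",
      permissionDecisionReason: "Run test command with filtered output; full log kept on disk",
      updatedInput: { command: buildFilteredCommand(cmd) },
    },
  };
}

console.log(JSON.stringify(buildHookResponse({ tool_input: { command: "npm test" } }), null, 2));
```

Redirecting to a file first and filtering afterwards avoids relying on pipeline exit-status rules, which differ between shells.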
Split Verbose Work
Subagents help when the work is noisy but the main conversation only needs the conclusion. Good tasks are narrow: “read these official docs and return only changed facts,” “check these 10 localized files and return only blockers,” or “run the failing test and summarize the first actionable stack trace.”
Do not spawn subagents just because you can. Each one has its own context, memory, tools, and cost. Use them to protect the main decision context, not to parallelize vague work. The same rule applies to skills: move detailed, occasional playbooks out of CLAUDE.md and load them only when the workflow needs them.
Monitor Team Usage
For a solo developer, /usage and /context are often enough. For repeatable team workflows, OpenTelemetry turns cost, token counts, model choice, duration, and tool activity into shared data. Start with the console exporter before wiring a collector or dashboard.
export CLAUDE_CODE_ENABLE_TELEMETRY=1
export OTEL_METRICS_EXPORTER=console
export OTEL_LOGS_EXPORTER=console
export OTEL_METRIC_EXPORT_INTERVAL=10000
export OTEL_LOGS_EXPORT_INTERVAL=5000
claude
Track quality next to token usage. If tokens drop by 30% but review defects and regenerations go up, the process got worse. I usually compare usage, number of correction rounds, verification status, and post-merge fixes.
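The comparison can be written down as a small check so that a token saving is never reported without its quality context. This is a sketch only: the field names (`tokens`, `correctionRounds`, `postMergeFixes`, `verified`) and the sample numbers are hypothetical, not values exported by Claude Code or OpenTelemetry.

```javascript
// Compare two runs of the same workflow. All field names are hypothetical.
function compareRuns(before, after) {
  const tokenChangePct = ((after.tokens - before.tokens) / before.tokens) * 100;
  // Any quality signal moving the wrong way counts as a regression.
  const qualityRegressed =
    after.correctionRounds > before.correctionRounds ||
    after.postMergeFixes > before.postMergeFixes ||
    (before.verified && !after.verified);

  let verdict;
  if (qualityRegressed) verdict = "worse: quality regressed";
  else if (tokenChangePct < 0) verdict = "better: fewer tokens, quality held";
  else verdict = "no saving";

  return { tokenChangePct: Math.round(tokenChangePct), qualityRegressed, verdict };
}

// Illustrative numbers only: tokens fall 30% while corrections and fixes rise.
const baselineRun = { tokens: 100000, correctionRounds: 2, postMergeFixes: 0, verified: true };
const trimmedRun = { tokens: 70000, correctionRounds: 4, postMergeFixes: 1, verified: true };

console.log(compareRuns(baselineRun, trimmedRun));
```

Checking quality first means a cheaper run is only labeled "better" when nothing else got worse.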
Practical Use Cases
Log Investigation
Pass the request ID, the latest errors, timestamps, expected behavior, and nearby lines. Start with a small slice, let Claude form a hypothesis, then widen only if the evidence is missing.
Code Review
Start with git diff --stat, changed files, test output, and review concerns. Split a large PR into security, performance, and compatibility passes so Claude does not repeatedly reread the whole patch.
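To decide which files to show first, the diff size can be ranked programmatically. The sketch below parses the output of `git diff --numstat`, which is used here instead of `--stat` because its tab-separated format is easier to parse; `parseNumstat` and the sample input are hypothetical.

```javascript
// Parse `git diff --numstat` output: "<added>\t<deleted>\t<path>" per line.
// Binary files report "-" for both counts and are kept with a churn of 0.
function parseNumstat(text) {
  const toCount = (value) => (value === "-" ? 0 : Number(value));
  return text
    .split(/\r?\n/)
    .filter(Boolean)
    .map((line) => {
      const [added, deleted, ...pathParts] = line.split("\t");
      return {
        path: pathParts.join("\t"),
        added: toCount(added),
        deleted: toCount(deleted),
        churn: toCount(added) + toCount(deleted),
      };
    })
    .sort((a, b) => b.churn - a.churn); // largest change first
}

// Hypothetical numstat output used only for illustration.
const sampleNumstat = [
  "120\t30\tsrc/auth.ts",
  "10\t2\ttests/auth.test.ts",
  "-\t-\tassets/logo.png",
  "300\t50\tsrc/api/client.ts",
].join("\n");

for (const file of parseNumstat(sampleNumstat)) {
  console.log(`${file.churn}\t${file.path}`);
}
```

The ranking only suggests a reading order; which files belong in the security, performance, or compatibility pass remains a judgment call.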
Multilingual Publishing
Keep the canonical article decisions in the main session. Send translation, link checks, description length, and CTA checks into separate contexts. Return changed files and verification evidence, not every intermediate note.
Training Days
During a workshop, concurrent Claude Code usage is higher than normal. Give participants rules such as “read five files first,” “paste the last 160 failure lines,” and “explain why scope needs to widen.” That keeps both cost and teaching flow predictable.
Pitfalls to Avoid
- Writing a /cost-only guide. Use /usage as the main command and mention /cost and /stats as aliases.
- Cutting context that is actually evidence. Reproduction steps, expected output, failing commands, and decisions must survive.
- Turning CLAUDE.md into an operations notebook. Rare workflows belong in skills or separate docs.
- Enabling every MCP server. Use /mcp and prefer CLI tools when they give the same answer with less context.
- Hiding failures with hooks. Keep the full log somewhere and show Claude the filtered slice.
- Assuming subagents are automatically cheaper. They isolate noise, but they still consume their own context.
A Small Handoff Script
This dependency-free Node.js script creates a compact brief from changed files, diff size, and a test log.
#!/usr/bin/env node
// Prints a short markdown brief: changed files, diff stat, and failing test lines.
import { execFileSync } from "node:child_process";
import { existsSync, readFileSync } from "node:fs";
// Run a git subcommand and return its trimmed stdout.
function git(args) {
return execFileSync("git", args, { encoding: "utf8" }).trim();
}
// Optional first argument: path to a saved test log.
const testLogPath = process.argv[2];
// Note: plain `git diff` covers unstaged changes in the working tree only.
const changedFiles = git(["diff", "--name-only"])
.split(/\r?\n/)
.filter(Boolean);
const diffStat = git(["diff", "--stat"]);
const testLog = testLogPath && existsSync(testLogPath)
? readFileSync(testLogPath, "utf8")
: "";
// Keep at most 80 lines that look like failures.
const failures = testLog
.split(/\r?\n/)
.filter((line) => /(FAIL|ERROR|Error|failed|Assertion)/.test(line))
.slice(0, 80);
console.log("# Claude handoff brief\n");
console.log("## Changed files");
console.log(changedFiles.length ? changedFiles.map((file) => `- ${file}`).join("\n") : "- None");
console.log("\n## Diff stat");
console.log(diffStat || "No diff");
console.log("\n## Test failures");
console.log(failures.length ? failures.map((line) => `- ${line}`).join("\n") : "- No matching failure lines");
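Save it as scripts/claude-brief.mjs, then run it from the repository root and redirect the output to a file:
node scripts/claude-brief.mjs test.log > claude-brief.md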
Official Docs Checked
- Commands for /usage, /context, /compact, and /clear.
- Manage costs effectively for cost tracking, MCP overhead, hooks, skills, subagents, and model effort.
- How Claude remembers your project for CLAUDE.md and auto memory behavior.
- Hooks reference and Monitoring for PreToolUse hooks and OpenTelemetry.
For adjacent workflow improvements, read Claude Code speed optimization, Claude Code permissions guide, and harness engineering guide.
What I Actually Tested
For this rewrite, I treated the Japanese article as the canonical source and checked the official docs before updating the localized versions. The biggest practical win was separating permanent memory, task state, and disposable logs. A compact brief with changed files, failing lines, and verification status produced better answers than pasting a full test run.
If you want reusable prompts and setup material, start with ClaudeCodeLab products. For team rollout, permissions, review policy, telemetry, and training, use Claude Code training and consultation.
About the Author
Masa
Engineer focused on practical Claude Code workflows. Runs claudecode-lab.com, a 10-language technical media site.