Claude Code Token Optimization: Use /usage to Cut Cost Without Losing Quality

Token optimization in Claude Code is not just a billing topic. When the conversation gets too heavy, responses slow down, old assumptions leak into the current task, and you pay for Claude to reread logs, diffs, and decisions that no longer matter. The real goal is not the smallest possible token count; it is the same engineering quality with less irrelevant context.

As of June 2026, the practical entry point is /usage. The official command reference describes /usage as the command for session cost, plan usage limits, and activity stats. /cost and /stats exist as aliases, but /usage is clearer for documentation and team habits. For Pro, Max, Team, and Enterprise users, the screen can also attribute usage to skills, subagents, plugins, and MCP servers.

This guide uses four layers: observe usage, reduce base context, split noisy work into isolated contexts, and monitor repeatable team workflows. A few terms first: a hook is a shell command or endpoint that runs at a fixed Claude Code lifecycle event; a subagent is a separate agent context for a focused task; a harness is the scaffolding that makes agent work repeatable and safe.

flowchart LR
  A["Observe: /usage and /context"] --> B["Reduce: CLAUDE.md and scoped inputs"]
  B --> C["Split: hooks / skills / subagents"]
  C --> D["Measure: OpenTelemetry and team rules"]

Start with /usage

Do not run /usage after every prompt. Check it at useful boundaries: when a session starts feeling slow, after a large tool run, before a handoff, and after a workflow you expect to repeat. Pair it with /context so you can see whether the cost is coming from the current task, stale conversation history, memory files, or tool context.

The billing nuance matters. The Session block in /usage is a local estimate based on token counts. API users should confirm authoritative billing in the Claude Console. Subscription users should not treat the session dollar figure as their bill; plan usage bars and activity attribution are more relevant for day-to-day limits.

# Run these inside Claude Code
/usage
/context

# When the conversation is long but the work should continue
/compact Preserve changed files, failing tests, decisions, and unresolved questions.

# When switching to unrelated work
/clear

/compact and /clear solve different problems. Compact keeps the session moving by summarizing important state. Clear starts a new context. If you clear too early, Claude loses decisions. If you never clear, every unrelated task pays for stale context.

Keep Always-Loaded Memory Short

Tokens are not only the text you just typed. They also come from conversation history, CLAUDE.md, auto memory, logs, tool outputs, MCP servers, and previous investigation notes. Separate information into three buckets:

Bucket	Best home	Example
Needed every session	Short CLAUDE.md	Build commands, test commands, hard constraints
Needed for this task	Conversation and `/compact`	Changed files, failing tests, unresolved decisions
Disposable after inspection	Filtered command output	Long logs, generated diffs, broad search results

The official memory docs say CLAUDE.md files are loaded into context, and long files create recurring overhead. Imports can organize instructions, but they do not help if the imported content is still loaded at startup. Keep CLAUDE.md for the small set of rules that truly apply every day.

# CLAUDE.md

## Project commands
- Build: npm run build
- Test: npm run test
- Type check: npm run typecheck

## Fast navigation
- API code: src/api/
- UI components: src/components/
- Tests: tests/

## Avoid by default
- Do not scan node_modules/, dist/, coverage/, or generated clients.
- Do not paste full logs. Ask for the failing command and relevant lines.

## Compact instructions
When compacting, preserve changed files, failing tests, decisions, credentials policy, and next actions.

Detailed PR review checklists, translation rules, migration runbooks, or release playbooks should usually become skills or separate docs. A skill loads when you invoke it; CLAUDE.md is paid for from the start of the session.

Filter Inputs Before Claude Reads Them

The fastest way to waste tokens is to paste an entire log or diff. Claude needs evidence, not a dump. Keep the full artifact on disk, then pass only the lines that can change the next decision.

# Give Claude the request ID and nearby errors, not the whole production log
tail -n 800 logs/app.log | grep -E -n -C 4 "request_id=abc123|ERROR|WARN"

# Inspect PR size before reading the whole diff
git diff --stat
git diff -- src/auth.ts tests/auth.test.ts

# Keep the full test output locally, but show Claude the failure neighborhood
npm test 2>&1 | tee test.log
grep -E -n -C 6 "FAIL|ERROR|Error|failed|Assertion" test.log | head -160

This pattern works because it does not hide data. It changes the first slice Claude sees. If the slice is insufficient, ask for the next file or the next section of the log.

Use Hooks Carefully

A hook can automate that filtering. It should not hide failures or silently approve risky commands. Start with ask, inspect the rewritten command, and only tighten the workflow after you have tested it locally.

{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          {
            "type": "command",
            "command": "~/.claude/hooks/filter-test-output.sh"
          }
        ]
      }
    ]
  }
}

#!/usr/bin/env bash
set -euo pipefail

input=$(cat)
cmd=$(echo "$input" | jq -r '.tool_input.command // ""')

case "$cmd" in
  npm\ test*|pnpm\ test*|pytest*|go\ test*)
    filtered="$cmd 2>&1 | grep -E -n -C 6 '(FAIL|ERROR|Error|failed|Assertion)' | head -160"
    jq -n --arg command "$filtered" '{
      hookSpecificOutput: {
        hookEventName: "PreToolUse",
        permissionDecision: "ask",
        permissionDecisionReason: "Run test command with filtered output",
        updatedInput: { command: $command }
      }
    }'
    ;;
  *)
    echo '{}'
    ;;
esac

This example requires jq. For a production team hook, store the complete test log in a file and preserve the original exit code. The hook earns its keep only when it reduces noise without reducing evidence.

Split Verbose Work

Subagents help when the work is noisy but the main conversation only needs the conclusion. Good tasks are narrow: “read these official docs and return only changed facts,” “check these 10 localized files and return only blockers,” or “run the failing test and summarize the first actionable stack trace.”

Do not spawn subagents just because you can. Each one has its own context, memory, tools, and cost. Use them to protect the main decision context, not to parallelize vague work. The same rule applies to skills: move detailed, occasional playbooks out of CLAUDE.md and load them only when the workflow needs them.

Monitor Team Usage

For a solo developer, /usage and /context are often enough. For repeatable team workflows, OpenTelemetry turns cost, token counts, model choice, duration, and tool activity into shared data. Start with the console exporter before wiring a collector or dashboard.

export CLAUDE_CODE_ENABLE_TELEMETRY=1
export OTEL_METRICS_EXPORTER=console
export OTEL_LOGS_EXPORTER=console
export OTEL_METRIC_EXPORT_INTERVAL=10000
export OTEL_LOGS_EXPORT_INTERVAL=5000

claude

Track quality next to token usage. If tokens drop by 30% but review defects and regenerations go up, the process got worse. I usually compare usage, number of correction rounds, verification status, and post-merge fixes.

Practical Use Cases

Log Investigation

Pass the request ID, the latest errors, timestamps, expected behavior, and nearby lines. Start with a small slice, let Claude form a hypothesis, then widen only if the evidence is missing.

Code Review

Start with git diff --stat, changed files, test output, and review concerns. Split a large PR into security, performance, and compatibility passes so Claude does not repeatedly reread the whole patch.

Multilingual Publishing

Keep the canonical article decisions in the main session. Send translation, link checks, description length, and CTA checks into separate contexts. Return changed files and verification evidence, not every intermediate note.

Training Days

During a workshop, concurrent Claude Code usage is higher than normal. Give participants rules such as “read five files first,” “paste the last 160 failure lines,” and “explain why scope needs to widen.” That keeps both cost and teaching flow predictable.

Pitfalls to Avoid

Writing a /cost-only guide. Use /usage as the main command and mention /cost and /stats as aliases.
Cutting context that is actually evidence. Reproduction steps, expected output, failing commands, and decisions must survive.
Turning CLAUDE.md into an operations notebook. Rare workflows belong in skills or separate docs.
Enabling every MCP server. Use /mcp and prefer CLI tools when they give the same answer with less context.
Hiding failures with hooks. Keep the full log somewhere and show Claude the filtered slice.
Assuming subagents are automatically cheaper. They isolate noise, but they still consume their own context.

A Small Handoff Script

This dependency-free Node.js script creates a compact brief from changed files, diff size, and a test log.

#!/usr/bin/env node
import { execFileSync } from "node:child_process";
import { existsSync, readFileSync } from "node:fs";

function git(args) {
  return execFileSync("git", args, { encoding: "utf8" }).trim();
}

const testLogPath = process.argv[2];
const changedFiles = git(["diff", "--name-only"])
  .split(/\r?\n/)
  .filter(Boolean);
const diffStat = git(["diff", "--stat"]);
const testLog = testLogPath && existsSync(testLogPath)
  ? readFileSync(testLogPath, "utf8")
  : "";
const failures = testLog
  .split(/\r?\n/)
  .filter((line) => /(FAIL|ERROR|Error|failed|Assertion)/.test(line))
  .slice(0, 80);

console.log("# Claude handoff brief\n");
console.log("## Changed files");
console.log(changedFiles.length ? changedFiles.map((file) => `- ${file}`).join("\n") : "- None");
console.log("\n## Diff stat");
console.log(diffStat || "No diff");
console.log("\n## Test failures");
console.log(failures.length ? failures.map((line) => `- ${line}`).join("\n") : "- No matching failure lines");

node scripts/claude-brief.mjs test.log > claude-brief.md

Official Docs Checked

Commands for /usage, /context, /compact, and /clear.
Manage costs effectively for cost tracking, MCP overhead, hooks, skills, subagents, and model effort.
How Claude remembers your project for CLAUDE.md and auto memory behavior.
Hooks reference and Monitoring for PreToolUse hooks and OpenTelemetry.

For adjacent workflow improvements, read Claude Code speed optimization, Claude Code permissions guide, and harness engineering guide.

What I Actually Tested

For this rewrite, I treated the Japanese article as the canonical source and checked the official docs before updating the localized versions. The biggest practical win was separating permanent memory, task state, and disposable logs. A compact brief with changed files, failing lines, and verification status produced better answers than pasting a full test run.

If you want reusable prompts and setup material, start with ClaudeCodeLab products. For team rollout, permissions, review policy, telemetry, and training, use Claude Code training and consultation.

Claude Code Token Optimization: Use /usage to Cut Cost Without Losing Quality

Start with /usage

Keep Always-Loaded Memory Short

Filter Inputs Before Claude Reads Them

Use Hooks Carefully

Split Verbose Work

Monitor Team Usage

Practical Use Cases

Log Investigation

Code Review

Multilingual Publishing

Training Days

Pitfalls to Avoid

A Small Handoff Script

Official Docs Checked

What I Actually Tested

Free PDF: Claude Code Cheatsheet

Level up your Claude Code workflow

Related Posts

Claude Code Permission Safety Ladder: Expand Access Without Losing Control

Claude Code Small PR Proof Pack: Make Tiny Changes Reviewable

Claude Code Review Gate Before Commit: Diff, Tests, Public URL, and CTA Checks

Related Products

50 Battle-Tested Claude Code Prompt Templates

The Complete Claude Code Setup & Configuration Guide