Claude Code Token Optimization: Cutting Cost with /usage While Protecting Quality

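Token optimization in Claude Code is not just cost cutting. As a conversation grows heavy, responses slow down, stale assumptions leak into the current task, and you pay again to re-read logs and diffs that no longer matter. The goal is not to use as few tokens as possible but to get the same quality reliably from a smaller context.

As of June 2026, /usage is the right starting point for working documentation. The official command reference describes /usage as the command that shows session cost, plan usage limits, and activity stats. /cost and /stats are aliases, so they are not wrong, but centering team rules and articles on /usage confuses readers less.

This article is organized in four steps: look at usage first, shrink the default context, split noisy work into separate contexts, and monitor recurring team work. For the terms used here: a hook is a script that runs automatically at a specific point in the Claude Code lifecycle, a subagent is a separate agent context that takes on a narrow job, and a harness is the scaffolding that lets an agent work safely and repeatably.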

flowchart LR
  A["See: /usage and /context"] --> B["Reduce: CLAUDE.md and input scope"]
  B --> C["Split: hooks / skills / subagents"]
  C --> D["Measure: OpenTelemetry and team rules"]

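Start with /usage

You do not need to check /usage after every prompt. It is enough to check when a session has slowed down, after a large tool output, before a handoff, and after finishing a workflow you plan to repeat. Looking at /context alongside it lets you tell whether the cost comes from the current task, from old conversation history, or from memory files and tool context.

Keep billing separate from workflow improvement. The Session cost in /usage is a local estimate based on token counts. API users should check the Usage page in the Claude Console for the actual billing basis. Subscription users should not read the session dollar figure as a billed amount; it is safer to look at the plan usage bar and activity attribution.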
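To make "a local estimate based on token counts" concrete, the sketch below multiplies token counts by per-million-token rates. The rates, the token categories, and the function name `estimateSessionCost` are all hypothetical values chosen for illustration; they are not real pricing and not how Claude Code computes its figure.

```javascript
// Hypothetical USD rates per million tokens. These are NOT real prices.
const HYPOTHETICAL_RATES = { input: 3, output: 15, cacheRead: 0.3 };

// Estimate a session cost from token counts. Missing categories count as 0.
function estimateSessionCost(tokens, rates = HYPOTHETICAL_RATES) {
  const perMillion = (count, rate) => (count / 1_000_000) * rate;
  return (
    perMillion(tokens.input ?? 0, rates.input) +
    perMillion(tokens.output ?? 0, rates.output) +
    perMillion(tokens.cacheRead ?? 0, rates.cacheRead)
  );
}

const estimate = estimateSessionCost({
  input: 400_000,
  output: 50_000,
  cacheRead: 2_000_000,
});
console.log(`Local estimate: $${estimate.toFixed(2)}`);
```

A number produced this way is useful for comparing one workflow against another, which is why it should inform workflow changes rather than be read as an invoice.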

# Run inside the Claude Code chat
/usage
/context

# When the conversation is long but the task state must carry over
/compact Preserve changed files, failing tests, decisions, and unresolved questions.

# When moving on to unrelated work
/clear

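/compact summarizes and keeps the important state; /clear starts a fresh context. Clear too early and important decisions are lost; never clear and the next task keeps paying for the previous context.

Keep always-loaded memory short

Token usage does not grow only from the sentence you just typed. Conversation history, CLAUDE.md, auto memory, logs, tool output, MCP servers, and earlier research notes all contribute. Sorting information into three kinds makes the call easier.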

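Kind	Where it goes	Examples
Needed every time	A short CLAUDE.md	build, test, hard prohibitions
Needed for this task	The conversation and `/compact`	changed files, failing tests, open decisions
Look at once, then discard	Filtered command output	long logs, generated diffs, full search results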

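The official memory documentation explains that CLAUDE.md is loaded into context. The longer the file, the larger the fixed cost of every session. Splitting it into files with imports helps organization, but if those files are loaded together at startup it saves no tokens. Keep only the short rules you need every day in CLAUDE.md.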

# CLAUDE.md

## Project commands
- Build: npm run build
- Test: npm run test
- Type check: npm run typecheck

## Fast navigation
- API code: src/api/
- UI components: src/components/
- Tests: tests/

## Avoid by default
- Do not scan node_modules/, dist/, coverage/, or generated clients.
- Do not paste full logs. Ask for the failing command and relevant lines.

## Compact instructions
When compacting, preserve changed files, failing tests, decisions, credentials policy, and next actions.

A long PR review checklist, translation rules, a migration runbook, or a release playbook fits better in a skill or a separate document. A skill can be invoked only when it is needed, whereas CLAUDE.md costs tokens from the start of every session.
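The fixed cost of an always-loaded file can be approximated before deciding what to move out of it. The sketch below assumes roughly 4 characters per token, which is only a rule of thumb for English text and not the tokenizer Claude uses; `estimateTokens` and `fixedCostReport` are hypothetical helper names.

```javascript
// Rough heuristic only: assume about 4 characters per token.
const ASSUMED_CHARS_PER_TOKEN = 4;

// Approximate the token count of a piece of text.
function estimateTokens(text, charsPerToken = ASSUMED_CHARS_PER_TOKEN) {
  return Math.ceil(text.length / charsPerToken);
}

// The file is loaded in every session, so its cost multiplies by session count.
function fixedCostReport(memoryText, sessionsPerWeek) {
  const perSession = estimateTokens(memoryText);
  return { perSession, perWeek: perSession * sessionsPerWeek };
}

// A small stand-in for the contents of CLAUDE.md.
const sampleMemory = [
  "## Project commands",
  "- Build: npm run build",
  "- Test: npm run test",
].join("\n");

console.log(fixedCostReport(sampleMemory, 50));
```

In practice you would pass the real contents of CLAUDE.md instead of `sampleMemory`. The absolute number matters less than how it changes when a long checklist moves into a skill.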

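Trim input before Claude reads it

The most common waste is pasting a whole log or a whole diff as is. What Claude needs is evidence, not a dump. Leave the original on disk and pass along only the lines needed for the first judgment.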

# Pass only the request ID and the lines around errors, not the whole production log
tail -n 800 logs/app.log | grep -E -n -C 4 "request_id=abc123|ERROR|WARN"

# Check the size of the PR before reading the full diff
git diff --stat
git diff -- src/auth.ts tests/auth.test.ts

# Save the full test output and pass Claude only the lines around failures
npm test 2>&1 | tee test.log
grep -E -n -C 6 "FAIL|ERROR|Error|failed|Assertion" test.log | head -160

This approach does not hide information. It narrows the range Claude sees first. If that is not enough, add the next log section or file.
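The same slicing can be done in a script when grep is not available or the filter needs to live inside a larger tool. This is a minimal sketch of what `grep -n -C` does, with a line cap like `head`; the function name `sliceAroundMatches` and the sample log are made up for illustration.

```javascript
// Return matching lines plus `context` lines on each side, numbered like
// `grep -n`, and capped at `maxLines` like `head`.
function sliceAroundMatches(text, pattern, context = 4, maxLines = 160) {
  // Drop the g and y flags so RegExp.test() keeps no state between lines.
  const matcher = new RegExp(pattern.source, pattern.flags.replace(/[gy]/g, ""));
  const lines = text.split(/\r?\n/);
  const keep = new Set();
  lines.forEach((line, index) => {
    if (!matcher.test(line)) return;
    const start = Math.max(0, index - context);
    const end = Math.min(lines.length - 1, index + context);
    for (let i = start; i <= end; i += 1) keep.add(i);
  });
  return [...keep]
    .sort((a, b) => a - b)
    .slice(0, maxLines)
    .map((i) => `${i + 1}:${lines[i]}`);
}

// Hypothetical log content, only to show the shape of the result.
const sampleLog = [
  "boot ok",
  "request_id=abc123 start",
  "cache warm",
  "ERROR db timeout",
  "retrying",
  "done",
].join("\n");

console.log(sliceAroundMatches(sampleLog, /request_id=abc123|ERROR/, 1).join("\n"));
```

Overlapping context windows are merged because line indexes are collected in a set, so a line is never repeated in the slice.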

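Use hooks with care

If you keep repeating the same filtering, you can automate it with a hook. A hook must not hide failures, though. First use ask to confirm the rewritten command, and test thoroughly on your own machine before sharing it with the team.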

{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          {
            "type": "command",
            "command": "~/.claude/hooks/filter-test-output.sh"
          }
        ]
      }
    ]
  }
}

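#!/usr/bin/env bash
# PreToolUse hook: proposes a filtered version of common test commands.
# Requires jq. The hook payload arrives as JSON on stdin.
set -euo pipefail

input=$(cat)
# Extract the Bash command Claude is about to run (empty string if absent).
cmd=$(printf '%s' "$input" | jq -r '.tool_input.command // ""')

case "$cmd" in
  npm\ test*|pnpm\ test*|pytest*|go\ test*)
    # Keep failure lines with 6 lines of context, capped at 160 lines.
    # Caveat: by default the pipeline's exit status comes from head, not the test runner.
    filtered="$cmd 2>&1 | grep -E -n -C 6 '(FAIL|ERROR|Error|failed|Assertion)' | head -160"
    # "ask" shows the rewritten command to the user for confirmation before it runs.
    jq -n --arg command "$filtered" '{
      hookSpecificOutput: {
        hookEventName: "PreToolUse",
        permissionDecision: "ask",
        permissionDecisionReason: "Run test command with filtered output",
        updatedInput: { command: $command }
      }
    }'
    ;;
  *)
    # Any other command passes through unchanged.
    echo '{}'
    ;;
esac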

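This example requires jq. For team use, it is better to extend it into a wrapper that saves the full log to a file and preserves the original exit code, because a piped command otherwise reports the exit status of the last command in the pipe. The purpose of the hook is to reduce noise, not to reduce evidence.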
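One possible shape for such a wrapper is sketched below in Node.js (CommonJS). It is an illustration under assumptions, not part of Claude Code: the function name `runWithFilteredOutput`, the log path, and the demo command are hypothetical. It runs the command, writes the complete output to disk, returns only the failure lines, and passes the original exit code through.

```javascript
const { spawnSync } = require("node:child_process");
const { writeFileSync } = require("node:fs");
const os = require("node:os");
const path = require("node:path");

const FAILURE_PATTERN = /(FAIL|ERROR|Error|failed|Assertion)/;

// Run a command, keep the full output on disk, return only failure lines.
function runWithFilteredOutput(command, args, logPath, maxLines = 160) {
  const result = spawnSync(command, args, {
    encoding: "utf8",
    maxBuffer: 64 * 1024 * 1024, // allow large test output
  });
  // stdout and stderr are captured separately, so they are appended in
  // sequence here rather than interleaved in their original order.
  const fullOutput = `${result.stdout ?? ""}${result.stderr ?? ""}`;
  writeFileSync(logPath, fullOutput); // the full evidence stays in a file
  const filtered = fullOutput
    .split(/\r?\n/)
    .filter((line) => FAILURE_PATTERN.test(line))
    .slice(0, maxLines);
  // status is null if the process failed to start or died from a signal;
  // report that as a failure instead of hiding it.
  return { exitCode: result.status ?? 1, filtered };
}

// Demo with a stand-in "test run" that prints one failure and exits with 3.
const demoLogPath = path.join(os.tmpdir(), "claude-test-demo.log");
const demo = runWithFilteredOutput(
  process.execPath,
  ["-e", "console.log('ok 1'); console.error('FAIL auth test'); process.exit(3)"],
  demoLogPath,
);
console.log(demo.filtered.join("\n"));
console.log(`exit code preserved: ${demo.exitCode}`);
```

A real wrapper would end by setting `process.exitCode` to the returned `exitCode`, so that whatever called the wrapper still sees the test run as failed.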

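Split off verbose work

A subagent suits work where the result is short but the process is long. Narrow the scope, for example: “check the official docs and return only the facts that changed”, “return only blockers from 10 locale files”, or “run the failing test and summarize only the first actionable stack trace”.

But subagents are not free either. Each subagent has its own context, memory, tools, and cost. The purpose is to keep the main decision context clean, not to parallelize vague work indiscriminately. The same goes for skills: moving a long, rarely used procedure out of CLAUDE.md and loading it only when needed is the efficient approach.

Watch team usage with OpenTelemetry

When working alone, /usage and /context are often enough. For workflows that repeat across a team, OpenTelemetry turns cost, token count, model, duration, and tool activity into shared data. Start by checking with the console exporter, then connect a collector or dashboard.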

export CLAUDE_CODE_ENABLE_TELEMETRY=1
export OTEL_METRICS_EXPORTER=console
export OTEL_LOGS_EXPORTER=console
export OTEL_METRIC_EXPORT_INTERVAL=10000
export OTEL_LOGS_EXPORT_INTERVAL=5000

claude

Do not look at tokens alone. Look at usage together with correction rounds, verification passes, and post-merge fixes. If tokens went down but review defects went up, that is not a success.
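That rule can be written down as a check that a team applies to before and after numbers for one workflow. The record fields below (`tokens`, `correctionRounds`, `postMergeFixes`) and the function name `judgeChange` are hypothetical and do not come from the OpenTelemetry export; the sample numbers are invented to show the failure case.

```javascript
// Compare one workflow before and after a token-saving change.
// A change only counts as a success if quality signals did not get worse.
function judgeChange(before, after) {
  const tokensSaved = before.tokens - after.tokens;
  const qualityHeld =
    after.correctionRounds <= before.correctionRounds &&
    after.postMergeFixes <= before.postMergeFixes;
  return { tokensSaved, qualityHeld, success: tokensSaved > 0 && qualityHeld };
}

// Invented numbers: tokens dropped, but post-merge fixes went up.
const verdict = judgeChange(
  { tokens: 180_000, correctionRounds: 2, postMergeFixes: 0 },
  { tokens: 120_000, correctionRounds: 2, postMergeFixes: 1 },
);
console.log(verdict);
```

Here the change saves 60,000 tokens and is still reported as not successful, which is the case the paragraph above warns about.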

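Practical use cases

Log investigation

Pass only the request ID, recent errors, timestamps, expected behavior, and the surrounding lines. Form a hypothesis from the small slice and widen the range only when the evidence falls short.

Code review

Give git diff --stat, the changed files, the test output, and the review concern first. Splitting a large PR into security, performance, and compatibility passes reduces the cost of reading the same diff repeatedly.

Multilingual publishing

Keep the canonical article decisions in the main session, and send the checks for translation fluency, links, description length, and CTAs to a separate context. Changed files and verification evidence are enough as the return value.

Running a training day

In a workshop, concurrent usage is higher than usual. Rules such as “read only five files at first”, “failure logs up to 160 lines”, and “write the reason before expanding scope” keep cost and progress stable.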

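Failures to avoid

Pointing readers only to /cost. Today, use /usage as the main command and explain /cost and /stats as aliases.
Deleting the evidence too. Keep the reproduction, expected output, failing command, and decisions.
Turning CLAUDE.md into operations notes. Long, rarely used workflows belong in a skill or a separate document.
Enabling too many MCP servers. Check with /mcp, and use the CLI when the CLI is enough.
Hiding failures with a hook. Keep the full log in a file and show Claude the filtered slice.
Believing subagents are always cheaper. They are good for isolating noise, but they carry the cost of a separate context.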

A small handoff script

The Node.js script below has no dependencies and turns changed files, diff size, and test failure lines into a short brief.

#!/usr/bin/env node
import { execFileSync } from "node:child_process";
import { existsSync, readFileSync } from "node:fs";

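// Run a git subcommand and return its trimmed stdout.
function git(args) {
  return execFileSync("git", args, { encoding: "utf8" }).trim();
}

// Optional first argument: path to a saved test log.
const testLogPath = process.argv[2];
// Files with unstaged changes in the working tree.
const changedFiles = git(["diff", "--name-only"])
  .split(/\r?\n/)
  .filter(Boolean);
const diffStat = git(["diff", "--stat"]);
// A missing or absent log file is treated as an empty log.
const testLog = testLogPath && existsSync(testLogPath)
  ? readFileSync(testLogPath, "utf8")
  : "";
// Keep at most 80 lines that look like failures.
const failures = testLog
  .split(/\r?\n/)
  .filter((line) => /(FAIL|ERROR|Error|failed|Assertion)/.test(line))
  .slice(0, 80);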

console.log("# Claude handoff brief\n");
console.log("## Changed files");
console.log(changedFiles.length ? changedFiles.map((file) => `- ${file}`).join("\n") : "- None");
console.log("\n## Diff stat");
console.log(diffStat || "No diff");
console.log("\n## Test failures");
console.log(failures.length ? failures.map((line) => `- ${line}`).join("\n") : "- No matching failure lines");

node scripts/claude-brief.mjs test.log > claude-brief.md

Official documentation checked

Commands: /usage, /context, /compact, /clear.
Manage costs effectively: cost tracking, MCP, hooks, skills, subagents, model effort.
How Claude remembers your project: CLAUDE.md and auto memory.
Hooks reference and Monitoring: PreToolUse hooks and OpenTelemetry.

Related reading: the speed optimization guide, the permissions guide, and the harness engineering guide.

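Results verified in practice

For this rewrite, we treated the Japanese article as canonical, checked the official documentation, and then aligned the 10 locales. The habit with the largest effect was separating permanent memory, task state, and disposable logs. Giving only changed files, failure lines, and verification status first produced more stable answers than pasting the full test output.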

Reusable prompts and setup material are available to start from at ClaudeCodeLab products. If you also need to sort out team rollout, permissions, review policy, telemetry, and training, Claude Code training and consultation is a realistic next step.