How to Run a Practical Claude Code Security Audit

Do not hand security to AI blindly

Claude Code can make security audit work faster. It can read routes, compare pull request diffs, summarize dependency risk, search for dangerous logging, and turn repeated review habits into a checklist. That is useful because real audits are slow: the risky code is often spread across auth middleware, billing jobs, environment variables, webhooks, CI, and logs.

The mistake is asking, “Check security,” and accepting the answer as proof. A security audit is not a vibe check. It is a scoped review of assets, threats, controls, evidence, and remaining risk. Claude Code is a reviewer and organizer, not the owner of business risk or disclosure decisions.

This guide shows a beginner-friendly workflow for using Claude Code in a real audit without publishing exploit details. It covers scope, asset inventory, threat modeling, dependency review, auth and session review, secrets review, input validation, logging and PII, CI gates, evidence receipts, and four common use cases. For external baselines, keep the OWASP Top 10, OWASP ASVS, NIST SSDF, and GitHub's secret scanning documentation nearby.

Start by locking the scope

The first deliverable is not a fix. It is a clear audit brief. Scope tells Claude Code what to inspect, what to ignore, which commands are allowed, and what counts as done. Without scope, the model tends to comment on obvious files while missing admin routes, billing flows, background jobs, or CI secrets.

Copy this brief into the session and fill in the blanks before asking for findings.

# Security Audit Brief

## Goal
- Find serious issues in authentication, authorization, secrets, input validation, logging, and dependencies before release
- Produce evidence and remaining-risk notes, not just suggested fixes

## Scope
- Repository:
- Branch or PR:
- Feature or workflow:
- Directories Claude Code may inspect:
- Directories Claude Code must not change:
- Commands Claude Code may run:

## Priority areas
- Authentication and session handling
- Authorization and roles
- Input validation and output encoding
- Secrets and environment variables
- Dependencies and licenses
- Logs, PII, and audit events
- CI gates that should block release

## Completion criteria
- Record findings in the risk register
- Include only minimal reproduction context
- Do not include live secrets, customer data, or dangerous exploit instructions
- Record commands, results, skipped areas, and human-review items in the evidence receipt

After the brief, ask Claude Code to map the application before editing it. A good first prompt is: “List routes, auth middleware, external APIs, environment variables, logging points, CI jobs, and high-risk files. Do not modify code yet.” That step gives the human reviewer a chance to catch missing areas before the audit narrows too early.
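Before sending that first prompt, it can help to confirm the brief has no blanks left in its Scope section. The following is a minimal sketch, assuming the brief is also kept as a plain object; the `AuditBrief` shape, the `missingBriefFields` helper, and the sample values are hypothetical and not part of Claude Code.

```typescript
// Hypothetical shape mirroring the Scope section of the audit brief.
interface AuditBrief {
  repository: string;
  branchOrPr: string;
  featureOrWorkflow: string;
  inspectDirs: string[];
  doNotChangeDirs: string[]; // may legitimately be empty
  allowedCommands: string[];
}

// Returns the names of scope fields that are still blank.
function missingBriefFields(brief: AuditBrief): string[] {
  const missing: string[] = [];
  if (brief.repository.trim() === "") missing.push("repository");
  if (brief.branchOrPr.trim() === "") missing.push("branchOrPr");
  if (brief.featureOrWorkflow.trim() === "") missing.push("featureOrWorkflow");
  if (brief.inspectDirs.length === 0) missing.push("inspectDirs");
  if (brief.allowedCommands.length === 0) missing.push("allowedCommands");
  return missing;
}

// Illustrative draft with two blanks left in it.
const draftBrief: AuditBrief = {
  repository: "example-org/example-app",
  branchOrPr: "",
  featureOrWorkflow: "billing webhooks",
  inspectDirs: ["src/routes", "src/jobs"],
  doNotChangeDirs: ["infra"],
  allowedCommands: [],
};

const briefGaps = missingBriefFields(draftBrief);
```

If `briefGaps` is not empty, fill those fields in before asking for findings.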

Build an asset inventory and threat model

Security only makes sense when you know what you are protecting. Assets include user records, billing state, API keys, admin panels, audit logs, uploaded files, support messages, and analytics data. A threat model is a small map of who may attack which entry point and what control should stop them.

| Asset | Stored in | Entry point | Likely threat | Required control | Owner |
| --- | --- | --- | --- | --- | --- |
| User email address | users table | signup, admin | unauthorized access, log leakage | authorization, PII masking, audit log | backend |
| Billing status | billing table, Stripe | webhook, admin | tampering, duplicate processing | signature verification, idempotency, role checks | backend |
| API keys | env, secret manager | CI, runtime | repository leak, log exposure | secret scanning, rotation, least privilege | platform |
| Admin console | /admin | browser | privilege escalation | MFA, admin role, operation logs | product |

Give this table to Claude Code and ask it to add trust boundaries and tests. A trust boundary is the line between data you control and data you must treat as untrusted. Browser input, webhooks, uploaded CSV files, Markdown, filenames, and third-party API responses should all be treated as untrusted until validated.
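The inventory can also be kept as data so that gaps are easy to spot. The sketch below assumes a simplified row shape and a hand-maintained list of untrusted entry points; `AssetEntry`, `UNTRUSTED_ENTRY_POINTS`, and the "Uploaded CSV files" row are illustrative assumptions, not a standard.

```typescript
// Hypothetical row shape based on the asset table above.
interface AssetEntry {
  asset: string;
  entryPoints: string[];
  controls: string[];
  owner: string;
}

// Assumed list of entry points that cross a trust boundary.
const UNTRUSTED_ENTRY_POINTS = ["browser", "webhook", "signup", "upload", "third-party API"];

// Entry points of this asset that receive untrusted data.
function untrustedEntryPoints(entry: AssetEntry): string[] {
  return entry.entryPoints.filter((p) => UNTRUSTED_ENTRY_POINTS.indexOf(p) !== -1);
}

// Assets reachable from untrusted input that have no control or no owner recorded.
function assetsNeedingReview(inventory: AssetEntry[]): string[] {
  return inventory
    .filter((e) => untrustedEntryPoints(e).length > 0)
    .filter((e) => e.controls.length === 0 || e.owner.trim() === "")
    .map((e) => e.asset);
}

const sampleInventory: AssetEntry[] = [
  {
    asset: "User email address",
    entryPoints: ["signup", "admin"],
    controls: ["authorization", "PII masking", "audit log"],
    owner: "backend",
  },
  {
    asset: "API keys",
    entryPoints: ["CI", "runtime"],
    controls: ["secret scanning", "rotation", "least privilege"],
    owner: "platform",
  },
  // Illustrative row with nothing filled in yet.
  { asset: "Uploaded CSV files", entryPoints: ["upload"], controls: [], owner: "" },
];

const needsReview = assetsNeedingReview(sampleInventory);
```

Rows returned by `assetsNeedingReview` are good candidates for the "add trust boundaries and tests" request.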

Give Claude Code specific review lanes

Dependency review should cover package.json, lockfiles, Docker images, GitHub Actions, runtime versions, and transitive packages. Do not treat npm audit as the whole audit. Ask Claude Code to separate reachable production risk from development-only noise, explain breaking-change risk, and list the tests that prove the update did not break the app.
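A first triage pass might sort advisories by where the package is declared. This is a rough sketch with hypothetical types and package names; it does not parse real `npm audit` output, and a direct production dependency is only a proxy for reachable risk, so each bucket still needs human confirmation.

```typescript
// Hypothetical, simplified advisory shape (not the npm audit JSON format).
interface DependencyAdvisory {
  packageName: string;
  severity: "low" | "moderate" | "high" | "critical";
}

// The two dependency groups read from package.json.
interface PackageManifest {
  dependencies: { [name: string]: string };
  devDependencies: { [name: string]: string };
}

type AdvisoryBucket = "direct-production" | "development-only" | "transitive-or-unknown";

function hasPackage(group: { [name: string]: string }, packageName: string): boolean {
  return Object.prototype.hasOwnProperty.call(group, packageName);
}

// Sort an advisory by where its package is declared in the manifest.
function bucketAdvisory(advisory: DependencyAdvisory, manifest: PackageManifest): AdvisoryBucket {
  if (hasPackage(manifest.dependencies, advisory.packageName)) return "direct-production";
  if (hasPackage(manifest.devDependencies, advisory.packageName)) return "development-only";
  // Not declared directly: check the lockfile to see which parent pulls it in.
  return "transitive-or-unknown";
}

// Placeholder package names for illustration only.
const sampleManifest: PackageManifest = {
  dependencies: { "example-http-lib": "^4.0.0" },
  devDependencies: { "example-lint-plugin": "^8.0.0" },
};

const sampleBucket = bucketAdvisory(
  { packageName: "example-lint-plugin", severity: "high" },
  sampleManifest
);
```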

Auth and session review should cover login, logout, password reset, MFA, OAuth, cookie flags, session expiration, CSRF protection, and server-side authorization. “Is the user logged in?” and “May this user perform this action?” are different questions. Admin APIs, billing actions, file downloads, user IDs in route params, and replayable webhooks are common places for authorization gaps.
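The difference between those two questions can be made concrete. The sketch below assumes a simple ownership rule with an admin override; `SessionUser`, `OwnedRecord`, and the rule itself are hypothetical and will differ per application.

```typescript
// Hypothetical session and record shapes.
interface SessionUser {
  id: string;
  role: "member" | "admin";
}

interface OwnedRecord {
  ownerId: string;
}

// Authentication: is anyone logged in at all?
function isAuthenticated(user: SessionUser | null): user is SessionUser {
  return user !== null;
}

// Authorization: may this specific user read this specific record?
function mayReadRecord(user: SessionUser | null, record: OwnedRecord): boolean {
  if (!isAuthenticated(user)) return false;
  return user.role === "admin" || user.id === record.ownerId;
}

const invoiceRecord: OwnedRecord = { ownerId: "user-1" };
const otherMember: SessionUser = { id: "user-2", role: "member" };

// Logged in, but not allowed: the gap a login-only check would miss.
const otherMemberAllowed = mayReadRecord(otherMember, invoiceRecord);
```

A regression test that logs in as a different member and expects a denial is often the cheapest proof that the second question is being asked.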

Secrets review should include .env, CI secrets, sample configs, README snippets, logs, screenshots, and test fixtures. Claude Code should never print live secrets into the chat. If it finds a likely secret, the report should include the file path, secret type, rotation recommendation, and owner, with the value redacted. On GitHub, verify secret scanning and push protection instead of relying only on local search.
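One way to keep values out of the report is to redact at the moment a finding is recorded. This is a minimal sketch; `SecretFinding`, `buildSecretFinding`, and the sample path and value are made up for illustration, and the value shown is not a real credential.

```typescript
// Hypothetical finding shape: path, type, owner, and a redacted snippet only.
interface SecretFinding {
  filePath: string;
  secretType: string;
  owner: string;
  redactedLine: string;
  recommendation: string;
}

// Replace every occurrence of the secret value in a piece of text.
function redactValue(text: string, secretValue: string): string {
  if (secretValue === "") return text;
  return text.split(secretValue).join("[REDACTED]");
}

// Build a finding that never stores the raw value.
function buildSecretFinding(
  filePath: string,
  secretType: string,
  owner: string,
  matchedLine: string,
  secretValue: string
): SecretFinding {
  return {
    filePath,
    secretType,
    owner,
    redactedLine: redactValue(matchedLine, secretValue),
    recommendation: "rotate the credential and move it to a secret manager",
  };
}

const sampleSecretFinding = buildSecretFinding(
  "config/sample.env",
  "API key",
  "platform",
  "API_KEY=fake-value-for-illustration",
  "fake-value-for-illustration"
);
```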

Input validation review should trace where data enters and where it is rendered, stored, or sent onward. Avoid asking Claude Code to generate dangerous payloads. Ask for boundary validation, normalization, escaping, type checks, and tests. For beginner teams, this is easier than memorizing vulnerability names because it follows real data flow.
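Boundary validation with normalization and type checks might look like the sketch below. It assumes a hypothetical invite endpoint; the `InviteRequest` shape, the allowed roles, and the deliberately simple email pattern are illustrative choices, not a recommendation for every application.

```typescript
type InviteRole = "member" | "admin";

interface InviteRequest {
  email: string;
  role: InviteRole;
}

type InviteParseResult =
  | { ok: true; value: InviteRequest }
  | { ok: false; errors: string[] };

// Turn an untrusted request body into a typed value, or a list of errors.
function parseInviteRequest(input: unknown): InviteParseResult {
  if (typeof input !== "object" || input === null) {
    return { ok: false, errors: ["body must be an object"] };
  }
  const body = input as { [key: string]: unknown };
  const errors: string[] = [];

  // Type check, then normalize, then validate the shape.
  let email = "";
  const rawEmail = body["email"];
  if (typeof rawEmail !== "string") {
    errors.push("email must be a string");
  } else {
    email = rawEmail.trim().toLowerCase();
    if (email.length > 254 || !/^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(email)) {
      errors.push("email is not valid");
    }
  }

  // Allow-list check: anything outside the known roles is rejected.
  let role: InviteRole | null = null;
  const rawRole = body["role"];
  if (rawRole === "member") role = "member";
  else if (rawRole === "admin") role = "admin";
  else errors.push("role must be member or admin");

  if (errors.length > 0 || role === null) return { ok: false, errors };
  return { ok: true, value: { email, role } };
}

const parsedInvite = parseInviteRequest({ email: "  Ada@Example.com ", role: "member" });
```

Tests for this kind of function can use ordinary malformed values (wrong types, missing fields, unknown roles) rather than attack payloads.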

Logging and PII review should look for emails, names, addresses, tokens, cookies, auth headers, payment IDs, and free-text support content in logs. Debug output that looked harmless during development can become a production retention problem. Logs should contain the minimum fields needed for debugging and audit events, not raw request bodies.
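A field allow-list is one way to keep raw request bodies out of logs. The sketch below assumes four permitted field names and a simple email mask; both are hypothetical, and masking the local part alone may not be enough under every privacy policy.

```typescript
// Assumed minimum set of fields needed for debugging and audit events.
const LOG_FIELD_ALLOWLIST = ["requestId", "route", "statusCode", "userId"];

// Keep only the first character of the local part.
function maskEmail(email: string): string {
  const at = email.indexOf("@");
  if (at <= 0) return "[masked]";
  return email.charAt(0) + "***" + email.slice(at);
}

// Copy only allow-listed fields; everything else is dropped by default.
function toLogEntry(fields: { [key: string]: unknown }): { [key: string]: unknown } {
  const entry: { [key: string]: unknown } = {};
  for (const key of LOG_FIELD_ALLOWLIST) {
    if (Object.prototype.hasOwnProperty.call(fields, key)) {
      entry[key] = fields[key];
    }
  }
  return entry;
}

const sampleLogEntry = toLogEntry({
  requestId: "req-123",
  route: "/webhooks/billing",
  statusCode: 200,
  email: "ada@example.com",
  authorization: "Bearer fake-token",
});

const maskedSample = maskEmail("ada@example.com"); // "a***@example.com"
```

Because unknown fields are dropped rather than filtered out one by one, a newly added request field stays out of the logs until someone deliberately allows it.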

Four use cases that fit this workflow

The first use case is a pre-release SaaS audit. Before launching billing, invites, admin tools, or organization settings, have Claude Code read the PR diff, migrations, routes, webhook handlers, tests, and CI. The output should be a risk register for the release meeting, not a broad promise that everything is safe.

The second use case is a GitHub repository handoff audit. When you inherit a repo, the riskiest knowledge is often in deployment scripts, CI secrets, environment names, manual runbooks, and unclear owners. Ask Claude Code to produce a first-week handoff checklist: what to rotate, what to document, what CI gates are missing, and which areas need human owner review.

The third use case is an incident follow-up audit. After an outage or suspected leak, do not fix only the line that triggered the incident. Ask Claude Code to search for similar patterns, check logs for unnecessary PII, propose regression tests, update CI gates, and draft a restrained internal follow-up. Public writeups should explain impact and remediation without spreading operational exploit details.

The fourth use case is security review for AI-generated PRs. AI-written code often looks clean while skipping authorization, error handling, audit logs, or tests. Ask Claude Code to review only the security impact of the diff: attack surface, permissions, secrets, personal data, dependency changes, and CI coverage. That narrow prompt usually produces better findings than a generic review.

Keep a risk register and evidence receipt

Findings need evidence. Severity alone is not enough. A useful risk register records the impact, where the concern was observed, what fix is recommended, who owns it, and what remains unverified.

| ID | Risk | Impact | Evidence | Recommended action | Priority | Status | Owner |
| --- | --- | --- | --- | --- | --- | --- | --- |
| SEC-001 | Missing authorization on admin API | Another user's data may be changed | routes/admin.ts has no role check | Add middleware and regression tests | High | Open | backend |
| SEC-002 | Webhook logs include email | Unnecessary PII retention | logs/webhook-sample.txt | Mask email and reduce retention | Medium | Open | platform |
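If the register is maintained as data, a small check can flag entries that carry a priority but no evidence or owner. This is a sketch under assumptions: the `RiskEntry` fields mirror the columns above, and the status values beyond "Open" are hypothetical.

```typescript
type RiskPriority = "High" | "Medium" | "Low";

// One row of the risk register.
interface RiskEntry {
  id: string;
  risk: string;
  impact: string;
  evidence: string;
  recommendedAction: string;
  priority: RiskPriority;
  status: "Open" | "Fixed" | "Accepted";
  owner: string;
}

// Names of the fields a finding still needs before it can support a decision.
function registerGaps(entry: RiskEntry): string[] {
  const gaps: string[] = [];
  if (entry.evidence.trim() === "") gaps.push("evidence");
  if (entry.impact.trim() === "") gaps.push("impact");
  if (entry.owner.trim() === "") gaps.push("owner");
  return gaps;
}

// Render an entry in the same column order as the register table.
function toRegisterRow(entry: RiskEntry): string {
  const cells = [
    entry.id,
    entry.risk,
    entry.impact,
    entry.evidence,
    entry.recommendedAction,
    entry.priority,
    entry.status,
    entry.owner,
  ];
  return "| " + cells.join(" | ") + " |";
}

const sec001: RiskEntry = {
  id: "SEC-001",
  risk: "Missing authorization on admin API",
  impact: "Another user's data may be changed",
  evidence: "routes/admin.ts has no role check",
  recommendedAction: "Add middleware and regression tests",
  priority: "High",
  status: "Open",
  owner: "backend",
};

const sec001Row = toRegisterRow(sec001);
```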

The evidence receipt prevents an audit from becoming a set of untraceable chat messages. Keep it short and factual.

# Security Audit Evidence Receipt

- Target:
- Date:
- Reviewer:
- Scope provided to Claude Code:
- Commands executed:
- Files or directories inspected:
- High-risk findings:
- Fixed items:
- Skipped or out-of-scope areas:
- Decisions requiring human review:
- Release recommendation:

Use a PR review checklist

For pull requests, paste a short checklist into the review prompt. This works especially well when a team is reviewing AI-generated changes, because it forces the discussion back to risk instead of style.

## Security PR Review

- [ ] Scope is clear and no unrelated files are changed
- [ ] Authenticated users cannot access another user's data
- [ ] Admin or billing actions require explicit authorization
- [ ] Inputs are validated at the boundary
- [ ] Outputs are escaped or rendered safely
- [ ] No secrets, tokens, cookies, or PII are printed to logs
- [ ] Dependency changes have a reason and test evidence
- [ ] Errors do not reveal internals
- [ ] CI includes lint, tests, typecheck, and security-relevant checks
- [ ] Risk register and evidence receipt are updated
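Teams that keep this checklist in the PR description could list the unchecked items automatically. The helper below is a hypothetical sketch that assumes the `- [ ]` and `- [x]` task-list syntax shown above; it is not tied to any GitHub API.

```typescript
// Return the text of every checklist item that is still unchecked.
function uncheckedItems(markdown: string): string[] {
  const items: string[] = [];
  for (const line of markdown.split("\n")) {
    const m = /^\s*- \[( |x|X)\] (.+)$/.exec(line);
    if (m !== null && m[1] === " ") {
      items.push(m[2].trim());
    }
  }
  return items;
}

const samplePrBody = [
  "## Security PR Review",
  "",
  "- [x] Scope is clear and no unrelated files are changed",
  "- [ ] Inputs are validated at the boundary",
  "- [ ] Errors do not reveal internals",
].join("\n");

const stillOpen = uncheckedItems(samplePrBody);
```

An empty result does not prove the PR is safe; it only shows that a reviewer ticked each box.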

Run a command checklist

The exact commands depend on the stack, but a Node or TypeScript project can often start with this set. Ask Claude Code to explain each command before running it and to stop if the result changes the risk picture.

git status --short
npm ci
npm audit --audit-level=moderate
npm run lint
npm run typecheck
npm test
rg -n 'TODO|FIXME|console\.log|process\.env|localStorage|innerHTML|dangerouslySetInnerHTML' src
git diff --check

The rg results are leads, not verdicts. process.env may be correct. innerHTML may be acceptable when sanitized. Teach Claude Code to separate evidence, assumption, and confirmation steps instead of labeling every match as a vulnerability.
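To keep matches from being read as verdicts, each one can be stored with an explicit status that starts at "lead". The sketch assumes the `path:line:text` layout that `rg -n` prints when searching a directory with POSIX-style paths; the `SearchLead` type and the sample paths are hypothetical.

```typescript
type LeadStatus = "lead" | "confirmed" | "dismissed";

// One search match, with a status a human must update.
interface SearchLead {
  file: string;
  line: number;
  text: string;
  status: LeadStatus;
}

// Parse one "path:line:text" row; returns null for anything else.
function parseRgLine(raw: string): SearchLead | null {
  const m = /^(.+?):(\d+):(.*)$/.exec(raw);
  if (m === null) return null;
  // Every match starts as a lead until someone confirms or dismisses it.
  return { file: m[1], line: Number(m[2]), text: m[3].trim(), status: "lead" };
}

// Parse a whole block of search output, skipping blank or unrecognized rows.
function parseRgOutput(output: string): SearchLead[] {
  const leads: SearchLead[] = [];
  for (const raw of output.split("\n")) {
    const lead = parseRgLine(raw);
    if (lead !== null) leads.push(lead);
  }
  return leads;
}

// Made-up search output for illustration.
const sampleRgOutput = [
  "src/api/user.ts:42:  console.log(req.body)",
  "src/ui/preview.tsx:17:  el.innerHTML = sanitized",
  "",
].join("\n");

const sampleLeads = parseRgOutput(sampleRgOutput);
```

Moving a lead to "confirmed" or "dismissed" should come with a note on what was checked, which fits naturally into the evidence receipt.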

Avoid the common failure modes

The biggest failure is vague prompting. “Check security” gives Claude Code permission to be shallow. The better request names the scope, assets, release deadline, allowed commands, high-risk flows, and output format. A report that clearly says “not reviewed” is more useful than a confident paragraph with no evidence.

The second failure is missing evidence. If no one can tell which commands ran, which files were inspected, and which areas were skipped, the audit cannot support a release decision. Use CI output, tests, diffs, logs, and config files to confirm the model’s claims.

A dangerous failure is publishing too much detail. Do not paste live URLs, tokens, customer data, or step-by-step exploit instructions into public issues or blog posts. Keep public language focused on impact and remediation. Store sensitive reproduction detail in a restricted tracker.

Another failure is ignoring secrets and logs. A system can have reasonable application code and still leak through CI logs, debug screenshots, sample .env files, or verbose error handling.

Finally, high-risk changes around auth, payments, personal data, and admin tools need human review. Claude Code can accelerate that review, but it should not replace accountability.

Put the audit into CI and team habits

One-time audits fade quickly. Add CI gates for lint, typecheck, tests, dependency audit, secret scanning, dangerous API search, and log policy checks. Not every warning needs to block every build. A practical rule is to block High risks, create dated tickets for Medium risks, and batch Low risks into maintenance review.
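That rule can be written down as a small policy function so that CI and humans apply it the same way. The names below are hypothetical, and how the priorities reach the function (for example from the risk register) is left as an assumption.

```typescript
type FindingPriority = "High" | "Medium" | "Low";

type GateAction = "block-release" | "ticket-with-due-date" | "batch-for-maintenance";

// Map a finding's priority to the action the team agreed on.
function gateActionFor(priority: FindingPriority): GateAction {
  if (priority === "High") return "block-release";
  if (priority === "Medium") return "ticket-with-due-date";
  return "batch-for-maintenance";
}

// The release is blocked if any open finding maps to a blocking action.
function shouldBlockRelease(openPriorities: FindingPriority[]): boolean {
  return openPriorities.some((p) => gateActionFor(p) === "block-release");
}

const releaseBlocked = shouldBlockRelease(["Low", "Medium", "High"]);
```

A CI step could exit with a non-zero status when `shouldBlockRelease` returns true, leaving Medium and Low findings to the ticketing process.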

For the Claude Code side, review permissions and tool access using the Claude Code permissions guide. Pair this article with the related guides on secrets management, security failure cases, and code review workflows. If your team wants help turning this into a repeatable review policy, CI gate, and repository-specific audit brief, start with Claude Code training and consultation.

Summary

Claude Code is most useful in security audits when the human defines the audit system around it: scope, asset inventory, threat model, review lanes, risk register, evidence receipt, CI gates, and human approval for high-risk decisions. Without that structure, the output may read well while leaving the real risk unexplained.

When Masa applied this workflow to ClaudeCodeLab pre-release reviews, the biggest improvement came from the risk register and evidence receipt. Having Claude Code inventory routes, logs, environment variables, and CI first made the human review sharper. The result was not a claim of perfect security; it was a clearer release decision with visible open questions.
