Advanced Prompt Engineering for Claude Code and Codex: Practical Task Briefs That Survive Real Work
Design Claude Code/Codex prompts with briefs, acceptance criteria, verification receipts, and safe iteration loops.
When Claude Code or Codex gives uneven results, the problem is often not the model. It is usually the way the work was handed over. Advanced prompt engineering is not a magic phrase. It is workflow design: scope, context, constraints, acceptance criteria, verification, and handoff in one package.
This guide shows how to build a reusable “prompt packet” for Claude Code and Codex. A beginner can copy the templates as-is, and the same structure holds up for team workflows, parallel workers, production code, and published content. If you have not built the basics yet, start with 5 tips for better prompts and CLAUDE.md best practices.
Tool behavior changes, so use official docs for product-specific claims. For Claude Code, start with the Claude Code overview, Memory, and Anthropic’s prompt engineering overview. For Codex, use the OpenAI Codex docs and AGENTS.md guidance.
Advanced Means A Work Contract
A weak prompt is not only short. It lacks decision rules. It does not say which files are safe to touch. It does not define what “done” means. It does not ask for proof. With those gaps, an agent has to guess.
A practical prompt should behave like a small work contract.
| Part | What to write | Failure when missing |
|---|---|---|
| Goal | The outcome you need | The answer looks plausible but solves the wrong problem |
| Scope | Files or areas that may and may not change | Unrelated refactors appear in the diff |
| Context | Docs, similar code, official references | The agent copies stale patterns or invents behavior |
| Constraints | Rules and forbidden changes | Dependencies, APIs, or tone drift unexpectedly |
| Acceptance criteria | How to judge completion | “Looks good” replaces reviewable quality |
| Verification | Commands and manual checks | Work ends without evidence |
Anthropic’s prompt engineering overview explicitly starts from success criteria and empirical tests. That advice matters even more for coding agents, because Claude Code and Codex can read files, edit files, run commands, and produce changes that look complete before they have actually been checked.
Put The Prompt Packet In A File
Typing a long instruction into chat every time is hard to review and easy to forget. A file such as prompt-packet.md makes the task repeatable. The next Bash snippet creates a minimal packet you can paste into a repository.
cat > prompt-packet.md <<'EOF'
# Goal
Improve one published article so it is practical, accurate, and ready for review.
# Scope
May edit:
- site/src/content/blog/example-article.mdx
Do not edit:
- heroImage
- slug
- unrelated articles
- package or deployment files
# Context to read
- AGENTS.md
- site/src/content/blog/claude-md-best-practices.mdx
- Official docs relevant to the article topic
# Constraints
- Preserve existing frontmatter keys unless this task explicitly changes them.
- Use copy-pasteable examples, not pseudocode.
- Avoid unsupported claims. Link to official docs for tool behavior.
- Keep paragraphs short enough for mobile reading.
# Acceptance criteria
- updatedDate is 2026-06-02.
- The article has at least three concrete use cases.
- The article names specific pitfalls and how to avoid them.
- The article includes an internal link, an official external link, and a natural CTA.
- The final section explains what was actually verified.
# Verification
- node scripts/check-code-fences.mjs
- node scripts/check-updated-article-quality.mjs
- Read the diff once as a critical reviewer.
# Return format
- Changed files
- Key improvements
- Checks run
- Residual risks
EOF
Then give the agent a short instruction: “Read prompt-packet.md first, inspect the target file and the listed context, then work only inside the stated scope.” The packet is not ceremony. It prevents the common failure where an agent tries to be helpful by editing neighboring pages, changing images, or refactoring code unrelated to the task.
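The scope boundary can also be checked mechanically once the agent finishes. The following is a minimal sketch, not part of the original workflow. It assumes you collect the changed paths yourself (for example from `git diff --name-only`) and copy the allowed list by hand from the packet's Scope section. The function name `findOutOfScope` is hypothetical.

```javascript
// Sketch: flag changed files that fall outside the packet's "May edit" list.
// Assumption: `changed` comes from something like `git diff --name-only`,
// and `allowed` is copied by hand from the Scope section of prompt-packet.md.
function findOutOfScope(changed, allowed) {
  const allowedSet = new Set(allowed);
  return changed.filter((file) => !allowedSet.has(file));
}

const allowed = ["site/src/content/blog/example-article.mdx"];
const changed = [
  "site/src/content/blog/example-article.mdx",
  "package.json",
];

const violations = findOutOfScope(changed, allowed);
if (violations.length > 0) {
  console.error("Out-of-scope changes: " + violations.join(", "));
} else {
  console.log("All changes are inside scope.");
}
```

Exact path matching is deliberately strict here. If your scope lists directories rather than files, a prefix comparison would be the natural adjustment.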
Lint The Prompt Before Using It
Prompts are prose, so quality can drift. A tiny checker catches missing structure before a weak prompt reaches an agent. Save this as check-prompt-packet.cjs.
// save as check-prompt-packet.cjs
const fs = require("node:fs");

// Packet path comes from the first CLI argument, with a default.
const file = process.argv[2] || "prompt-packet.md";
const text = fs.readFileSync(file, "utf8");

// Headings every packet must contain.
const required = [
"# Goal",
"# Scope",
"# Context to read",
"# Acceptance criteria",
"# Verification",
"# Return format",
];
const missing = required.filter((heading) => !text.includes(heading));

// The packet also needs an explicit boundary and at least one runnable check.
const hasDoNotTouch = /do not (edit|change|touch)/i.test(text);
const hasCommand = /npm run|npm test|pnpm |yarn |node scripts\//i.test(text);

// Report every gap, then exit non-zero so the script can gate other steps.
if (missing.length || !hasDoNotTouch || !hasCommand) {
console.error("Prompt packet is not ready.");
if (missing.length) console.error("Missing headings: " + missing.join(", "));
if (!hasDoNotTouch) console.error("Add an explicit do-not-touch boundary.");
if (!hasCommand) console.error("Add at least one verification command.");
process.exit(1);
}
console.log("Prompt packet looks actionable.");
Run it like this:
node check-prompt-packet.cjs prompt-packet.md
This is intentionally simple. In Masa’s article operations, the most common prompt failures were missing “do not touch” boundaries and missing proof at the end. The same thing happens in product code. If the request does not define completion, the reviewer has to judge by taste.
Turn Fuzzy Goals Into Acceptance Criteria
“Make it better”, “improve SEO”, and “make this production ready” are reasonable human intentions, but they are not reviewable. Convert them into pass/fail checks.
Weak prompt:
Improve this article and make the SEO stronger.
Useful prompt:
Rewrite the article as a practical guide for developers starting with Claude Code.
Acceptance criteria:
- The title includes "Claude Code" and "prompt engineering".
- The description is under 120 characters.
- The body includes at least three concrete use cases.
- There are at least two bad prompt examples and two improved versions.
- The article links to official docs and to related internal articles.
- The final section states what was actually verified.
- Run the code-fence check after editing and report the result.
For implementation work, make the criteria technical.
Acceptance criteria:
- Do not change the public API type.
- Add validation and user-visible error handling.
- Add or update at least one failing-path test.
- Run npm test and npm run build, then report results.
- Explain changed files and relevant files that were inspected but left unchanged.
The point is not to micromanage the agent. The point is to share the review standard before the work starts.
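Some of the article criteria above are mechanical enough to check in code. The sketch below covers only the title and description rules. It assumes the frontmatter has already been parsed into a plain object, and the function name `checkArticleCriteria` is hypothetical. Criteria such as "at least three concrete use cases" still need a human or agent review.

```javascript
// Sketch: turn two of the article criteria into pass/fail checks.
// Assumption: the caller has already parsed the frontmatter into
// a plain object with `title` and `description` strings.
function checkArticleCriteria(article) {
  const title = article.title.toLowerCase();
  return [
    {
      name: 'title includes "Claude Code" and "prompt engineering"',
      pass: title.includes("claude code") && title.includes("prompt engineering"),
    },
    {
      name: "description is under 120 characters",
      pass: article.description.length < 120,
    },
  ];
}

const results = checkArticleCriteria({
  title: "Advanced Prompt Engineering for Claude Code and Codex",
  description: "Design prompts with briefs, acceptance criteria, and verification receipts.",
});

for (const { name, pass } of results) {
  console.log((pass ? "PASS " : "FAIL ") + name);
}
```

Returning a named result per criterion, rather than a single boolean, keeps the output usable as part of a verification report.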
Manage The Context Budget
Claude Code’s Memory documentation explains that CLAUDE.md and auto memory are loaded as context, not as enforced configuration. Longer instructions do not automatically produce better compliance. They can bury the important rule.
Use three layers of context.
| Layer | Examples | Best home |
|---|---|---|
| Always needed | Build commands, naming rules, forbidden areas | CLAUDE.md or AGENTS.md |
| Task-specific | Target file, similar implementation, quality bar | prompt-packet.md |
| Read only if needed | Long specs, old meeting notes, raw logs | Mention the file, do not paste all of it |
The trap is dumping everything into the prompt. If you paste a full meeting transcript, old design notes, generated logs, and the current task into one message, the agent must infer priority. Write the read order instead.
Context to read in order:
1. AGENTS.md for project rules.
2. The target article.
3. One similar high-quality article for tone and structure.
4. Official docs only for tool behavior.
Ignore:
- Old brainstorming notes unless they contradict the current implementation.
- Unrelated product pages.
- Generated files and build output.
This small instruction reduces wandering. It also matters when several workers are editing nearby files in parallel, because the agent should not treat every file it sees as permission to edit.
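If several packets share the same read order, the section can be generated instead of retyped. This is a small sketch under the assumption that both lists are maintained by hand. The function name `renderContextSection` is hypothetical, and nothing here reads files from disk.

```javascript
// Sketch: render the read-order and ignore lists as one prompt section.
// Assumption: both lists are maintained by hand; nothing here reads the disk.
function renderContextSection(readOrder, ignore) {
  const lines = ["Context to read in order:"];
  readOrder.forEach((item, index) => {
    lines.push(`${index + 1}. ${item}`);
  });
  lines.push("Ignore:");
  for (const item of ignore) {
    lines.push(`- ${item}`);
  }
  return lines.join("\n");
}

const section = renderContextSection(
  ["AGENTS.md for project rules.", "The target article."],
  ["Generated files and build output."]
);
console.log(section);
```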
Separate Examples From Constraints
Examples show the pattern to imitate. Constraints define the boundary. Mixing them creates vague prompts.
Weak prompt:
Make this page like the productivity tips article.
Better prompt:
Reference style:
- Use site/src/content/blog-en/claude-code-productivity-tips.mdx only for section density and CTA placement.
- Do not copy its examples or claims.
Constraints:
- Keep this article focused on prompt engineering.
- Do not introduce pricing claims.
- Preserve heroImage and slug.
Avoid constraints that only say “do not mess this up.” Give the agent the positive boundary too. Instead of “do not add libraries”, write “use only existing dependencies.” Instead of “do not change too much”, write “edit only the listed file and preserve the public API.”
Ask For A Safe Iteration Loop
For serious work, one giant instruction is less reliable than a loop.
- Read the target, rules, and nearest example.
- Plan the intended change briefly.
- Edit only inside scope.
- Verify with commands and manual checks.
- Report changed files, proof, and remaining risk.
You can write the loop directly in the prompt.
Workflow:
- First inspect the target file and the nearest quality reference.
- If the change is larger than two files, explain the plan before editing.
- Edit only the files listed in Scope.
- After editing, run the Verification commands if feasible.
- End with a verification receipt, not a general summary.
A verification receipt is the proof of work: it records what changed, what was run, and what could not be checked. The full pattern is covered in the verification receipt guide, but the minimum format is enough for daily work.
Verification receipt:
- Changed files:
- Commands run:
- Results:
- Manual checks:
- Could not verify:
- Residual risks:
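A receipt with blank fields is easy to miss in review. The sketch below reports fields that were left empty. It assumes the receipt follows the single-line "- Field: value" layout shown above, and the name `findEmptyReceiptFields` is hypothetical. A value written on the lines after its label would be reported as empty, so treat this as a starting point rather than a complete parser.

```javascript
// Sketch: report receipt fields that were left empty.
// Assumption: the receipt uses the "- Field: value" layout shown above.
const RECEIPT_FIELDS = [
  "Changed files",
  "Commands run",
  "Results",
  "Manual checks",
  "Could not verify",
  "Residual risks",
];

function findEmptyReceiptFields(receiptText) {
  const lines = receiptText.split("\n").map((line) => line.trim());
  return RECEIPT_FIELDS.filter((field) => {
    const prefix = `- ${field}:`;
    const line = lines.find((candidate) => candidate.startsWith(prefix));
    // A field counts as empty when the line is missing or has no value.
    return !line || line.slice(prefix.length).trim() === "";
  });
}

const receipt = [
  "Verification receipt:",
  "- Changed files: site/src/content/blog/example-article.mdx",
  "- Commands run: node scripts/check-code-fences.mjs",
  "- Results: passed",
  "- Manual checks:",
  "- Could not verify: none",
  "- Residual risks: none known",
].join("\n");

console.log(findEmptyReceiptFields(receipt));
```

In this sample only "Manual checks" is reported, because every other field carries a value. Writing "none" explicitly is what separates a checked field from a skipped one.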
Four Concrete Use Cases
The first use case is bug fixing. Give the symptom, reproduction steps, expected behavior, logs, and allowed files. Ask the agent to explain the likely cause before editing, then make the smallest fix and add a failing-path test. This avoids cosmetic fixes that hide the symptom.
The second use case is a small feature. Describe the user-visible change, whether API or database shape may change, which existing UI pattern to follow, and which tests prove the feature. For a contact form category field, include the options, validation, submitted payload, analytics event, and localization behavior.
The third use case is article rewriting. Provide the reader, search intent, target length, required examples, failure cases, internal links, CTA, and official sources. On ClaudeCodeLab, this prevents a thin summary from replacing a hands-on guide. The “what I actually verified” section is especially important for AdSense quality and reader trust.
The fourth use case is code review. Review prompts need output shape more than creativity. Ask for severity, file and line, reproduction condition, fix direction, and missing tests. Instead of “review everything”, prioritize security, data loss, public API changes, and untested error paths.
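The output shape for the review use case can be pinned down as a small data structure. The sketch below is illustrative only: the severity labels, the field names, and the sample file paths are assumptions, not a format that Claude Code or Codex emits.

```javascript
// Sketch: sort review findings by severity and print one line each.
// Assumption: the severity labels and the finding fields are illustrative.
const SEVERITY_ORDER = { critical: 0, high: 1, medium: 2, low: 3 };

function formatFindings(findings) {
  return [...findings]
    .sort((a, b) => SEVERITY_ORDER[a.severity] - SEVERITY_ORDER[b.severity])
    .map((f) => `[${f.severity}] ${f.file}:${f.line} ${f.issue} (fix: ${f.fix})`);
}

const report = formatFindings([
  { severity: "low", file: "src/form.js", line: 12, issue: "unclear variable name", fix: "rename" },
  { severity: "high", file: "src/api.js", line: 40, issue: "untested error path", fix: "add a failing-path test" },
]);
report.forEach((line) => console.log(line));
```

Asking the agent to return findings in a fixed shape like this makes it possible to sort, filter, and compare reviews across runs.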
Failure Modes To Watch
Failure one: mixing goals. “Refactor, speed it up, improve SEO, and update the CTA” is several jobs. Start with one goal per packet.
Failure two: only writing prohibitions. “Do not break anything” does not tell the agent what it may do. Pair every forbidden area with an allowed area.
Failure three: treating stale knowledge as official behavior. Claude Code, Codex, memory, settings, and AGENTS.md behavior can change. Cite official docs for tool behavior and avoid stronger claims than the docs support.
Failure four: making verification optional. If a command cannot run, the agent should say that. Silence is worse than a reported gap.
Failure five: leaving good prompts trapped in chat. Move working instructions into prompt-packet.md, CLAUDE.md, or a team review checklist. For team continuation, pair this with team handoff rules.
CTA: Standardize Before You Scale
You do not need to rewrite this packet every day. Start with the free cheatsheet for daily checks, compare reusable prompts on the products and templates page, and use Claude Code training or consultation when your team needs repository-specific rules for CLAUDE.md, permissions, review, verification receipts, and article quality.
What I Verified
For this article, I checked the official Claude Code overview for the claim that Claude Code can read a codebase, edit files, and run commands. I checked the Memory documentation for the distinction between CLAUDE.md, auto memory, and enforced configuration. I also checked Anthropic’s prompt engineering overview for the need to define success criteria and empirical tests before improving prompts, and I aligned Codex project-instruction wording with OpenAI’s Codex docs and AGENTS.md guidance. The practical conclusion is to treat prompts as reviewable work contracts, not one-off chat messages.
About the Author
Masa
Engineer focused on practical Claude Code workflows. Runs claudecode-lab.com, a 10-language technical media site.