Advanced Prompt Engineering for Claude Code and Codex: Practical Task Briefs That Survive Real Work

When Claude Code or Codex gives uneven results, the problem is often not the model. It is usually the way the work was handed over. Advanced prompt engineering is not a magic phrase. It is workflow design: scope, context, constraints, acceptance criteria, verification, and handoff in one package.

This guide shows how to build a reusable “prompt packet” for Claude Code and Codex. Beginners can copy the templates as they are, but the packet is designed to hold up in team workflows, parallel workers, production code, and published content. If you have not built the basics yet, start with 5 tips for better prompts and CLAUDE.md best practices.

Tool behavior changes, so use official docs for product-specific claims. For Claude Code, start with the Claude Code overview, Memory, and Anthropic’s prompt engineering overview. For Codex, use the OpenAI Codex docs and AGENTS.md guidance.

Advanced Means A Work Contract

A weak prompt is not only short. It lacks decision rules. It does not say which files are safe to touch. It does not define what “done” means. It does not ask for proof. With those gaps, an agent has to guess.

A practical prompt should behave like a small work contract.

Part	What to write	Failure when missing
Goal	The outcome you need	The answer looks plausible but solves the wrong problem
Scope	Files or areas that may and may not change	Unrelated refactors appear in the diff
Context	Docs, similar code, official references	The agent copies stale patterns or invents behavior
Constraints	Rules and forbidden changes	Dependencies, APIs, or tone drift unexpectedly
Acceptance criteria	How to judge completion	“Looks good” replaces reviewable quality
Verification	Commands and manual checks	Work ends without evidence

Anthropic’s prompt engineering overview explicitly starts from success criteria and empirical tests. That advice matters even more for coding agents, because Claude Code and Codex can read files, edit files, run commands, and produce changes that look complete before they have actually been checked.

Put The Prompt Packet In A File

Typing a long instruction into chat every time is hard to review and easy to forget. A file such as prompt-packet.md makes the task repeatable. The next Bash snippet creates a minimal packet you can paste into a repository.

cat > prompt-packet.md <<'EOF'
# Goal
Improve one published article so it is practical, accurate, and ready for review.

# Scope
May edit:
- site/src/content/blog/example-article.mdx

Do not edit:
- heroImage
- slug
- unrelated articles
- package or deployment files

# Context to read
- AGENTS.md
- site/src/content/blog/claude-md-best-practices.mdx
- Official docs relevant to the article topic

# Constraints
- Preserve existing frontmatter keys unless this task explicitly changes them.
- Use copy-pasteable examples, not pseudocode.
- Avoid unsupported claims. Link to official docs for tool behavior.
- Keep paragraphs short enough for mobile reading.

# Acceptance criteria
- updatedDate is 2026-06-02.
- The article has at least three concrete use cases.
- The article names specific pitfalls and how to avoid them.
- The article includes an internal link, an official external link, and a natural CTA.
- The final section explains what was actually verified.

# Verification
- node scripts/check-code-fences.mjs
- node scripts/check-updated-article-quality.mjs
- Read the diff once as a critical reviewer.

# Return format
- Changed files
- Key improvements
- Checks run
- Residual risks
EOF

Then give the agent a short instruction: “Read prompt-packet.md first, inspect the target file and the listed context, then work only inside the stated scope.” The packet is not ceremony. It prevents the common failure where an agent tries to be helpful by editing neighboring pages, changing images, or refactoring code unrelated to the task.
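The scope boundary can also be checked mechanically once the agent reports back. The following is a minimal sketch, not part of any tool: it assumes you supply the changed paths yourself (for example, the lines printed by git diff --name-only), and the helper name findOutOfScope is hypothetical.

```javascript
// Hypothetical helper: report changed files that fall outside the packet's "May edit" list.
// Assumption: paths are repo-relative and written the same way in both lists.
function findOutOfScope(changedFiles, allowedFiles) {
  const allowed = new Set(allowedFiles);
  return changedFiles.filter((file) => !allowed.has(file));
}

// Example input, such as the output of `git diff --name-only` split into lines.
const changed = [
  "site/src/content/blog/example-article.mdx",
  "package.json",
];
const allowed = ["site/src/content/blog/example-article.mdx"];

const outOfScope = findOutOfScope(changed, allowed);
if (outOfScope.length) {
  console.error("Out-of-scope changes: " + outOfScope.join(", "));
} else {
  console.log("All changes are inside the stated scope.");
}
```

An exact-match allow-list is deliberately strict. If your scope is a directory rather than a file, you would need prefix matching instead.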

Lint The Prompt Before Using It

Prompts are prose, so quality can drift. A tiny checker catches missing structure before a weak prompt reaches an agent. Save this as check-prompt-packet.cjs.

// save as check-prompt-packet.cjs
const fs = require("node:fs");

const file = process.argv[2] || "prompt-packet.md";
const text = fs.readFileSync(file, "utf8");

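// Headings every packet must contain, matching the sections of the work contract.
const required = [
  "# Goal",
  "# Scope",
  "# Context to read",
  "# Constraints",
  "# Acceptance criteria",
  "# Verification",
  "# Return format",
];

// Compare whole lines so "# Goal" is not satisfied by "## Goal" or a passing mention.
const lines = text.split(/\r?\n/).map((line) => line.trim());
const missing = required.filter((heading) => !lines.includes(heading));
// A packet needs an explicit boundary and at least one runnable check.
const hasDoNotTouch = /do not (edit|change|touch)/i.test(text);
const hasCommand = /npm run|npm test|pnpm |yarn |node scripts\//i.test(text);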

if (missing.length || !hasDoNotTouch || !hasCommand) {
  console.error("Prompt packet is not ready.");
  if (missing.length) console.error("Missing headings: " + missing.join(", "));
  if (!hasDoNotTouch) console.error("Add an explicit do-not-touch boundary.");
  if (!hasCommand) console.error("Add at least one verification command.");
  process.exit(1);
}

console.log("Prompt packet looks actionable.");

Run it like this:

node check-prompt-packet.cjs prompt-packet.md

This is intentionally simple. In Masa’s article operations, the most common prompt failures were missing “do not touch” boundaries and missing proof at the end. The same thing happens in product code. If the request does not define completion, the reviewer has to judge by taste.

Turn Fuzzy Goals Into Acceptance Criteria

“Make it better”, “improve SEO”, and “make this production ready” are reasonable human intentions, but they are not reviewable. Convert them into pass/fail checks.

Weak prompt:

Improve this article and make the SEO stronger.

Useful prompt:

Rewrite the article as a practical guide for developers starting with Claude Code.

Acceptance criteria:
- The title includes "Claude Code" and "prompt engineering".
- The description is under 120 characters.
- The body includes at least three concrete use cases.
- There are at least two bad prompt examples and two improved versions.
- The article links to official docs and to related internal articles.
- The final section states what was actually verified.
- Run the code-fence check after editing and report the result.

For implementation work, make the criteria technical.

Acceptance criteria:
- Do not change the public API type.
- Add validation and user-visible error handling.
- Add or update at least one failing-path test.
- Run npm test and npm run build, then report results.
- Explain changed files and relevant files that were inspected but left unchanged.

The point is not to micromanage the agent. The point is to share the review standard before the work starts.
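Some criteria are mechanical enough to check with a script instead of by eye. This sketch covers only the title and description rules from the article example earlier in this section. It assumes the title and description have already been read from the frontmatter, and the function name checkArticleCriteria is hypothetical.

```javascript
// Hypothetical check for the title and description criteria of the article example.
// Assumption: title and description were already extracted from the frontmatter.
function checkArticleCriteria({ title, description }) {
  const lowerTitle = title.toLowerCase();
  return [
    { name: 'title includes "Claude Code"', pass: lowerTitle.includes("claude code") },
    { name: 'title includes "prompt engineering"', pass: lowerTitle.includes("prompt engineering") },
    { name: "description is under 120 characters", pass: description.length < 120 },
  ];
}

const results = checkArticleCriteria({
  title: "Prompt Engineering for Claude Code",
  description: "A practical guide to task briefs.",
});

for (const { name, pass } of results) {
  console.log((pass ? "PASS " : "FAIL ") + name);
}
```

Criteria such as "at least three concrete use cases" still need a human or agent reading the text. The script only removes the checks that never needed judgment.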

Manage The Context Budget

Claude Code’s Memory documentation explains that CLAUDE.md and auto memory are loaded as context, not as enforced configuration. Longer instructions do not automatically produce better compliance. They can bury the important rule.

Use three layers of context.

Layer	Examples	Best home
Always needed	Build commands, naming rules, forbidden areas	CLAUDE.md or AGENTS.md
Task-specific	Target file, similar implementation, quality bar	prompt-packet.md
Read only if needed	Long specs, old meeting notes, raw logs	Mention the file, do not paste all of it

The trap is dumping everything into the prompt. If you paste a full meeting transcript, old design notes, generated logs, and the current task into one message, the agent must infer priority. Write the read order instead.

Context to read in order:
1. AGENTS.md for project rules.
2. The target article.
3. One similar high-quality article for tone and structure.
4. Official docs only for tool behavior.

Ignore:
- Old brainstorming notes unless they contradict the current implementation.
- Unrelated product pages.
- Generated files and build output.

This small instruction reduces wandering. It also matters when several workers are editing nearby files in parallel, because the agent should not treat every file it sees as permission to edit.
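If the read order lives in the packet, a short script can list it back and warn when it grows. This is a rough sketch under stated assumptions: the helper name listSectionItems is hypothetical, sections are assumed to start with a top-level "# " heading, and the budget of five entries is an arbitrary number to tune, not a documented limit.

```javascript
// Hypothetical helper: list the bullet or numbered items under one heading of a packet.
// Assumption: sections are separated by top-level "# " headings.
function listSectionItems(packetText, heading) {
  const lines = packetText.split(/\r?\n/);
  const start = lines.findIndex((line) => line.trim() === heading);
  if (start === -1) return [];
  const items = [];
  for (const line of lines.slice(start + 1)) {
    if (line.startsWith("# ")) break; // the next top-level heading ends the section
    const match = line.match(/^\s*(?:-|\d+\.)\s+(.*)$/);
    if (match) items.push(match[1].trim());
  }
  return items;
}

const packet = [
  "# Context to read",
  "1. AGENTS.md for project rules.",
  "2. The target article.",
  "",
  "# Constraints",
  "- Use only existing dependencies.",
].join("\n");

const contextItems = listSectionItems(packet, "# Context to read");
console.log(contextItems);

// Assumed budget of 5 entries; adjust it to your own project.
if (contextItems.length > 5) {
  console.error("Context list is long. Move optional files to the read-only-if-needed layer.");
}
```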

Separate Examples From Constraints

Examples show the pattern to imitate. Constraints define the boundary. Mixing them creates vague prompts.

Weak prompt:

Make this page like the productivity tips article.

Better prompt:

Reference style:
- Use site/src/content/blog-en/claude-code-productivity-tips.mdx only for section density and CTA placement.
- Do not copy its examples or claims.

Constraints:
- Keep this article focused on prompt engineering.
- Do not introduce pricing claims.
- Preserve heroImage and slug.

Avoid constraints that only say “do not mess this up.” Give the agent the positive boundary too. Instead of “do not add libraries”, write “use only existing dependencies.” Instead of “do not change too much”, write “edit only the listed file and preserve the public API.”
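A rough lint can flag a constraints list that contains nothing but prohibitions. The sketch below is a heuristic, not a rule from any tool. It assumes a prohibition is a line starting with "Do not", "Don't", "Never", or "Avoid", and the function name classifyConstraints is hypothetical.

```javascript
// Hypothetical lint: split constraint lines into prohibitions and positive boundaries.
// Assumption: a prohibition starts with "Do not", "Don't", "Never", or "Avoid".
function classifyConstraints(constraintLines) {
  const prohibitionPattern = /^(do not|don't|never|avoid)\b/i;
  const prohibitions = [];
  const positives = [];
  for (const line of constraintLines) {
    const text = line.trim();
    if (!text) continue; // skip blank lines
    (prohibitionPattern.test(text) ? prohibitions : positives).push(text);
  }
  return { prohibitions, positives };
}

const sample = classifyConstraints([
  "Keep this article focused on prompt engineering.",
  "Do not introduce pricing claims.",
  "Preserve heroImage and slug.",
]);

if (sample.prohibitions.length > 0 && sample.positives.length === 0) {
  console.error("Only prohibitions found. Add at least one positive boundary.");
} else {
  console.log(sample.positives.length + " positive, " + sample.prohibitions.length + " prohibition(s).");
}
```

The check cannot tell whether a positive line actually covers the forbidden area, so treat a pass as a prompt to reread the list, not as proof.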

Ask For A Safe Iteration Loop

For serious work, one giant instruction is less reliable than a loop.

1. Read the target, rules, and nearest example.
2. Plan the intended change briefly.
3. Edit only inside scope.
4. Verify with commands and manual checks.
5. Report changed files, proof, and remaining risk.

You can write the loop directly in the prompt.

Workflow:
- First inspect the target file and the nearest quality reference.
- If the change touches more than two files, explain the plan before editing.
- Edit only the files listed in Scope.
- After editing, run the Verification commands if feasible.
- End with a verification receipt, not a general summary.

A verification receipt is the proof attached to the work: what changed, what was run, and what could not be checked. The full pattern is covered in the verification receipt guide, but the minimum format is enough for daily work.

Verification receipt:
- Changed files:
- Commands run:
- Results:
- Manual checks:
- Could not verify:
- Residual risks:
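The command and result lines of the receipt can be filled in by a script, which leaves the manual fields for a person. This is a minimal sketch, assuming each check is a command plus an argument list and that a non-zero exit code means failure. The names runChecks and formatReceipt are hypothetical, and the demo runs the Node binary itself only because it is guaranteed to exist wherever the script runs.

```javascript
const { spawnSync } = require("node:child_process");

// Hypothetical runner: execute each verification command and record the result.
// Assumption: each check is { command, args } and a non-zero exit code means failure.
function runChecks(checks) {
  return checks.map(({ command, args }) => {
    const result = spawnSync(command, args, { encoding: "utf8" });
    return {
      command: [command, ...args].join(" "),
      // status is null when the process could not start or was killed by a signal
      passed: result.status === 0,
      exitCode: result.status,
    };
  });
}

// Fill the machine-checkable fields and leave the manual ones empty for the reviewer.
function formatReceipt(changedFiles, results) {
  const outcomes = results.map(
    (r) => r.command + " " + (r.passed ? "passed" : "failed (exit " + r.exitCode + ")")
  );
  return [
    "Verification receipt:",
    "- Changed files: " + changedFiles.join(", "),
    "- Commands run: " + results.map((r) => r.command).join("; "),
    "- Results: " + outcomes.join("; "),
    "- Manual checks:",
    "- Could not verify:",
    "- Residual risks:",
  ].join("\n");
}

// Demo with a command that exists wherever Node runs: the Node binary itself.
const demoResults = runChecks([{ command: process.execPath, args: ["--version"] }]);
console.log(formatReceipt(["site/src/content/blog/example-article.mdx"], demoResults));
```

In a real repository you would replace the demo entry with the commands from the packet's Verification section, such as the check scripts listed there.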

Four Concrete Use Cases

The first use case is bug fixing. Give the symptom, reproduction steps, expected behavior, logs, and allowed files. Ask the agent to explain the likely cause before editing, then make the smallest fix and add a failing-path test. This avoids cosmetic fixes that hide the symptom.

The second use case is a small feature. Describe the user-visible change, whether API or database shape may change, which existing UI pattern to follow, and which tests prove the feature. For a contact form category field, include the options, validation, submitted payload, analytics event, and localization behavior.

The third use case is article rewriting. Provide the reader, search intent, target length, required examples, failure cases, internal links, CTA, and official sources. On ClaudeCodeLab, this prevents a thin summary from replacing a hands-on guide. The “what I actually verified” section is especially important for AdSense quality and reader trust.

The fourth use case is code review. Review prompts need output shape more than creativity. Ask for severity, file and line, reproduction condition, fix direction, and missing tests. Instead of “review everything”, prioritize security, data loss, public API changes, and untested error paths.
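The output shape for a review can be written down as data so every finding carries the same fields. This is an illustrative sketch only: the severity labels, field names, and file paths are assumptions made up for the example, not a format required by Claude Code or Codex.

```javascript
// Hypothetical severity labels, most urgent first. Adjust to your team's scale.
const SEVERITY_ORDER = ["critical", "high", "medium", "low"];

// Return a new array sorted by severity so the riskiest findings are read first.
function sortFindings(findings) {
  const rank = (finding) => {
    const index = SEVERITY_ORDER.indexOf(finding.severity);
    return index === -1 ? SEVERITY_ORDER.length : index; // unknown labels sort last
  };
  return [...findings].sort((a, b) => rank(a) - rank(b));
}

// Illustrative findings; the file paths and details are made up.
const findings = [
  {
    severity: "low",
    file: "src/form.js",
    line: 12,
    condition: "Label text is inconsistent",
    fix: "Reuse the shared label",
    missingTest: "none",
  },
  {
    severity: "critical",
    file: "src/api/contact.js",
    line: 48,
    condition: "Empty category is accepted",
    fix: "Validate before submit",
    missingTest: "failing-path test for an empty category",
  },
];

for (const f of sortFindings(findings)) {
  console.log("[" + f.severity + "] " + f.file + ":" + f.line + " " + f.condition);
}
```

Asking the agent to return findings in a fixed shape like this makes reviews comparable across runs and easy to filter by severity.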

Failure Modes To Watch

Failure one: mixing goals. “Refactor, speed it up, improve SEO, and update the CTA” is several jobs. Start with one goal per packet.

Failure two: only writing prohibitions. “Do not break anything” does not tell the agent what it may do. Pair every forbidden area with an allowed area.

Failure three: treating stale knowledge as official behavior. Claude Code, Codex, memory, settings, and AGENTS.md behavior can change. Cite official docs for tool behavior and avoid stronger claims than the docs support.

Failure four: making verification optional. If a command cannot run, the agent should say that. Silence is worse than a reported gap.

Failure five: leaving good prompts trapped in chat. Move working instructions into prompt-packet.md, CLAUDE.md, or a team review checklist. For team continuation, pair this with team handoff rules.

CTA: Standardize Before You Scale

You do not need to rewrite this packet every day. Start with the free cheatsheet for daily checks, compare reusable prompts on the products and templates page, and use Claude Code training or consultation when your team needs repository-specific rules for CLAUDE.md, permissions, review, verification receipts, and article quality.

What I Verified

For this article, I checked the official Claude Code overview for the claim that Claude Code can read a codebase, edit files, and run commands. I checked the Memory documentation for the distinction between CLAUDE.md, auto memory, and enforced configuration. I also checked Anthropic’s prompt engineering overview for the need to define success criteria and empirical tests before improving prompts, and I aligned Codex project-instruction wording with OpenAI’s Codex docs and AGENTS.md guidance. The practical conclusion is to treat prompts as reviewable work contracts, not one-off chat messages.
