Feature Flags with Claude Code: Safe Rollouts, Experiments, and Kill Switches
Feature flag practice for Claude Code: rollouts, experiments, kill switches, targeting, cleanup, and safe prompts.
Start with the Operating Rule, Not the Toggle
A feature flag is a runtime switch: code can be deployed while a capability stays off, rolls out gradually, or turns off quickly when something breaks. The beginner mistake is not the if statement. The real mistake is treating every flag the same. A release flag, an experiment flag, and a kill switch have different lifetimes, owners, metrics, and cleanup rules.
Claude Code can generate the UI branch in seconds. Production work needs more structure: a safe default, a targeting context, server/client boundaries, exposure logging, guardrail metrics, rollback instructions, and a date when short-lived flags are removed. In Masa’s site and small SaaS workflows, the best prompt is not “add feature flags.” It is “show me what fails, what I can switch off, and how we will know whether the rollout is healthy.”
Use primary docs for the mental model. OpenFeature separates the application-facing evaluation API from the provider behind it and uses an evaluation context for user, app, and environment data. LaunchDarkly documents flag use cases such as release flags, experiment flags, and kill switches. Unleash documents a lifecycle that moves flags through Define, Develop, Production, Cleanup, and Archived states. Anthropic’s Claude Code guidance also emphasizes giving the agent a verification path.
Primary references used for this refresh:
- OpenFeature introduction
- OpenFeature evaluation context
- LaunchDarkly creating flags
- LaunchDarkly targeting rules
- LaunchDarkly guarded rollouts
- Unleash feature flag lifecycle
- Claude Code best practices
Separate Release Flags, Experiments, and Kill Switches
Classify flags by lifetime before writing code. A release flag hides unfinished work, rolls it out to a growing audience, then disappears after 100% rollout. An experiment flag tests a hypothesis and must record exposure plus outcome metrics. A kill switch is a longer-lived safety control for external API failures, cost spikes, slow recommendation services, or risky automation.
| Use case | Flag type | Success metric | Failure action |
|---|---|---|---|
| Roll out a new SaaS checkout to 25% of Pro accounts | Release | Checkout completion, payment error rate | Turn off checkout_v2_release |
| Compare pricing page CTA copy | Experiment | Signup start rate, paid-intent clicks | Stop experiment and serve control |
| Move a blog affiliate block into the article body | Experiment | Product clicks, read completion | Return block to article footer |
| Disable recommendations during a vendor incident | Kill switch | p95 latency, 5xx rate | Turn off recommendations_enabled |
This article pairs well with A/B testing with Claude Code and analytics implementation with Claude Code, because flags without measurement become guesswork. For monetized sites, a button click is not the only outcome worth measuring: also protect affiliate revenue, AdSense quality, read completion, and paid consultation intent. If you want a packaged workflow, see the ClaudeCodeLab products or book a consultation.
A Minimal Config and Evaluator
Start with a small, vendor-neutral pattern. Your app should evaluate a flag by key and context and always fall back to a safe default value (in this sketch the default lives in the registry entry rather than at the call site); the backing provider can later become LaunchDarkly, Unleash, a provider behind OpenFeature, a JSON file, or an internal service. The following snippet is intentionally small enough to paste into flag-demo.ts and run with npx tsx flag-demo.ts.
type FlagValue = boolean | string | number;
type FlagKind = "release" | "experiment" | "kill_switch";
type Plan = "free" | "pro" | "enterprise";
type Role = "user" | "admin";
type Operator = "equals" | "in";

type FlagContext = {
  targetingKey: string;
  plan: Plan;
  country: string;
  role: Role;
  appVersion: string;
};

type FlagRule = {
  attribute: keyof Omit<FlagContext, "targetingKey">;
  operator: Operator;
  values: string[];
  value: FlagValue;
  percentage?: number;
};

type FlagConfig = {
  key: string;
  kind: FlagKind;
  enabled: boolean;
  defaultValue: FlagValue;
  offValue: FlagValue;
  owner: string;
  removeAfter?: string;
  rules: FlagRule[];
};
const registry: Record<string, FlagConfig> = {
  checkout_v2_release: {
    key: "checkout_v2_release",
    kind: "release",
    enabled: true,
    defaultValue: false,
    offValue: false,
    owner: "growth-platform",
    removeAfter: "2026-07-15",
    rules: [
      {
        attribute: "role",
        operator: "equals",
        values: ["admin"],
        value: true,
      },
      {
        attribute: "plan",
        operator: "in",
        values: ["pro", "enterprise"],
        value: true,
        percentage: 25,
      },
    ],
  },
  pricing_copy_2026_06: {
    key: "pricing_copy_2026_06",
    kind: "experiment",
    enabled: true,
    defaultValue: "control",
    offValue: "control",
    owner: "monetization",
    removeAfter: "2026-06-30",
    rules: [
      {
        attribute: "country",
        operator: "in",
        values: ["JP", "US", "DE"],
        value: "simple",
        percentage: 50,
      },
    ],
  },
  recommendations_enabled: {
    key: "recommendations_enabled",
    kind: "kill_switch",
    enabled: true,
    defaultValue: true,
    offValue: false,
    owner: "sre",
    rules: [],
  },
};
// Map a flag/user pair to a stable bucket in the range 0-99.
function bucketFor(flagKey: string, targetingKey: string): number {
  const input = `${flagKey}:${targetingKey}`;
  let hash = 0;
  for (const char of input) {
    // Simple 32-bit polynomial hash; >>> 0 keeps the value unsigned.
    hash = (hash * 31 + char.charCodeAt(0)) >>> 0;
  }
  return hash % 100;
}

// A rule matches when the attribute matches and, if a percentage is set,
// the user's bucket falls below that percentage.
function ruleMatches(
  flagKey: string,
  rule: FlagRule,
  context: FlagContext,
): boolean {
  const actual = String(context[rule.attribute]);
  const matched =
    rule.operator === "equals"
      ? actual === rule.values[0]
      : rule.values.includes(actual);
  if (!matched) return false;
  if (rule.percentage === undefined) return true;
  return bucketFor(flagKey, context.targetingKey) < rule.percentage;
}

export function evaluateFlag<T extends FlagValue = FlagValue>(
  key: string,
  context: FlagContext,
): T {
  const flag = registry[key];
  // Unknown key: fail closed.
  if (!flag) return false as T;
  // Disabled flag: serve the explicit off value.
  if (!flag.enabled) return flag.offValue as T;
  // First matching rule wins.
  for (const rule of flag.rules) {
    if (ruleMatches(flag.key, rule, context)) {
      return rule.value as T;
    }
  }
  return flag.defaultValue as T;
}
const demoContexts: FlagContext[] = [
  {
    targetingKey: "user_001",
    plan: "pro",
    country: "JP",
    role: "user",
    appVersion: "1.8.0",
  },
  {
    targetingKey: "user_002",
    plan: "free",
    country: "BR",
    role: "admin",
    appVersion: "1.8.0",
  },
];

for (const context of demoContexts) {
  console.log(context.targetingKey, {
    checkout: evaluateFlag<boolean>("checkout_v2_release", context),
    pricingCopy: evaluateFlag<string>("pricing_copy_2026_06", context),
    recommendations: evaluateFlag<boolean>(
      "recommendations_enabled",
      context,
    ),
  });
}
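The important details are boring on purpose. Unknown flags fail closed by returning false. Percentage rollout uses a stable targetingKey, not Math.random(), so a user stays in the same bucket across requests. Each temporary flag has an owner and a removeAfter date; the kill switch has an owner but no removal date because it is meant to stay. The registry can later move to a control plane, but the application contract stays small.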
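One gap in the evaluator above is that an unknown key returns false even when the caller asked for a string. A possible tightening, sketched here under assumptions, is to let the call site pass its own fallback and accept the evaluated value only when its type matches, similar in spirit to flag APIs that take a default value per call. The names evaluateWithFallback and rawFlags are hypothetical and stand in for the article's evaluateFlag and registry.

```typescript
type FlagValue = boolean | string | number;

// Hypothetical stand-in for already-evaluated flag values.
// A missing key means the flag is unknown.
const rawFlags: Record<string, FlagValue | undefined> = {
  checkout_v2_release: true,
  pricing_copy_2026_06: "simple",
};

// Overloads keep the return type aligned with the fallback's type.
function evaluateWithFallback(key: string, fallback: boolean): boolean;
function evaluateWithFallback(key: string, fallback: string): string;
function evaluateWithFallback(key: string, fallback: number): number;
function evaluateWithFallback(key: string, fallback: FlagValue): FlagValue {
  const value = rawFlags[key];
  // Unknown key or mismatched type: fail closed to the caller's fallback.
  if (value === undefined || typeof value !== typeof fallback) {
    return fallback;
  }
  return value;
}

const checkoutOn = evaluateWithFallback("checkout_v2_release", false);
const pricingCopy = evaluateWithFallback("pricing_copy_2026_06", "control");
// Typo in the key: the caller still gets its own safe default.
const typoKey = evaluateWithFallback("checkout_v2_relaese", false);
// Asking for a boolean from a string flag also falls back.
const wrongType = evaluateWithFallback("pricing_copy_2026_06", false);

console.log({ checkoutOn, pricingCopy, typoKey, wrongType });
```

The design choice is that the call site, not the registry, decides what "safe" means for that code path, so a typo can never hand a string consumer a boolean.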
Keep Server Evaluation and Client Display Separate
Evaluate anything related to billing, authorization, quota, inventory, or backend cost on the server. Client-side flags are fine for already-authorized UI copy, layout, onboarding hints, or low-risk visual changes. Do not ship secret targeting rules to the browser and do not rely on hidden buttons as access control.
type User = {
  id: string;
  plan: "free" | "pro" | "enterprise";
  role: "user" | "admin";
};

type RequestLike = {
  headers: {
    get(name: string): string | null;
  };
};

export function buildFlagContext(
  user: User,
  request: RequestLike,
): FlagContext {
  return {
    targetingKey: user.id,
    plan: user.plan,
    role: user.role,
    country: request.headers.get("x-country") ?? "US",
    appVersion: process.env.NEXT_PUBLIC_APP_VERSION ?? "dev",
  };
}

export function getServerFlagSnapshot(context: FlagContext) {
  return {
    checkoutV2: evaluateFlag<boolean>("checkout_v2_release", context),
    pricingCopy: evaluateFlag<string>("pricing_copy_2026_06", context),
  };
}

type PricingFlags = {
  pricingCopy: string;
};

export function PricingCta({ flags }: { flags: PricingFlags }) {
  const label =
    flags.pricingCopy === "simple"
      ? "Start with the free plan"
      : "Start free trial";
  return <a href="/signup">{label}</a>;
}
This keeps the React component dumb. The server produces a small snapshot, and the client renders it. Because PricingCta contains JSX, it belongs in a .tsx file, while the server helpers can stay in plain .ts. When prompting Claude Code, say that permission and billing checks must stay server-side and that client code may only consume a pre-evaluated snapshot.
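To make the server-side boundary concrete, here is a minimal sketch of a handler that re-checks the flag itself instead of trusting what the client rendered. Everything in it is an assumption for illustration: serverFlags stands in for a server-side evaluateFlag call, and handleCheckoutV2, CheckoutRequest, and CheckoutResponse are hypothetical names, not part of any framework.

```typescript
type CheckoutRequest = { userId: string; cartTotalCents: number };
type CheckoutResponse = { status: number; body: string };

// Hypothetical server-side flag results keyed by user id.
// In a real app this lookup would be a server-side flag evaluation.
const serverFlags = new Map<string, boolean>([["user_001", true]]);

function isCheckoutV2Enabled(userId: string): boolean {
  // Users with no evaluated value fail closed.
  return serverFlags.get(userId) ?? false;
}

// The handler enforces the flag on every request, so hiding or showing
// a button in the browser never decides who can reach the new checkout.
function handleCheckoutV2(request: CheckoutRequest): CheckoutResponse {
  if (!isCheckoutV2Enabled(request.userId)) {
    return { status: 404, body: "Not found" };
  }
  return {
    status: 200,
    body: `checkout v2 accepted a cart of ${request.cartTotalCents} cents`,
  };
}

console.log(handleCheckoutV2({ userId: "user_001", cartTotalCents: 4900 }));
console.log(handleCheckoutV2({ userId: "user_999", cartTotalCents: 4900 }));
```

Returning 404 rather than 403 is one option that avoids revealing an unreleased route; either works as long as the check runs on the server.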
Roll Out with Observability, Not Hope
A safe rollout is not simply “start at 1%.” It is a plan with a ramp schedule, metrics, and a rollback threshold. Unleash gradual rollouts combine percentage, stickiness, and constraints. LaunchDarkly guarded rollouts connect rollouts to metrics and can pause or roll back when regressions are detected. Even if you use a small internal evaluator, copy that operating model.
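A ramp schedule only works if users who saw the feature at a small percentage keep seeing it as the percentage grows. The sketch below checks that property for a bucket-below-percentage rule, reusing the same simple hash as the article's evaluator. The ramp steps and the synthetic user ids are assumptions for illustration, and the simple hash is not guaranteed to spread users evenly, so the printed counts are a sanity check rather than a promise.

```typescript
// Stable bucket in the range 0-99 for a flag/user pair.
function bucketFor(flagKey: string, targetingKey: string): number {
  let hash = 0;
  for (const char of `${flagKey}:${targetingKey}`) {
    hash = (hash * 31 + char.charCodeAt(0)) >>> 0;
  }
  return hash % 100;
}

// A user is in the rollout when their bucket is below the percentage.
function isInRollout(
  flagKey: string,
  targetingKey: string,
  percentage: number,
): boolean {
  return bucketFor(flagKey, targetingKey) < percentage;
}

// Hypothetical ramp schedule and synthetic users, for illustration only.
const rampSchedule = [1, 5, 25, 50, 100];
const sampleUsers = Array.from({ length: 1000 }, (_, i) => `user_${i}`);

function exposedUsers(percentage: number): string[] {
  return sampleUsers.filter((id) =>
    isInRollout("checkout_v2_release", id, percentage),
  );
}

// True when everyone exposed at one step is still exposed at the next.
function rampIsSticky(schedule: number[]): boolean {
  for (let i = 1; i < schedule.length; i++) {
    const later = new Set(exposedUsers(schedule[i]));
    const earlier = exposedUsers(schedule[i - 1]);
    if (!earlier.every((id) => later.has(id))) return false;
  }
  return true;
}

for (const percentage of rampSchedule) {
  const count = exposedUsers(percentage).length;
  console.log(`${percentage}% target: ${count} of ${sampleUsers.length} users`);
}
console.log("sticky across steps:", rampIsSticky(rampSchedule));
```

Because each user's bucket never changes, raising the percentage only adds users; lowering it removes the most recently added ones first.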
Track three layers. Exposure tells you who saw which flag value. The primary metric tells you whether the intended behavior improved. Guardrails tell you whether you damaged speed, errors, revenue quality, support load, or trust.
type FlagExposure = {
  flagKey: string;
  value: FlagValue;
  targetingKey: string;
  route: string;
  evaluatedAt: string;
};

export function trackFlagExposure(event: FlagExposure) {
  console.log(
    JSON.stringify({
      event_name: "feature_flag_exposure",
      ...event,
    }),
  );
}
For checkout, watch 5xx rate, payment failures, and support tickets. For a monetized blog, do not look only at affiliate clicks; watch read completion, bounce rate, Core Web Vitals, and paid-intent clicks. For AI features, watch token spend, p95 latency, and per-user quotas. A flag that raises clicks while lowering buyer quality is not a win.
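A rollback threshold is easier to act on when it is written down as data. The following sketch compares guardrail metrics for the flagged group against control and recommends a rollback when any metric rises past its allowed relative increase. The metric names come from the paragraph above, but the numbers, the thresholds, and the decideRollout helper are made up for illustration and are not taken from any vendor's guarded-rollout feature.

```typescript
type GuardrailMetric = {
  name: string;
  control: number; // value for users who did not get the flag
  treatment: number; // value for users who did
  maxRelativeIncrease: number; // e.g. 0.25 allows up to +25%
};

type RolloutDecision = {
  decision: "continue" | "rollback";
  breached: string[];
};

// Lower is better for every metric here (error rates, latency).
// Note: a control value of 0 means any positive treatment value breaches.
function decideRollout(metrics: GuardrailMetric[]): RolloutDecision {
  const breached = metrics
    .filter((m) => m.treatment > m.control * (1 + m.maxRelativeIncrease))
    .map((m) => m.name);
  return {
    decision: breached.length > 0 ? "rollback" : "continue",
    breached,
  };
}

// Hypothetical readings during a checkout ramp.
const healthy: GuardrailMetric[] = [
  { name: "5xx_rate", control: 0.01, treatment: 0.011, maxRelativeIncrease: 0.2 },
  { name: "p95_latency_ms", control: 400, treatment: 420, maxRelativeIncrease: 0.1 },
];
const regressed: GuardrailMetric[] = [
  { name: "payment_failure_rate", control: 0.02, treatment: 0.03, maxRelativeIncrease: 0.25 },
  { name: "p95_latency_ms", control: 400, treatment: 420, maxRelativeIncrease: 0.1 },
];

console.log(decideRollout(healthy));
console.log(decideRollout(regressed));
```

A real check would also need enough exposure events per group before trusting the comparison; this sketch leaves sample size out.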
Concrete Failure Modes
The first failure is random assignment on every page load. If a user can refresh from A to B, exposure and conversion data are corrupted. Use a stable targeting key.
The second failure is client-only entitlement. Hiding a premium button in React does not protect the backend API. Flags can shape UX, but they are not authorization.
The third failure is an unsafe default. A missing release flag should usually return false. If a typo serves true, you just launched by accident.
The fourth failure is never deleting temporary flags. Six months later, checkout_v2_release becomes mystery logic. Move release and experiment flags into cleanup as soon as the decision is made.
The fifth failure is over-nested rules. Parent flags, child flags, and overlapping percentage rollouts make it hard to explain who actually sees the feature. Keep dependencies rare and documented.
Safer Claude Code Prompts
Claude Code can read files, edit code, run tests, and continue through a task. Give it the safety boundaries and verification commands up front.
Add a feature flag workflow to this repository.
The first flag is checkout_v2_release for a staged rollout.
Constraints:
- Evaluate billing and authorization flags on the server.
- Unknown release flags must return false.
- Use a stable targetingKey for percentage rollout.
- Include owner and removeAfter in the flag registry.
- Do not modify unrelated files.
Required output:
- Minimal flag registry and evaluateFlag function
- Exposure event type
- At least three product use cases
- Failure examples and rollback steps
- Test commands that were run
Use a separate review prompt before merging:
Review this feature flag implementation.
Focus on defaults, server/client boundaries, stable bucketing,
missing exposure events, cleanup dates, and rollback behavior.
List findings by severity and point to exact files.
These prompts move Claude Code away from a decorative toggle and toward an operating model your team can maintain.
Cleanup Is Part of Shipping
Every feature flag starts aging the day it is created. Release flags should disappear after full rollout. Experiment flags should be removed after the winner is selected. Kill switches can stay, but they need an owner, a runbook, and alerts. Put owner, removeAfter, metrics, and the planned removal PR into your pull request template.
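Cleanup is easier to enforce when a script can fail a build over it. As a rough sketch, the check below scans flag metadata shaped like the article's registry and reports temporary flags that are past their removeAfter date or that never had one. The findCleanupIssues helper and the fixed "today" date are assumptions for illustration; kill switches are skipped because they are meant to be long-lived.

```typescript
type FlagKind = "release" | "experiment" | "kill_switch";

type FlagMeta = {
  key: string;
  kind: FlagKind;
  owner: string;
  removeAfter?: string; // ISO date, e.g. "2026-06-30"
};

function findCleanupIssues(flags: FlagMeta[], today: Date): string[] {
  const issues: string[] = [];
  for (const flag of flags) {
    // Kill switches stay; they need a runbook, not a removal date.
    if (flag.kind === "kill_switch") continue;
    if (!flag.removeAfter) {
      issues.push(`${flag.key}: no removeAfter date (owner: ${flag.owner})`);
      continue;
    }
    const due = new Date(`${flag.removeAfter}T00:00:00Z`);
    if (Number.isNaN(due.getTime())) {
      issues.push(`${flag.key}: unreadable removeAfter "${flag.removeAfter}"`);
    } else if (today.getTime() > due.getTime()) {
      issues.push(
        `${flag.key}: past removeAfter ${flag.removeAfter} (owner: ${flag.owner})`,
      );
    }
  }
  return issues;
}

// Metadata mirroring the three flags used in this article.
const flagMeta: FlagMeta[] = [
  { key: "checkout_v2_release", kind: "release", owner: "growth-platform", removeAfter: "2026-07-15" },
  { key: "pricing_copy_2026_06", kind: "experiment", owner: "monetization", removeAfter: "2026-06-30" },
  { key: "recommendations_enabled", kind: "kill_switch", owner: "sre" },
];

// Fixed date so the example is reproducible; a CI job would use new Date().
const cleanupIssues = findCleanupIssues(flagMeta, new Date("2026-07-01T00:00:00Z"));
for (const issue of cleanupIssues) console.log(issue);
```

Run in CI, a non-empty result can block merges or open a ticket for the owner, which turns the removeAfter field from a comment into a deadline.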
I verified the evaluator pattern in this article as a runnable TypeScript demo. The same targetingKey lands in the same bucket, unknown flags return the safe fallback, and the kill switch has an explicit off value. Masa’s practical note from monetized content work is simple: when a flag touches revenue, measure quality as well as clicks. Start with one release flag, one experiment flag, and one kill switch before introducing a full platform.
About the Author
Masa
Engineer focused on practical Claude Code workflows. Runs claudecode-lab.com, a 10-language technical media site.