A Development Estimation Workflow for Real Projects with Claude Code

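A development estimate is not a guess at how many days the work will take. A useful estimate spells out scope, assumptions, unknowns, risks, review time, and verification work, so that product, engineering, and the client are all looking at the same evidence.

The most common beginner mistake is estimating only the coding time. "Add a phone number field to the profile" looks small, but it can involve a database migration, API types, form validation, CSV export, audit logs, tests, release notes, and review scheduling. A migration here means a change to the database structure or its data. Code can be rolled back, but a data change is not always easy to undo, so it has to be estimated separately.

Claude Code cannot turn an estimate into an exact prediction. Its real value is that it reads the repository first, identifies the affected area, puts assumptions and unknowns into tables, and makes risks explicit. Do not ask it to invent a date; ask it to reduce omissions.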

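This approach matches the empiricism of the official Scrum Guide: transparency, inspection, and adaptation. To track several Issues or PRs as one batch of work, see GitHub's milestones documentation. For relative estimation, see Atlassian's estimation guide. For Claude Code basics, see the official overview and CLI reference.

Related reading on this site: the codebase navigation guide, the bug report template, and the code review checklist.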

Break down the estimation inputs first

Before discussing dates, sort the information into five categories.

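| Input | Plain meaning | Where Claude Code helps |
| --- | --- | --- |
| scope | What is and is not being done this time | Finding changed files, related features, and test scope |
| assumptions | Premises the estimate depends on | Product rules, permissions, release path |
| unknowns | What is not yet known | Missing files, open questions, owners |
| risk buffer | Margin for failure and waiting | Migration, authentication, billing, review queue |
| evidence | Why the numbers are credible | Past PRs, git history, test counts |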

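Do not just say "2 days." A better statement is: "Low 1.5 days, typically 3 days; high 5 days if the CRM integration is also in scope." Giving a range is not unprofessional. It shows what is already known and what may still change.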

flowchart LR
  A["Requirement"] --> B["repo scan"]
  B --> C["Task breakdown"]
  C --> D["Assumptions table"]
  D --> E["Risk register"]
  E --> F["Estimate range"]
  F --> G["Review prompt"]
  G --> H["Client summary"]

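Four real use cases

The first use case is a profile field change in a SaaS product. Adding phone_number can affect the database, API validation, UI, search, CSV export, audit logs, privacy handling, and tests. If the field counts as personal information, log redaction, deletion requests, and export requests also need to be estimated.

The second use case is a bug fix on a legacy page. "The filter does not work" can involve an old query utility, caching, URL parameters, test fixtures, and E2E verification. Having Claude Code map the affected area before starting the fix is safer than asking it to change code right away.

The third use case is an outsourcing proposal or an internal DX request. The client says "build an admin panel," but the work may actually include login, roles, auditing, CSV, notifications, permission handover, and operations documentation. Claude Code can read existing Issues and code and surface work that the request does not mention but that will very likely be needed.

The fourth use case is a content site or monetized page. A single MDX edit may also involve page speed, internal links, OGP, structured data, localization, screenshots, and an AdSense-friendly layout. The work needed to reach publishing quality is larger than the text edit itself.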

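Common failure modes

The first failure is treating a first reaction as a commitment. "About a day," said before scanning the code, is only a wish. Once the client remembers that wish, every more realistic estimate afterwards looks like a delay.

The second failure is treating review and verification as free. Six hours of implementation does not mean that testing, review fixes, staging checks, release notes, and deployment coordination take zero time.

The third failure is hiding unknowns in a vague buffer. "Add 30%" is meaningless unless you state why: an unconfirmed external API, an undesigned rollback plan, reviewer scheduling, insufficient test data, or unclear product rules.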
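
As a minimal sketch of that rule, a buffer can be kept as a list of entries that each carry a reason, so that unexplained padding becomes visible. The field names `hours` and `reason` and the sample values are hypothetical, chosen only for illustration.

```javascript
// Hypothetical buffer entries: each one is expected to state its reason.
const buffers = [
  { hours: 4, reason: "External API contract not confirmed" },
  { hours: 8, reason: "Reviewer availability this week" },
  { hours: 6, reason: "" }, // unexplained padding
];

// Separate buffer hours that have a stated reason from those that do not.
function auditBuffers(entries) {
  const hasReason = (b) => b.reason.trim().length > 0;
  const sum = (list) => list.reduce((total, b) => total + b.hours, 0);
  const explained = entries.filter(hasReason);
  const unexplained = entries.filter((b) => !hasReason(b));
  return {
    explainedHours: sum(explained),
    unexplainedHours: sum(unexplained),
    reasons: explained.map((b) => b.reason),
  };
}

const audit = auditBuffers(buffers);
console.log(`Explained buffer:   ${audit.explainedHours}h`);
console.log(`Unexplained buffer: ${audit.unexplainedHours}h`);
```

Any non-zero unexplained total is a prompt to either name the reason or remove the hours.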

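The fourth failure is trusting an AI number that comes without evidence. If Claude Code says "3 days" without listing files, past PRs, tests, and risks, that is only a nicely packaged paragraph.

Step 1: Scan the repository read-only

Start by looking at the shape of the repository. Do not edit anything yet.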

git status --short
git branch --show-current
git rev-parse --show-toplevel

rg --files \
  -g '!*node_modules*' \
  -g '!dist' \
  -g '!build' \
  -g '!coverage' \
  -g '!*.lock' \
  | sort \
  | head -200

find . -maxdepth 3 \( \
  -name package.json -o \
  -name pyproject.toml -o \
  -name go.mod -o \
  -name Cargo.toml -o \
  -name AGENTS.md -o \
  -name CLAUDE.md -o \
  -name README.md \
\) -print

Then ask Claude Code to produce a read-only map.

claude -p "
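Do a read-only repo scan.
Do not edit files, create files, or install dependencies.

Output:
1. apps, packages, services
2. runtime, test, and build entry points
3. generated directories to ignore
4. the 10 files that must be read for this estimate
5. reasons an estimate is not yet possible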
"

Step 2: Split into reviewable tasks

Split the work into units that can be reviewed and verified. Finer is not automatically better.

claude -p "
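Task: let users add, view, and edit a phone number in their profile.

Split this into reviewable work items.
Each item should include:
- name
- files likely involved
- implementation work
- test work
- done condition
- size: small, medium, large

Also list what is out of scope.
Put any guessed product rules separately under assumptions.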
"

This usually splits into DB, API, UI, test, and release work. If an item does not fit in a single PR, split it further before estimating.
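
One way to apply that rule mechanically is sketched below. It assumes each work item is recorded with the `size` field from the prompt plus a hypothetical `doneWhen` field for the done condition; the item names and values are illustrative, not output from Claude Code.

```javascript
// Illustrative work items; `doneWhen` is a hypothetical field name.
const workItems = [
  { name: "DB migration", size: "medium", doneWhen: "migration applies and rolls back on staging" },
  { name: "API validation", size: "small", doneWhen: "contract tests pass" },
  { name: "Profile UI and CSV export", size: "large", doneWhen: "" },
];

// Return the reasons an item should be split or clarified before estimating.
function needsSplit(item) {
  const reasons = [];
  if (item.size === "large") reasons.push("too large for one PR");
  if (item.doneWhen.trim() === "") reasons.push("no done condition");
  return reasons;
}

const flagged = workItems
  .map((item) => ({ name: item.name, reasons: needsSplit(item) }))
  .filter((entry) => entry.reasons.length > 0);

for (const entry of flagged) {
  console.log(`${entry.name}: ${entry.reasons.join(", ")}`);
}
```

Treating "large" as "does not fit in one PR" is a simplification; a team may prefer its own threshold.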

Step 3: Build an assumptions table

Assumptions are what most often cause disputes late in a project, so write them down.

| ID | Assumption | Why it matters | Owner | Confirm by |
| --- | --- | --- | --- | --- |
| A1 | Phone number is optional | Required fields change validation and migration | PM | 2026-06-05 |
| A2 | Web only, no mobile app change | Mobile release adds review and store delay | PM | 2026-06-05 |
| A3 | Existing user rows stay null | Backfill work is not included | Tech lead | 2026-06-06 |
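
If the table is also kept as data, open items can be checked with a few lines of code. The sketch below is only an illustration: the `confirmed` flag is a hypothetical extra column, and A3's owner is left empty on purpose to show the missing-owner case.

```javascript
// Rows mirror the table above; `confirmed` is a hypothetical extra column,
// and A3's owner is blanked here to demonstrate the check.
const assumptions = [
  { id: "A1", owner: "PM", confirmBy: "2026-06-05", confirmed: false },
  { id: "A2", owner: "PM", confirmBy: "2026-06-05", confirmed: true },
  { id: "A3", owner: "", confirmBy: "2026-06-06", confirmed: false },
];

// List unconfirmed assumptions and mark missing owners and passed deadlines.
// Dates are ISO strings (YYYY-MM-DD), so string comparison orders them correctly.
function openAssumptions(rows, today) {
  return rows
    .filter((row) => !row.confirmed)
    .map((row) => ({
      id: row.id,
      missingOwner: row.owner.trim() === "",
      overdue: row.confirmBy < today,
    }));
}

const open = openAssumptions(assumptions, "2026-06-06");
for (const item of open) {
  console.log(`${item.id}: missingOwner=${item.missingOwner}, overdue=${item.overdue}`);
}
```

Passing `today` as an argument keeps the check reproducible when the result is pasted into an Issue.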

claude -p "
Review this assumptions table.
Find assumptions that could break the estimate.
Add missing owners, deadlines, and questions.
Move anything risky into a risk register.
"

Step 4: Build a risk register

A risk register is a running list of the conditions that would throw the plan off.

| Risk | Trigger | Impact | Mitigation | Buffer |
| --- | --- | --- | --- | --- |
| DB rollback is unclear | migration changes existing rows | High | dry-run and rollback plan | 0.5-1 day |
| External CRM stores phone | CRM field mapping appears | Medium | check integration owner | 0.5 day |
| Review queue is full | no reviewer within 24h | Medium | book review slot early | 1 day |
| Test data is missing | no edge-case users | Medium | create fixtures first | 0.5 day |

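A buffer is not padding for laziness. It leaves room for unknowns, failures, and queueing time. Authentication, billing, personal data, deletion flows, and migrations generally need a more explicit risk description than pure UI changes.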
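
To see how much the register adds in total, the Buffer column can be summed. This is a rough sketch that assumes buffers are always written as "N day" or "N-M day", as in the table above; the `parseBuffer` helper is hypothetical.

```javascript
// Buffer column copied from the risk register above.
const riskRegister = [
  { risk: "DB rollback is unclear", buffer: "0.5-1 day" },
  { risk: "External CRM stores phone", buffer: "0.5 day" },
  { risk: "Review queue is full", buffer: "1 day" },
  { risk: "Test data is missing", buffer: "0.5 day" },
];

// Parse "0.5-1 day" or "1 day" into { min, max }, both in days.
function parseBuffer(text) {
  const match = /^(\d+(?:\.\d+)?)(?:-(\d+(?:\.\d+)?))?\s*days?$/.exec(text.trim());
  if (!match) throw new Error(`Unrecognized buffer: ${text}`);
  const min = Number(match[1]);
  const max = match[2] === undefined ? min : Number(match[2]);
  return { min, max };
}

// Sum the lower and upper bounds across all risks.
const totalBuffer = riskRegister
  .map((row) => parseBuffer(row.buffer))
  .reduce((sum, b) => ({ min: sum.min + b.min, max: sum.max + b.max }), { min: 0, max: 0 });

console.log(`Buffer if every risk triggers: ${totalBuffer.min}-${totalBuffer.max} days`);
```

For the four rows above the total is 2.5-3 days. That is the case where every risk triggers, so it is an upper reference point, not a number to add blindly.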

Step 5: Calculate the estimate range

The script below can be copied and run as-is, and it works well as a basis for team discussion.

// estimate-range.mjs
// hours: base effort per task; risk: uncertainty multiplier (1.0 = no extra uncertainty).
const tasks = [
  { name: "Repo scan and design check", hours: 2, risk: 1.1 },
  { name: "DB migration and schema", hours: 4, risk: 1.4 },
  { name: "API contract and validation", hours: 5, risk: 1.2 },
  { name: "Profile UI update", hours: 6, risk: 1.2 },
  { name: "Tests and fixtures", hours: 5, risk: 1.3 },
  { name: "Review fixes and release note", hours: 3, risk: 1.2 },
];

// base: plain sum of hours; likely: each task weighted by its risk multiplier.
const base = tasks.reduce((sum, task) => sum + task.hours, 0);
const likely = tasks.reduce((sum, task) => sum + task.hours * task.risk, 0);
// low: the less optimistic of "20% under base" and "4 hours under base".
const low = Math.max(base * 0.8, base - 4);
// high: likely plus a further 35% margin.
const high = likely * 1.35;

// Working hours counted per business day.
const day = 6;
const format = (hours) => `${hours.toFixed(1)}h / ${(hours / day).toFixed(1)}d`;

console.log(`Low:    ${format(low)}`);
console.log(`Likely: ${format(likely)}`);
console.log(`High:   ${format(high)}`);

node estimate-range.mjs

Do not treat the multipliers as truth. risk: 1.4 means "uncertainty is higher here." If you change a multiplier, record the reason in the Issue or PR.
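
To decide which multiplier deserves a written reason first, it can help to rank tasks by the hours their multiplier adds. The sketch below reuses the task list from the script; the ranking itself is an illustrative addition, not part of the original script.

```javascript
// Same tasks as in estimate-range.mjs.
const rankedTasks = [
  { name: "Repo scan and design check", hours: 2, risk: 1.1 },
  { name: "DB migration and schema", hours: 4, risk: 1.4 },
  { name: "API contract and validation", hours: 5, risk: 1.2 },
  { name: "Profile UI update", hours: 6, risk: 1.2 },
  { name: "Tests and fixtures", hours: 5, risk: 1.3 },
  { name: "Review fixes and release note", hours: 3, risk: 1.2 },
]
  // Hours added by the multiplier on top of the base hours.
  .map((task) => ({ name: task.name, addedHours: task.hours * task.risk - task.hours }))
  // Largest contribution first.
  .sort((a, b) => b.addedHours - a.addedHours);

for (const task of rankedTasks) {
  console.log(`${task.addedHours.toFixed(1)}h  ${task.name}`);
}
```

With these numbers the DB migration and the tests add the most hours, so those are the multipliers worth justifying in the Issue first.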

Step 6: Review the estimate critically

Before sending it to the client, have Claude Code look for problems.

You are a critical project estimation reviewer.

Review this estimate before I share it with a client.
Find:
1. hidden scope
2. weak assumptions
3. missing tests
4. missing rollout or rollback work
5. fake precision
6. tasks that should be split

Return findings first.
Then provide a revised low / likely / high range.
Do not make the estimate look more certain than the evidence supports.

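Wording matters. Ask it to "make this read nicely" and you get nice prose. Ask it for a "critical review" and you get useful friction.

Step 7: Write the client summary

Turn the internal notes into a short decision document.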

# Development Estimate Summary

## Scope
- Add optional phone number to user profile.
- Update DB schema, API validation, profile UI, and tests.
- Include release note and manual verification.

## Not included
- SMS notification.
- Mobile app release.
- Historical data backfill.
- CRM integration changes unless confirmed.

## Estimate
- Low: 3 business days
- Likely: 4-5 business days
- High: 7 business days if CRM or migration rollback work expands

## Assumptions
- Phone number is optional.
- Web only.
- Existing users can keep the value empty.

## Risks
- DB rollback plan must be reviewed before implementation.
- Reviewer availability may add one calendar day.

## Next decision
- Confirm whether CRM and mobile app are in scope by 2026-06-05.

Finally, put the estimate back into the GitHub Issue or milestone so it can be compared with the actual result later.

## Estimate
- Low:
- Likely:
- High:
- Confidence: Low / Medium / High

## Scope
- [ ]

## Out of scope
- [ ]

## Assumptions
- [ ]

## Risks
- [ ]

## Verification
- [ ] Unit tests:
- [ ] Integration tests:
- [ ] Manual check:

## Actual result
- Started:
- Merged:
- Extra work found:
- What to adjust next time:

Actual result is where the team learns. It turns the next estimate from opinion into evidence.
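
A small comparison like the one below can make that feedback loop concrete. It is a sketch under assumptions: the range and the actual hours are made-up figures, and `compareActual` is a hypothetical helper.

```javascript
// Estimate range in hours; illustrative figures, not measured data.
const estimate = { low: 21, likely: 31, high: 42 };

// Report where the actual effort landed relative to the estimated range.
function compareActual(range, actualHours) {
  let position = "within range";
  if (actualHours < range.low) position = "below range";
  if (actualHours > range.high) position = "above range";
  return { position, ratioToLikely: actualHours / range.likely };
}

const result = compareActual(estimate, 36); // 36h is a made-up actual
console.log(`${result.position}, actual/likely = ${result.ratioToLikely.toFixed(2)}`);
```

Recording the ratio alongside "Extra work found" gives the next estimate a reference point instead of a fresh guess.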

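Consulting and commercial support

Estimation is a natural topic for consulting, because readers usually need to apply it to their own repository, client communication, and team rules. ClaudeCodeLab can help design estimation templates, Claude Code review prompts, PR checklists, and team rollout rules. If you want to connect this workflow to your own Issue or proposal process, you can describe your current setup through Claude Code training and consulting.

Results from real use

Masa used this workflow to re-estimate a small profile field change. The initial gut feeling was "half a day of UI work." After the repo scan, the scope grew to DB schema, API validation, CSV export, audit logs, tests, and review queueing. The range finally given to the client was typically 4-5 business days, with a high of 7 days. Where Claude Code helped was not in predicting a date but in exposing hidden files and unknowns early.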