# How I use Claude Code: Separation of planning and execution

Original link: https://boristane.com/blog/how-i-use-claude-code/

## A disciplined workflow for AI-assisted coding with Claude

This developer's nine months of success with Claude Code revolve around one core principle: **separate planning from execution**. Unlike the typical “prompt, fix, repeat” approach, this method front-loads thorough preparation to ensure quality and control.

The process starts with **deep research**: Claude analyzes the relevant code and produces a detailed report, saved to a persistent markdown file. The report is *reviewed* to confirm understanding before moving on to **planning**, which is likewise captured in markdown. Crucially, the plan is not finalized immediately; it goes through multiple **annotation cycles** in which the developer adds inline notes directly to the plan, correcting assumptions and injecting domain expertise, and Claude updates it iteratively.

Only *after* a thoroughly validated plan exists is **implementation** triggered with a single prompt that instructs Claude to complete every task and track progress in the plan document. During implementation, corrections stay short and point to existing code to preserve consistency.

The whole flow runs in a single long session, leveraging Claude's strengths (understanding code, proposing solutions) while keeping the developer in control of architecture and product decisions. The key is the annotated plan: a piece of shared, mutable state that keeps both sides aligned and prevents wasted effort.

## Claude Code workflow: planning vs. execution

A recent Hacker News discussion details a workflow for coding with Claude (and similar LLMs) that insists on a strict separation between planning and execution. The core idea is to avoid asking the AI to write code directly and instead guide it through a multi-step process.

That process includes detailed brainstorming, writing high-level and implementation design documents (typically markdown files), and then using agents/subagents for task implementation, spec review, and code review. The end result is a pull request for human review.

Commenters stress the importance of prompting the LLM to understand the code “deeply”, since surface-level reading produces poor results. Some developers, however, find the process *more* laborious than writing the code themselves, especially for experienced programmers. Others worry that the approach becomes unwieldy in large codebases and can waste tokens.

Alternatives raised in the discussion include ticketing systems for task management and tools such as Plannotator for iterating on plans. Ultimately, the thread points to the need for controlled workflows in which the user stays in charge rather than relying entirely on AI tools.

## Original article

I’ve been using Claude Code as my primary development tool for approximately 9 months, and the workflow I’ve settled into is radically different from what most people do with AI coding tools. Most developers type a prompt, sometimes use plan mode, fix the errors, repeat. The more terminally online are stitching together ralph loops, MCPs, gas towns (remember those?), etc. The results in both cases are a mess that completely falls apart for anything non-trivial.

The workflow I’m going to describe has one core principle: never let Claude write code until you’ve reviewed and approved a written plan. This separation of planning and execution is the single most important thing I do. It prevents wasted effort, keeps me in control of architecture decisions, and produces significantly better results, with far lower token usage, than jumping straight to code.

```mermaid
flowchart LR
    R[Research] --> P[Plan]
    P --> A[Annotate]
    A -->|repeat 1-6x| A
    A --> T[Todo List]
    T --> I[Implement]
    I --> F[Feedback & Iterate]
```

## Phase 1: Research

Every meaningful task starts with a deep-read directive. I ask Claude to thoroughly understand the relevant part of the codebase before doing anything else. And I always require the findings to be written into a persistent markdown file, never just a verbal summary in the chat.

read this folder in depth, understand how it works deeply, what it does and all its specificities. when that’s done, write a detailed report of your learnings and findings in research.md

study the notification system in great details, understand the intricacies of it and write a detailed research.md document with everything there is to know about how notifications work

go through the task scheduling flow, understand it deeply and look for potential bugs. there definitely are bugs in the system as it sometimes runs tasks that should have been cancelled. keep researching the flow until you find all the bugs, don’t stop until all the bugs are found. when you’re done, write a detailed report of your findings in research.md

Notice the language: “deeply”, “in great details”, “intricacies”, “go through everything”. This isn’t fluff. Without these words, Claude will skim. It’ll read a file, see what a function does at the signature level, and move on. You need to signal that surface-level reading is not acceptable.

The written artifact (research.md) is critical. It’s not about making Claude do homework. It’s my review surface. I can read it, verify Claude actually understood the system, and correct misunderstandings before any planning happens. If the research is wrong, the plan will be wrong, and the implementation will be wrong. Garbage in, garbage out.

This is the most expensive failure mode with AI-assisted coding, and it’s not wrong syntax or bad logic. It’s implementations that work in isolation but break the surrounding system. A function that ignores an existing caching layer. A migration that doesn’t account for the ORM’s conventions. An API endpoint that duplicates logic that already exists elsewhere. The research phase prevents all of this.
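To make the caching-layer case concrete, here is an illustrative sketch of that failure mode; every name in it (`db`, `userCache`, `getUser`) is hypothetical, not taken from the author's codebase:

```typescript
// Hypothetical types standing in for an existing codebase.
interface User { id: string; name: string }
declare const db: { getUser(id: string): Promise<User> };
declare const userCache: {
  get(id: string): Promise<User | null>;
  set(id: string, user: User): Promise<void>;
};

// What an unresearched change tends to look like: a direct DB read that
// works fine in isolation but silently bypasses the cache.
async function getUserNaive(id: string): Promise<User> {
  return db.getUser(id);
}

// What the research phase surfaces: every other read path goes through
// the cache first, so this one must too.
async function getUser(id: string): Promise<User> {
  const cached = await userCache.get(id);
  if (cached) return cached;
  const user = await db.getUser(id);
  await userCache.set(id, user);
  return user;
}
```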

## Phase 2: Planning

Once I’ve reviewed the research, I ask for a detailed implementation plan in a separate markdown file.

I want to build a new feature <name and description> that extends the system to perform <business outcome>. write a detailed plan.md document outlining how to implement this. include code snippets

the list endpoint should support cursor-based pagination instead of offset. write a detailed plan.md for how to achieve this. read source files before suggesting changes, base the plan on the actual codebase

The generated plan always includes a detailed explanation of the approach, code snippets showing the actual changes, file paths that will be modified, and considerations and trade-offs.
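Taking the pagination prompt above as an example, the code snippet in such a plan might look like the following minimal sketch. The `Item` shape, the `db.query` helper, and the Postgres-style placeholders are assumptions for illustration, not the author's actual code.

```typescript
// Sketch of a plan.md code snippet for cursor-based pagination.
interface Item { id: string; title: string }
declare const db: { query(sql: string, params: unknown[]): Promise<Item[]> };

type Page = { items: Item[]; nextCursor: string | null };

async function listItems(cursor: string | null, limit = 50): Promise<Page> {
  // Keyset pagination: fetch one extra row to learn whether a next page exists.
  const rows = await db.query(
    `SELECT id, title FROM items
     WHERE ($1::text IS NULL OR id > $1)
     ORDER BY id ASC
     LIMIT $2`,
    [cursor, limit + 1],
  );
  const items = rows.slice(0, limit);
  // The cursor is simply the last id served, so the query stays
  // index-friendly no matter how deep the caller pages.
  const nextCursor = rows.length > limit ? items[items.length - 1].id : null;
  return { items, nextCursor };
}
```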

I use my own .md plan files rather than Claude Code’s built-in plan mode. The built-in plan mode sucks. My markdown file gives me full control. I can edit it in my editor, add inline notes, and it persists as a real artifact in the project.

One trick I use constantly: for well-contained features where I’ve seen a good implementation in an open source repo, I’ll share that code as a reference alongside the plan request. If I want to add sortable IDs, I paste the ID generation code from a project that does it well and say “this is how they do sortable IDs, write a plan.md explaining how we can adopt a similar approach.” Claude works dramatically better when it has a concrete reference implementation to work from rather than designing from scratch.
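For the sortable-IDs case, the pasted reference might resemble this ULID-style sketch: a lexicographically sortable timestamp prefix plus a random suffix. It is a hedged illustration, not the code from any particular repo.

```typescript
import { randomBytes } from "node:crypto";

// Crockford-style base32 alphabet: 32 unambiguous characters that sort correctly.
const ALPHABET = "0123456789abcdefghjkmnpqrstvwxyz";

function encodeBase32(value: bigint, length: number): string {
  let out = "";
  for (let i = 0; i < length; i++) {
    out = ALPHABET[Number(value & 31n)] + out; // take 5 bits at a time
    value >>= 5n;
  }
  return out;
}

// 10 chars (50 bits) of millisecond timestamp + 10 random chars: IDs created
// later always sort after IDs created earlier, which is the whole point.
export function sortableId(): string {
  const time = encodeBase32(BigInt(Date.now()), 10);
  const rand = Array.from(randomBytes(10), (b) => ALPHABET[b & 31]).join("");
  return time + rand;
}
```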

But the plan document itself isn’t the interesting part. The interesting part is what happens next.

### The Annotation Cycle

This is the most distinctive part of my workflow, and the part where I add the most value.

```mermaid
flowchart TD
    W[Claude writes plan.md] --> R[I review in my editor]
    R --> N[I add inline notes]
    N --> S[Send Claude back to the document]
    S --> U[Claude updates plan]
    U --> D{Satisfied?}
    D -->|No| R
    D -->|Yes| T[Request todo list]
```

After Claude writes the plan, I open it in my editor and add inline notes directly into the document. These notes correct assumptions, reject approaches, add constraints, or provide domain knowledge that Claude doesn’t have.

The notes vary wildly in length. Sometimes a note is two words: “not optional” next to a parameter Claude marked as optional. Other times it’s a paragraph explaining a business constraint or pasting a code snippet showing the data shape I expect.

Some real examples of notes I’d add:

  • “use drizzle:generate for migrations, not raw SQL” — domain knowledge Claude doesn’t have
  • “no — this should be a PATCH, not a PUT” — correcting a wrong assumption
  • “remove this section entirely, we don’t need caching here” — rejecting a proposed approach
  • “the queue consumer already handles retries, so this retry logic is redundant. remove it and just let it fail” — explaining why something should change
  • “this is wrong, the visibility field needs to be on the list itself, not on individual items. when a list is public, all items are public. restructure the schema section accordingly” — redirecting an entire section of the plan
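The article doesn't show the note syntax itself, but a mid-cycle plan.md fragment might look something like this; the endpoint and wording are invented for illustration, and a blockquote is just one way to make the note stand out:

```markdown
### 3. Update endpoint

Add `PUT /lists/:id` that replaces the list record with the request body.

> NOTE: no — this should be a PATCH, not a PUT. the client only ever sends
> the fields that changed. update the code snippet below accordingly.
```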

Then I send Claude back to the document:

I added a few notes to the document, address all the notes and update the document accordingly. don’t implement yet

This cycle repeats 1 to 6 times. The explicit “don’t implement yet” guard is essential. Without it, Claude will jump to code the moment it thinks the plan is good enough. It’s not good enough until I say it is.

### Why This Works So Well

The markdown file acts as shared mutable state between me and Claude. I can think at my own pace, annotate precisely where something is wrong, and re-engage without losing context. I’m not trying to explain everything in a chat message. I’m pointing at the exact spot in the document where the issue is and writing my correction right there.

This is fundamentally different from trying to steer implementation through chat messages. The plan is a structured, complete specification I can review holistically. A chat conversation is something I’d have to scroll through to reconstruct decisions. The plan wins every time.

Three rounds of “I added notes, update the plan” can transform a generic implementation plan into one that fits perfectly into the existing system. Claude is excellent at understanding code, proposing solutions, and writing implementations. But it doesn’t know my product priorities, my users’ pain points, or the engineering trade-offs I’m willing to make. The annotation cycle is how I inject that judgement.

### The Todo List

Before implementation starts, I always request a granular task breakdown:

add a detailed todo list to the plan, with all the phases and individual tasks necessary to complete the plan - don’t implement yet

This creates a checklist that serves as a progress tracker during implementation. Claude marks items as completed as it goes, so I can glance at the plan at any point and see exactly where things stand. Especially valuable in sessions that run for hours.
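The article doesn't reproduce one of these checklists, but based on the description (phases, individual tasks, marked completed in place), a hypothetical one might look like:

```markdown
## Todo

### Phase 1: Schema
- [x] Add `visibility` column to the lists table
- [x] Generate and review the migration

### Phase 2: API
- [x] Accept `visibility` in the update endpoint
- [ ] Exclude private lists from public queries

### Phase 3: UI
- [ ] Add visibility toggle to the list settings page
```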

## Phase 3: Implementation

When the plan is ready, I issue the implementation command. I’ve refined this into a standard prompt I reuse across sessions:

implement it all. when you’re done with a task or phase, mark it as completed in the plan document. do not stop until all tasks and phases are completed. do not add unnecessary comments or jsdocs, do not use any or unknown types. continuously run typecheck to make sure you’re not introducing new issues.

This single prompt encodes everything that matters:

  • “implement it all”: do everything in the plan, don’t cherry-pick
  • “mark it as completed in the plan document”: the plan is the source of truth for progress
  • “do not stop until all tasks and phases are completed”: don’t pause for confirmation mid-flow
  • “do not add unnecessary comments or jsdocs”: keep the code clean
  • “do not use any or unknown types”: maintain strict typing
  • “continuously run typecheck”: catch problems early, not at the end

I use this exact phrasing (with minor variations) in virtually every implementation session. By the time I say “implement it all,” every decision has been made and validated. The implementation becomes mechanical, not creative. This is deliberate. I want implementation to be boring. The creative work happened in the annotation cycles. Once the plan is right, execution should be straightforward.

Without the planning phase, what typically happens is Claude makes a reasonable-but-wrong assumption early on, builds on top of it for 15 minutes, and then I have to unwind a chain of changes. The “don’t implement yet” guard eliminates this entirely.

### Feedback During Implementation

Once Claude is executing the plan, my role shifts from architect to supervisor. My prompts become dramatically shorter.

```mermaid
flowchart LR
    I[Claude implements] --> R[I review / test]
    R --> C{Correct?}
    C -->|No| F[Terse correction]
    F --> I
    C -->|Yes| N{More tasks?}
    N -->|Yes| I
    N -->|No| D[Done]
```

Where a planning note might be a paragraph, an implementation correction is often a single sentence:

  • “You didn’t implement the deduplicateByTitle function.”
  • “You built the settings page in the main app when it should be in the admin app, move it.”

Claude has the full context of the plan and the ongoing session, so terse corrections are enough.

Frontend work is the most iterative part. I test in the browser and fire off rapid corrections:

  • “wider”
  • “still cropped”
  • “there’s a 2px gap”

For visual issues, I sometimes attach screenshots. A screenshot of a misaligned table communicates the problem faster than describing it.

I also reference existing code constantly:

  • “this table should look exactly like the users table, same header, same pagination, same row density.”

This is far more precise than describing a design from scratch. Most features in a mature codebase are variations on existing patterns. A new settings page should look like the existing settings pages. Pointing to the reference communicates all the implicit requirements without spelling them out. Claude would typically read the reference file(s) before making the correction.
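A hypothetical sketch of why this works: if the users table is built on a shared component, a new table that "should look exactly like the users table" is mostly a matter of reusing it. `DataTable`, `Column`, and the invoice names below are all invented for illustration.

```tsx
import { DataTable, type Column } from "./components/data-table"; // hypothetical shared component

interface Invoice { id: string; customer: string; total: number }

const invoiceColumns: Column<Invoice>[] = [
  { key: "id", header: "Invoice" },
  { key: "customer", header: "Customer" },
  { key: "total", header: "Total" },
];

// Reusing the same component the users table uses means the header,
// pagination, and row density come along without being re-specified.
export function InvoicesTable({ rows }: { rows: Invoice[] }) {
  return <DataTable rows={rows} columns={invoiceColumns} pageSize={25} />;
}
```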

When something goes in the wrong direction, I don’t try to patch it. I revert and re-scope by discarding the git changes:

  • “I reverted everything. Now all I want is to make the list view more minimal — nothing else.”

Narrowing scope after a revert almost always produces better results than trying to incrementally fix a bad approach.

## Staying in the Driver’s Seat

Even though I delegate execution to Claude, I never give it total autonomy over what gets built. I do the vast majority of the active steering in the plan.md documents.

This matters because Claude will sometimes propose solutions that are technically correct but wrong for the project. Maybe the approach is over-engineered, or it changes a public API signature that other parts of the system depend on, or it picks a more complex option when a simpler one would do. I have context about the broader system, the product direction, and the engineering culture that Claude doesn’t.

```mermaid
flowchart TD
    P[Claude proposes changes] --> E[I evaluate each item]
    E --> A[Accept as-is]
    E --> M[Modify approach]
    E --> S[Skip / remove]
    E --> O[Override technical choice]
    A & M & S & O --> R[Refined implementation scope]
```

Cherry-picking from proposals: When Claude identifies multiple issues, I go through them one by one: “for the first one, just use Promise.all, don’t make it overly complicated; for the third one, extract it into a separate function for readability; ignore the fourth and fifth ones, they’re not worth the complexity.” I’m making item-level decisions based on my knowledge of what matters right now.
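For the first of those, the intent behind “just use Promise.all” is the standard simplification: run independent awaits concurrently instead of building bespoke concurrency machinery. A minimal sketch, where `fetchUser` and `fetchOrders` are hypothetical stand-ins for two independent calls:

```typescript
declare function fetchUser(id: string): Promise<unknown>;
declare function fetchOrders(id: string): Promise<unknown>;

// Before: sequential awaits, so the second request needlessly waits on the first.
async function loadPageSequential(id: string) {
  const user = await fetchUser(id);
  const orders = await fetchOrders(id);
  return { user, orders };
}

// After: both independent requests run concurrently, with no extra machinery.
async function loadPage(id: string) {
  const [user, orders] = await Promise.all([fetchUser(id), fetchOrders(id)]);
  return { user, orders };
}
```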

Trimming scope: When the plan includes nice-to-haves, I actively cut them. “remove the download feature from the plan, I don’t want to implement this now.” This prevents scope creep.

Protecting existing interfaces: I set hard constraints when I know something shouldn’t change: “the signatures of these three functions should not change, the caller should adapt, not the library.”

Overriding technical choices: Sometimes I have a specific preference Claude wouldn’t know about: “use this model instead of that one” or “use this library’s built-in method instead of writing a custom one.” Fast, direct overrides.

Claude handles the mechanical execution, while I make the judgement calls. The plan captures the big decisions upfront, and selective guidance handles the smaller ones that emerge during implementation.

## Single Long Sessions

I run research, planning, and implementation in a single long session rather than splitting them across separate sessions. A single session might start with deep-reading a folder, go through three rounds of plan annotation, then run the full implementation, all in one continuous conversation.

I am not seeing the performance degradation everyone talks about once the context window passes 50% usage. If anything, by the time I say “implement it all,” Claude has spent the entire session building understanding: reading files during research, refining its mental model during annotation cycles, absorbing my domain-knowledge corrections.

When the context window fills up, Claude’s auto-compaction maintains enough context to keep going. And the plan document, the persistent artifact, survives compaction in full fidelity. I can point Claude to it at any point in time.

## The Workflow in One Sentence

Read deeply, write a plan, annotate the plan until it’s right, then let Claude execute the whole thing without stopping, checking types along the way.

That’s it. No magic prompts, no elaborate system instructions, no clever hacks. Just a disciplined pipeline that separates thinking from typing. The research prevents Claude from making ignorant changes. The plan prevents it from making wrong changes. The annotation cycle injects my judgement. And the implementation command lets it run without interruption once every decision has been made.

Try my workflow and you’ll wonder how you ever shipped anything with coding agents without an annotated plan document sitting between you and the code.
