EvanFlow – A TDD-driven feedback loop for Claude Code

Original link: https://github.com/evanklem/evanflow

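## EvanFlow: a TDD-based development loop driven by Claude Code

EvanFlow is a structured, iterative software development process designed for use with Claude Code. It uses 16 skills and 2 custom subagents to guide a project from brainstorming to implementation. The loop starts with "let's evanflow this" and moves through the brainstorm, plan, execute, TDD, and iterate phases, with human checkpoints at design approval, at plan approval, and after iteration.

Crucially, EvanFlow does *not* put development on autopilot. It pauses before every potential git operation and waits for your explicit direction: no auto-commits and no forced ceremony. The system prioritizes disciplined iteration, focuses on vertical-slice TDD, and builds in checks against common LLM failure modes (hallucination, scope creep, context drift).

For complex tasks, EvanFlow can run coder/overseer agents in parallel and use integration tests to keep the resulting code consistent. It can be installed through the Claude Code plugin marketplace, a CLI, or a manual setup, and it ships with a guardrail against dangerous git commands. EvanFlow aims to be a powerful but controlled assistant for building robust software: a conductor, not an autopilot.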


Original text

A TDD-driven iterative feedback loop for software development with Claude Code.

16 cohesive skills + 2 custom subagents walk an idea from brainstorm through implementation, with checkpoints throughout where you stay in control. One entry point: say "let's evanflow this" and the orchestrator runs the loop.

brainstorm → plan → execute (sequential or parallel) → tdd → iterate → STOP

The loop is a conductor, not an autopilot: real checkpoints at design approval, plan approval, and after iteration. The agent stops short of every git operation and waits for your direction. No auto-commits. No forced ceremony. No "must invoke a skill" tax.


The recommended path — Claude Code's plugin marketplace:

/plugin marketplace add evanklem/evanflow
/plugin install evanflow@evanflow

Restart, then try:

"Let's evanflow this — I want to add a small feature that does X."

evanflow-go fires and walks the loop. The git-guardrails hook auto-activates with the plugin (no settings.json edit needed). Skills appear under the evanflow: namespace (e.g., /evanflow:evanflow-go).

See Installation below for two alternative paths.


What Makes It a Feedback Loop

The loop is built around discipline that compounds across iterations, not single-shot generation. Every step has a checkpoint that gates the next:

  • Brainstorm clarifies intent, proposes 2–3 approaches with embedded grill (stress-test) → you approve the design
  • Plan maps file structure first (deep modules, deletion test) → you approve the plan
  • Execute runs task-by-task with inline verification → blockers stop the loop and surface to you
  • TDD is vertical-slice only: one failing test → minimal impl → repeat. Tests verify behavior through public interfaces, so they survive refactors
  • Iterate re-reads the diff with fresh eyes, runs quality checks, screenshots UI changes, and runs against a Five Failure Modes checklist (hallucinated actions, scope creep, cascading errors, context loss, tool misuse). Hard cap of 5 iterations
  • STOP. Report. Await your direction. The agent never auto-commits, never auto-stages, never proposes a PR
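
The vertical-slice TDD step can be illustrated with a minimal sketch. The `Cart` class and its methods are hypothetical names invented for this example, not part of EvanFlow. The point is that each test exercises only the public interface, so the internal storage can be refactored without breaking the tests.

```python
class Cart:
    """Hypothetical module under test; only add() and total() are public."""

    def __init__(self):
        # Internal detail: free to change (dict, dataclass, ...) in a refactor.
        self._lines = []

    def add(self, unit_price_cents, quantity=1):
        self._lines.append((unit_price_cents, quantity))

    def total(self):
        return sum(price * qty for price, qty in self._lines)


def test_total_sums_price_times_quantity():
    # Slice 1: written first and seen failing before total() existed.
    cart = Cart()
    cart.add(250, quantity=2)
    cart.add(100)
    assert cart.total() == 600


def test_empty_cart_totals_zero():
    # Slice 2: the next behavior, added only after slice 1 passed.
    assert Cart().total() == 0


test_total_sums_price_times_quantity()
test_empty_cart_totals_zero()
```

Neither test reads `_lines`, which is what lets them survive a change of internal representation.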

For plans with 3+ truly independent units, the loop forks into a parallel coder/overseer orchestration: one coder per unit (using vertical-slice TDD with a RED checkpoint), one overseer per coder (read-only review subagent that can't modify code), plus an integration overseer that runs named integration tests at every touchpoint. The integration tests are the executable contract — interfaces can't drift if both sides have to satisfy the same passing test.
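
As a rough illustration of an integration test acting as the executable contract, the sketch below assumes two independently built units, a hypothetical `encode_event` (one coder) and `decode_event` (another coder). The function names and the event shape are assumptions made for this example. Both sides must keep the same named test passing, so neither can change the wire format alone.

```python
import json


def encode_event(name, payload):
    """Unit A (hypothetical): serialize an event for the touchpoint."""
    return json.dumps({"name": name, "payload": payload}, sort_keys=True)


def decode_event(raw):
    """Unit B (hypothetical): parse what unit A produced."""
    data = json.loads(raw)
    return data["name"], data["payload"]


def test_touchpoint_event_round_trip():
    # Named integration test: the contract both units have to satisfy.
    raw = encode_event("user.created", {"id": 7})
    assert decode_event(raw) == ("user.created", {"id": 7})


test_touchpoint_event_round_trip()
```

If unit A renamed `payload` to `data`, this test would fail immediately, before the drift reached either unit's own tests.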

Hard Rules Baked Into the Loop

Several rules come from 2025-2026 industry research on agentic coding failure modes and are baked into every skill:

  • Never invent values — file paths, env vars, IDs, function names, library APIs. If unsure, the agent stops and asks. (Action-hallucination is the most dangerous agent failure.)
  • Assertion-correctness warning — research shows 62% of LLM-generated test assertions are wrong. Both evanflow-tdd and the overseer review explicitly check whether a one-character bug in the implementation would still let the assertion pass.
  • Watch for context drift: evanflow-compact triggers when symptoms appear (re-asking established questions, contradicting earlier decisions). Industry data: ~65% of enterprise AI coding failures trace to context drift, not raw token exhaustion.
  • Five Failure Modes pass in iterate + overseer review — explicit check against hallucinated actions, scope creep, cascading errors, context loss, tool misuse.
  • No skill tax — ad-hoc questions don't require a skill invocation. Skills are tools, not a tollbooth.
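
The assertion-correctness check above can be made concrete with a small sketch. It is not taken from the EvanFlow skills; `is_adult` and the helper names are hypothetical. The idea is to ask whether a test would still pass against a version of the code with a one-character bug.

```python
def is_adult(age):
    return age >= 18


def is_adult_mutant(age):
    """Same function with a one-character bug: '>=' became '>'."""
    return age > 18


def weak_assertion(fn):
    # Passes for both versions, so it cannot detect the bug.
    return fn(30) is True


def strong_assertions(fn):
    # Pins the boundary on both sides, where off-by-one bugs live.
    return fn(18) is True and fn(17) is False


assert weak_assertion(is_adult) and weak_assertion(is_adult_mutant)
assert strong_assertions(is_adult)
assert not strong_assertions(is_adult_mutant)
```

A test built only from `weak_assertion` gives no signal about the boundary, which is the kind of assertion a review pass would flag.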

| Skill | Purpose |
|---|---|
| evanflow-brainstorming | Clarify intent, propose 2–3 approaches with embedded grill (stress-test). Mockup quick-mode for visual-only requests. |
| evanflow-writing-plans | File structure first, bite-sized tasks, embedded grill. Step 2.5 offers evanflow-coder-overseer if the plan is parallelizable. |
| evanflow-executing-plans | Task-by-task with inline verification. Step 0 re-offers parallel path. Hands off to iterate, then STOPS. |
| evanflow-tdd | Vertical-slice TDD. One test → one impl → repeat. Behavior through public interface. Assertion-correctness warning. |
| evanflow-iterate | Self-review loop after implementation. Re-read diff, fix issues, run quality checks, screenshot UI (via headless Chromium). Five Failure Modes checklist. Hard cap of 5 iterations. |

Special-Purpose (8 skills)

| Skill | Purpose |
|---|---|
| evanflow-go | Single entry point. Say "let's evanflow this" and it walks the whole loop. |
| evanflow-glossary | Extract canonical domain terms into CONTEXT.md. Flag ambiguities and synonyms. |
| evanflow-improve-architecture | Surface refactor opportunities via the deletion test + deep-modules vocabulary. |
| evanflow-design-interface | "Design it twice": spawn 3+ parallel sub-agents with radically different constraints, compare on depth/simplicity/efficiency. |
| evanflow-debug | Root-cause discipline. Hypothesis stated explicitly, embedded grill before fixing, failing test first. |
| evanflow-review | Both halves of code review (giving + receiving). Don't capitulate to feedback you can't justify. |
| evanflow-prd | Synthesize a PRD from existing context. For substantial new features. |
| evanflow-qa | Conversational bug discovery → issue draft. Asks before filing. |

| Skill | Purpose |
|---|---|
| evanflow-compact | Long-session context management. Strategies for proactive summarization at clean boundaries. Drift symptoms checklist. |

| Skill | Purpose |
|---|---|
| evanflow | The index. Shared vocabulary + when to invoke each evanflow-* skill. |

In agents/ — invoked via Agent tool with subagent_type: parameter:

| Subagent | Tool restrictions | Purpose |
|---|---|---|
| evanflow-coder | Read, Edit, Write, Glob, Grep, Bash, TodoWrite | Implementation subagent for evanflow-coder-overseer. Tools + system prompt prevent git ops, out-of-scope edits, value hallucination. |
| evanflow-overseer | Read, Grep, Glob (no Edit/Write/Bash) | Read-only review subagent. Tools physically enforce "report findings, never fix." |

hooks/block-dangerous-git.sh — PreToolUse hook that blocks destructive git ops (git push, git reset --hard, git clean -f, git branch -D, git checkout ., git restore .). Auto-activates with the plugin install path.
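
To show the general shape of such a guardrail, here is a Python sketch of the matching logic. It is not the bundled script, which is a Bash script that uses jq. The regular expressions are illustrative guesses for the operations listed above. The sketch also assumes that the hook receives the Bash command under `tool_input.command` in its JSON input and that exit code 2 blocks the call. Both assumptions should be checked against the Claude Code hooks documentation.

```python
import json
import re
import sys

# Illustrative patterns only; the bundled script's real patterns may differ.
DANGEROUS_PATTERNS = [
    r"\bgit\s+push\b",
    r"\bgit\s+reset\s+--hard\b",
    r"\bgit\s+clean\s+-\w*f",
    r"\bgit\s+branch\s+-D\b",
    r"\bgit\s+checkout\s+\.(\s|$)",
    r"\bgit\s+restore\s+\.(\s|$)",
]


def find_dangerous_pattern(command):
    """Return the first pattern the command matches, or None."""
    for pattern in DANGEROUS_PATTERNS:
        if re.search(pattern, command):
            return pattern
    return None


def exit_code_for(raw_event):
    """Map the hook's JSON input to an exit code (assumed: 2 blocks, 0 allows)."""
    event = json.loads(raw_event)
    command = event.get("tool_input", {}).get("command", "")
    pattern = find_dangerous_pattern(command)
    if pattern is None:
        return 0
    print(f"BLOCKED: {command!r} matches dangerous pattern {pattern!r}", file=sys.stderr)
    return 2
```

A real hook entry point would end with something like `sys.exit(exit_code_for(sys.stdin.read()))`. It is left out here so the sketch can be imported and tested without reading stdin.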


Hard Rules (apply to every skill)

  1. Never auto-commit, never auto-stage, never auto-finish. Every git write op requires you to explicitly ask in the current turn.
  2. Never invent values. File paths, env vars, IDs, function names, library APIs — if unsure, the agent stops and asks.
  3. No skill tax. Ad-hoc questions don't require a skill invocation. Skills are tools, not a tollbooth.
  4. No forced spec/plan paths. Files live where you want them.
  5. Verify before claiming done. Quality checks (typecheck, lint, test) run before any "done" report.

  • Claude Code (any recent version)
  • Bash — for the bundled hook script (Linux, macOS, or Windows + WSL)
  • jq — used by the hook script to parse Claude's JSON tool input. Install via apt install jq, brew install jq, or your platform's package manager. If jq is missing, the guardrail hook fails silently and dangerous git ops are NOT blocked.
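
Because a missing jq makes the hook fail silently, a preflight check can be worth running once after install. The sketch below is an assumption-level helper, not something EvanFlow ships. `missing_tools` is a hypothetical name, and the `which` parameter exists only so the lookup can be replaced in tests.

```python
import shutil


def missing_tools(required=("bash", "jq"), which=shutil.which):
    """Return the required executables that cannot be found on PATH."""
    return [tool for tool in required if which(tool) is None]


if __name__ == "__main__":
    for tool in missing_tools():
        print(f"warning: {tool} not found on PATH; the guardrail hook may not block anything")
```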

Optional but recommended:

  • chromium or google-chrome — for evanflow-iterate's visual verification of UI changes (chromium --headless --screenshot=...). Falls back gracefully if missing — the skill flags it and asks you to verify visually.
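
The fallback behavior described above might look roughly like the following. This is a sketch, not the skill's actual logic. `screenshot_command` and `BROWSER_CANDIDATES` are hypothetical names, and the flags are the `--headless --screenshot=` form quoted above.

```python
import shutil

# Tried in order; both names come from the requirement above.
BROWSER_CANDIDATES = ("chromium", "google-chrome")


def screenshot_command(url, output_path, which=shutil.which):
    """Build a headless screenshot command, or return None if no browser is found."""
    for name in BROWSER_CANDIDATES:
        if which(name):
            return [name, "--headless", f"--screenshot={output_path}", url]
    return None  # caller should ask the user to verify the UI visually
```

The returned list can be passed to `subprocess.run`. A `None` result corresponds to the case where the skill flags the missing browser and asks you to check the UI yourself.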

Installation

Three paths, in priority order. All three end with the same skill set in your .claude/skills/. The plugin path additionally auto-wires the guardrail hook.

Path 1 — Claude Code Plugin Marketplace (recommended)

This is the cleanest install. Skills, agents, AND the guardrail hook all activate automatically.

/plugin marketplace add evanklem/evanflow
/plugin install evanflow@evanflow

Restart Claude Code (or /reload-plugins). Skills appear namespaced as /evanflow:evanflow-go, /evanflow:evanflow-tdd, etc. Auto-invocation via "let's evanflow this" still works regardless of namespace.

To uninstall: /plugin uninstall evanflow@evanflow.

Path 2 — npx skills@latest add CLI

Works against any GitHub repo with SKILL.md-shaped folders. Installs skills only — does not install the guardrail hook or custom subagents (you'd add those manually if you want them).

# Install all 16 skills at once
npx skills@latest add evanklem/evanflow -s '*' -y

# Or install individual skills
npx skills@latest add evanklem/evanflow/evanflow-go
npx skills@latest add evanklem/evanflow/evanflow-tdd
# ...

This places skills under ~/.claude/skills/ (global) or .claude/skills/ (project, auto-detected).

Path 3 (manual copy)

For users who want full control and no CLI dependencies.

git clone https://github.com/evanklem/evanflow.git
cd evanflow

# Skills (project-level — adjust to ~/.claude/skills/ for global)
mkdir -p .claude/skills
cp -r skills/* .claude/skills/

# Agents (custom subagents used by evanflow-coder-overseer)
mkdir -p .claude/agents
cp agents/*.md .claude/agents/

# Git guardrails hook (optional but recommended)
mkdir -p .claude/hooks
cp hooks/block-dangerous-git.sh .claude/hooks/
chmod +x .claude/hooks/block-dangerous-git.sh

Then register the hook in your .claude/settings.json:

{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          {
            "type": "command",
            "command": "\"$CLAUDE_PROJECT_DIR\"/.claude/hooks/block-dangerous-git.sh"
          }
        ]
      }
    ]
  }
}
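
If .claude/settings.json already contains other settings or hooks, pasting the block above by hand risks overwriting them. One possible approach is to merge the entry programmatically. The helper below is a sketch with hypothetical names (`add_guardrail_hook`, `HOOK_COMMAND`). It assumes only the JSON structure shown above.

```python
import json

# Same command string as in the settings.json example above.
HOOK_COMMAND = '"$CLAUDE_PROJECT_DIR"/.claude/hooks/block-dangerous-git.sh'


def add_guardrail_hook(settings):
    """Return a copy of settings with the Bash PreToolUse hook registered once."""
    merged = json.loads(json.dumps(settings))  # deep copy of plain JSON data
    entries = merged.setdefault("hooks", {}).setdefault("PreToolUse", [])
    new_hook = {"type": "command", "command": HOOK_COMMAND}
    for entry in entries:
        if entry.get("matcher") == "Bash":
            hooks = entry.setdefault("hooks", [])
            if not any(h.get("command") == HOOK_COMMAND for h in hooks):
                hooks.append(new_hook)
            return merged
    entries.append({"matcher": "Bash", "hooks": [new_hook]})
    return merged
```

To apply it, load the existing file with `json.load`, pass the result through `add_guardrail_hook`, and write it back with `json.dump(..., indent=2)`. Calling it twice leaves a single hook entry.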

Optionally, paste examples/CLAUDE.md.snippet into your project's CLAUDE.md to brief Claude about EvanFlow's conventions.

Restart Claude Code. Try saying:

"Let's evanflow this — I want to add a small feature that does X."

evanflow-go should fire and walk you through the loop. To verify the guardrail hook (paths 1 and 3 only): try git reset --hard HEAD from the Bash tool — it should be blocked with "BLOCKED: ... matches dangerous pattern".


Every skill has a clear structure with a ## Hard Rules section. To adapt to your project:

  • Replace <frontend> and <backend> placeholders in skills like evanflow-writing-plans with your actual paths if you find yourself answering the same question repeatedly.
  • Document your project's quality checks in your CLAUDE.md — exact typecheck, lint, and test commands. The skills reference these abstractly.
  • Adapt the visual verification step in evanflow-iterate if you don't have chromium available — substitute google-chrome --headless or another tool.
  • Edit the cohesion contract template in evanflow-coder-overseer to match your project's conventions (your authentication middleware name, your DB write helper, etc.).

The skills are designed to be edited. Treat them as starting points, not gospel.

If you fork to make a vendor-specific variant (your-name-flow), great — that's the spirit.


How EvanFlow Works End-to-End

You say: "let's evanflow this — I want to add a feature that does X"
           │
           ▼
       evanflow-go (the conductor)
           │
           ├─ Phase 0: Restate idea, scope check
           ├─ Phase 1: evanflow-brainstorming (CHECKPOINT: design approval)
           ├─ Phase 2: evanflow-writing-plans (CHECKPOINT: plan approval)
           │            └─ Step 2.5: parallelization check
           ├─ Phase 3: evanflow-executing-plans (sequential)
           │            OR
           │            evanflow-coder-overseer (parallel)
           │              ├─ contract with named tests + integration tests
           │              ├─ RED checkpoint (all coders write failing tests, orchestrator verifies)
           │              ├─ GREEN phase (vertical-slice TDD per coder)
           │              ├─ per-coder overseers (review, never fix)
           │              └─ integration overseer (runs touchpoint tests)
           ├─ Phase 4: evanflow-iterate (5x cap, Five Failure Modes pass)
           └─ Phase 5: STOP. Report what was done. Await your direction.

Cross-cutting: evanflow-compact runs at clean boundaries when context gets heavy.

Special-purpose skills (evanflow-debug, evanflow-improve-architecture, evanflow-design-interface, evanflow-glossary, evanflow-prd, evanflow-qa, evanflow-review) are pulled in mid-flow when relevant.
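
The control flow in the diagram can be modeled as a small state machine. The sketch below is a simplified model for illustration, not how evanflow-go is implemented. The skills are prompt documents, and the `approve` and `review_pass` callbacks are hypothetical stand-ins for the human checkpoints and for one iterate pass.

```python
MAX_ITERATIONS = 5  # the hard cap stated above


def run_loop(approve, review_pass):
    """Walk the phases and return a log of what happened.

    approve(checkpoint) models the human decision at a checkpoint.
    review_pass(n) models iterate pass n and returns True when nothing is left to fix.
    """
    log = ["restate idea, scope check"]
    for phase, checkpoint in (("brainstorm", "design approval"), ("plan", "plan approval")):
        log.append(phase)
        if not approve(checkpoint):
            log.append(f"STOP: {checkpoint} not given")
            return log
    log.append("execute")
    for n in range(1, MAX_ITERATIONS + 1):
        log.append(f"iterate pass {n}")
        if review_pass(n):
            break
    log.append("STOP: report and await direction (no git operations)")
    return log
```

Every path through the function ends in a STOP entry, and none of them performs a git operation. That mirrors the conductor-not-autopilot rule.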


.
├── .claude-plugin/
│   ├── plugin.json          — plugin identity (name, description, version)
│   └── marketplace.json     — marketplace manifest (lists EvanFlow as one bundled plugin)
├── skills/                  — 16 SKILL.md folders
│   ├── evanflow/
│   ├── evanflow-go/
│   ├── evanflow-brainstorming/
│   ... (etc)
├── agents/                  — 2 custom subagent definitions
│   ├── evanflow-coder.md
│   └── evanflow-overseer.md
├── hooks/
│   ├── hooks.json           — auto-activated when plugin installs
│   └── block-dangerous-git.sh
├── examples/
│   └── CLAUDE.md.snippet    — for the manual-copy install path
├── docs/
│   └── skills-audit.md      — verdict on all 38 candidate skills considered
├── README.md
└── LICENSE                  — MIT

EvanFlow synthesizes ideas from:

  • mattpocock/skills by Matt Pocock — vertical-slice TDD, deep modules, deletion test, design-it-twice, ubiquitous language, grill-me, caveman.
  • superpowers by Jesse Vincent — verification-before-completion, code review patterns, parallel agent dispatch, finishing-a-development-branch (the 4-option presentation).
  • git-guardrails-claude-code — bundled in hooks/ (script copied verbatim). Original by Matt Pocock.

Industry research informing the design:


MIT. See LICENSE.


Issues and pull requests welcome. EvanFlow is opinionated by design — proposals to add ceremony or auto-actions will be politely declined. Proposals to further reduce ceremony, sharpen rules, or add evidence-backed improvements are very welcome.
