The Missing Layer

Original link: https://yagmin.com/blog/the-missing-layer/

## The Limits of "Vibe Coding" and the Need for Process Engineering

The author pushes back against fully automated "vibe coding" (letting AI generate code with minimal human oversight) even though he uses AI tools like Claude and Cursor daily. Powerful as it is, vibe coding introduces a "tolerance flaw," akin to building with a slightly inaccurate ruler, producing pervasive and hard-to-fix technical debt. Repeated attempts to automate corrections become counterproductive, much like the staircase paradox. Spec-driven development (SDD) offers a remedy by prioritizing detailed specifications and human review, but it introduces "documentation debt" as maintaining context becomes difficult. The core problem is not the LLM itself but the disconnect between business decisions and the final code. The author proposes a shift toward **process engineering**, realized through a new "context layer" that dynamically links business knowledge directly to the codebase. This layer would be intelligible to humans and AI alike, eliminating redundant "context engineering" and allowing LLMs to participate in the *entire* development process rather than code generation alone. In essence, bridging the gap between initial planning and final implementation is the key to unlocking AI's real potential in software development.

## The Missing Layer in AI-Assisted Development

A recent Hacker News discussion took up the challenges of scaling AI-assisted software development, prompted by a blog post highlighting the inherent instability of code under repeated modification. The core problem is not just *writing* code but bridging the gap between human intent (product decisions, stakeholder requirements) and machine execution. Many commenters argued that the current workflow (stakeholder meetings, manager summaries, engineers rebuilding context for the LLM) is inefficient and error-prone. Proposed solutions ranged from a formal "process engineering" layer and intermediate languages (for example, YAML for product thinking) to better tooling for managing specs and documentation. A key point: code by itself does not always capture the *why* behind a requirement. The discussion touched on the need for deterministic inputs to LLMs, possibly via structured formats and automatic documentation updates. Ultimately, opinion leaned toward a "missing layer" that gives AI structured context rather than relying on natural language alone, enabling reliable, maintainable software. Many felt that recent gains in LLM accuracy, combined with ever-increasing compute, are rapidly making this approach viable.

Original Text

Vibe coding is too much work.

I don't want to orchestrate a barnyard of agents to generate, refactor, validate, test, and document code I don't see, all while burning through a coal mine of tokens. I don't want to "live dangerously" or YOLO code.

I'm not a Luddite who hates AI or moving fast. I use Claude and Cursor daily; I've founded companies and worked as a startup engineer for over 20 years. But I wouldn't want to vibe code anything I might want to extend. Vibe coding leads you into the uncanny valley of technical debt.

[Video: "It technically works..." (source: YouTube)]

The Magic Ruler

Imagine you find a "magic ruler" that lets you construct any building instantly, just by thinking of it. The ruler has one small flaw: it slightly changes the definition of an inch each time it measures something. It is great for building huts and cottages, but larger buildings are unstable.

Enticed by the potential of your magic ruler, you work to "manage away" any errors it creates. You discover that by referencing the measurements to one another, you can eliminate more than half of the errors. Encouraged, you add failure detection and create a workflow that regenerates broken structures until they work.

Despite these changes, each new project comes back with several obvious flaws that somehow avoided your detection.

A Lack of Tolerance

Despite feeling like a solution is near at hand, this "measurement bug" persists no matter how much tooling you add. Your magic ruler is powerful, but it has a tolerance flaw that permeates every aspect of construction.

Trying to automate fixes is reminiscent of the staircase paradox, where you keep reshaping a problem in a way that seems productive but never actually improves precision. It feels like you are "approaching the limit," but no matter how small you make the steps, the staircase's total length never changes. Similarly, an automated process cannot add information or reduce uncertainty without an external signal. Harnesses around vibe coding produce some signal (the code works or it doesn't), but it is a blunt and limited one.

This is not to say vibe coding has no use. It can produce large structures, but not precise ones. With our magic ruler and a brute force approach we might generate the Great Pyramids, but it takes a different approach to build a cathedral.

Spec-Driven Development

At the other end of the spectrum is spec-driven development (SDD), which comes in many flavors. Roughly, you write a detailed specification that includes context and business concerns, then iterate on an implementation plan with the LLM. Only after this do you generate code, review the output and iterate.

SDD solves the tolerance issue. We review every measurement (the code) and are active through planning and iteration. We take advantage of automation while staying connected to the code.

Spec writing creates a new problem, though, which worsens over time. Specs tend to be verbose. "Context engineering" has a lot of ground to cover: explain the feature, define where the logic goes, detail functional changes, explain the codebase architecture, define rules for validation and testing. To avoid repeating all this work every time, we create shared Markdown files like ARCHITECTURE.md to lessen our load.
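
To make that repeated work concrete, here is a minimal Python sketch of what per-spec context assembly tends to boil down to: concatenating the shared documents and the feature brief into a single prompt. The file names other than ARCHITECTURE.md are hypothetical, and this illustrates the tax, not any particular tool.

```python
from pathlib import Path

# Hypothetical shared context documents. Only ARCHITECTURE.md is named
# in the post; the other file names are illustrative.
SHARED_DOCS = ["ARCHITECTURE.md", "TESTING.md", "CONVENTIONS.md"]

def build_spec_prompt(feature_brief: str, repo_root: str = ".") -> str:
    """Assemble a spec prompt by prepending shared context files.

    Every spec pays this same assembly cost; the shared documents exist
    to avoid restating architecture, validation rules, and conventions
    by hand each time.
    """
    sections = []
    for name in SHARED_DOCS:
        path = Path(repo_root) / name
        if path.exists():  # tolerate missing docs in this sketch
            sections.append(f"## {name}\n{path.read_text()}")
    sections.append(f"## Feature spec\n{feature_brief}")
    return "\n\n".join(sections)
```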

We also need to describe the world beyond the codebase: business concerns, usage patterns, scaling and infrastructure details, design principles, user flows, secondary and third-party systems. Adding more documentation reduces the work for each spec, but it creates a new, subtle tranche of technical debt: documentation debt.

Documentation is hard to maintain because it has no connection to the code. Having an LLM tweak the documentation after every merge is "vibe documenting." If context documentation grows over time, it will eventually become corrupted by subtle contradictions and incompatible details spread across files. Errors in documentation won't directly affect the code, unless we rely on those documents to generate our code.

Software Development Is Not Engineering

There is a more problematic aspect of spec-driven development. Writing specs front-loads most of our effort in the writing and planning stages. We can break up an implementation plan into phases, but there are strong incentives to build out specs holistically.

Let's say your organization wants to add "dark mode" to your site. How does that happen? A site-wide feature usually requires several people to hash out the concerns and explore costs vs. benefits. Does the UI theming support dark mode already? Where will users go to toggle dark mode? What should the default be? If we change the background color we will need to swap the font colors. What about borders and dividers? What about images? What about the company blog, and the FAQ area, which look integrated but run on a different frontend? What about that third-party widget with a static white background?

Multiple stakeholders raise concerns and add details through discussion. Once everything is laid out, someone gathers up those pieces into a coherent plan. Others review it, then the work is broken up into discrete tasks for the engineers to implement.

This is a standard pattern for feature development. And SDD completely undermines this approach.

Let's say you are an engineer at this company and are tasked with implementing all of the dark mode changes. A bundle of tickets was added to your board and they will be the focus of your sprint. You read through the tickets and begin writing your spec.

Many of the tickets are broken up by pages or parts of the site you are meant to update separately, but you know there is an advantage to having the LLM generate a single cohesive plan, since context building is so painful. You repeat most of the concerns brought up during the stakeholder meeting and add more context about the codebase.

You spend almost a day on the spec, but it is worth it, since it saves you so much time. It takes a couple more days to verify all the pages look good and address several small bugs that slipped through the cracks. Specs take a while to get right, but the LLM was able to do most of the heavy lifting.

Except the LLM didn't do most of the heavy lifting. That happened in the stakeholder meeting that took an hour from six different employees. Your manager aggregated the plan and broke it down into tasks, which you then rolled back up and re-contextualized. You spent most of your week building and refining a one-off spec. After the plan was executed by the LLM, you still had to review the code, confirm the output and make final revisions.

LLMs can dramatically speed up feature development, but organizations are getting in their own way by duplicating effort and trying to bifurcate decisions between product and engineering teams.

The Process Gap

We are past the stage of determining what an LLM can do. Now we face more interesting questions: which human decisions do we want to preserve? How can we differentiate informed choices from "best guess" generated answers?

Because LLMs have no memory, we think of "context engineering" as a tax we have to pay to get accurate results. Context is not ephemeral, but we act like it is because there is no connection between the business and the code. What we need is a new approach that bridges this gap.

There is an established pattern in software that keeps repeating every few decades, which is applicable to this problem. When scripting languages emerged, many programmers disparaged them because "script-only" engineers never bothered learning important constructs, like memory allocation or pointers. Scripting languages are much slower and allow less control, but they enable faster development and let engineers spend time on higher-level complexities, like state management.

The solution is to create a new abstraction layer that can serve as a source of truth for both humans and LLMs. This context layer would dynamically link context directly to the source code. If we had this, we would no longer need to perform "context engineering" in every prompt or spec. Instead, organizations could establish a well-structured layer of context that would be understandable by non-engineers and generally useful across the organization. This enables the next stage of LLM development: process engineering.
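
As a rough illustration, here is what one record in such a layer might look like for the dark mode example above, sketched as YAML (a format floated in the Hacker News discussion) loaded from Python. The schema, field names, and file paths are all hypothetical, not a proposal from the post.

```python
import yaml  # PyYAML

# A hypothetical context-layer entry for the dark mode example.
# Field names, paths, and values are illustrative, not a fixed schema.
ENTRY = """
id: dark-mode-default
decision: Dark mode follows the user's OS preference by default
rationale: Settled in the stakeholder meeting; avoids forcing a choice up front
owners: [product, frontend]
code:                            # the direct link to source that plain docs lack
  - src/theme/tokens.css
  - src/settings/appearance.tsx
open_questions:
  - Third-party widget still has a static white background
"""

entry = yaml.safe_load(ENTRY)
print(entry["decision"], "->", ", ".join(entry["code"]))
```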

Process Engineering

When stakeholders have a meeting to design a feature, that is a form of process engineering. Since we lack a context layer to hold that knowledge or connect it to code, someone must manually re-contextualize feature goals into engineering tasks. Engineers use LLMs to generate code at the final step of this process and suffer because they need to rebuild all of the context that has been developed along the way.

Process engineering widens the aperture so LLMs are included in the entire process. Knowledge gained in a meeting can be added directly to the context layer and made available to the LLM. When it is time to generate code, that knowledge is already structured in a way the LLM can consume.

The Context Layer

What are some characteristics of this context layer?

  • It should be understandable and editable by both humans and LLMs.
  • It must directly connect to the code.
  • Changes to the context layer must trigger changes to the code, and vice versa.

This creates a dynamic link between code and process as determined by humans. Context grows iteratively. Knowledge is shared. The abstraction layer becomes a functional artifact in your development process.
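
One half of that dynamic link can be checked mechanically. Below is a minimal sketch, under the assumption that each entry records its linked code paths and when it was last reviewed; a real tool would more likely hook into version control than file timestamps.

```python
import os
from dataclasses import dataclass, field

@dataclass
class ContextEntry:
    """Hypothetical record shape; mirrors the YAML sketch above."""
    id: str
    code_paths: list[str] = field(default_factory=list)
    reviewed_at: float = 0.0  # when a human last confirmed the entry

def stale_entries(entries: list[ContextEntry]):
    """Yield (entry id, path) pairs where linked code changed after review.

    A real implementation would compare git commits instead of mtimes,
    and would also run the other way: an edit to the context layer
    should open tasks against the code it links to.
    """
    for entry in entries:
        for path in entry.code_paths:
            if os.path.exists(path) and os.path.getmtime(path) > entry.reviewed_at:
                yield entry.id, path
                break
```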

How We Get There

Here's the good news: we have all the pieces.

  • For process: we have graphs, user stories, user flows, requirements and guardrails.
  • For code: we have epics, tickets, specs and context documents.

We only need a way to connect the two and keep them up to date. This was an impossible challenge a year ago, but now any modern LLM can make this work.
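
As a sketch of what "connecting the two" might mean in practice: index the links in both directions, so a change on either side can surface its counterparts. The identifiers and in-memory graph here are made up for illustration.

```python
from collections import defaultdict

# Hypothetical links between process artifacts (user stories, flows)
# and code artifacts (tickets, specs, files).
LINKS = [
    ("story:dark-mode", "ticket:DM-1"),
    ("ticket:DM-1", "file:src/theme/tokens.css"),
    ("story:dark-mode", "flow:toggle-appearance"),
]

forward = defaultdict(set)   # process artifact -> code artifact
backward = defaultdict(set)  # code artifact -> process artifact
for src, dst in LINKS:
    forward[src].add(dst)
    backward[dst].add(src)

def context_for(artifact: str) -> set[str]:
    """Everything directly linked to an artifact, in either direction.

    Walking this graph is what would let an LLM start from a file and
    recover the decisions behind it, instead of having that context
    rebuilt by hand in every prompt.
    """
    return forward[artifact] | backward[artifact]

print(context_for("ticket:DM-1"))  # {'story:dark-mode', 'file:src/theme/tokens.css'}
```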

I have a proof of concept that I hope will start the ball rolling. Stay tuned.
