Using AI to write better code more slowly

Original link: https://nolanlawson.com/2026/05/25/using-ai-to-write-better-code-more-slowly/

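Contrary to the common view that AI coding only churns out low-quality “slop” quickly, large language models (LLMs) can be powerful tools in a rigorous, high-quality development process. Rather than using agents only to mass-produce code, developers should use them to rigorously review and refine their work. By running multiple AI models against a pull request, developers can effectively identify bugs, enforce architectural principles, and catch edge-case failures. The author recommends a workflow in which AI identifies and ranks issues, from critical security vulnerabilities to minor code-quality improvements, so that developers can systematically fix, document, and verify the code. Although this approach does not necessarily increase raw development speed or lines of code, it can significantly improve the health of the codebase. By using AI to audit, test, and explain code, developers can move from “slop cannon” output to a methodical, quality-focused process. Ultimately, this slower style of “vibe coding” treats AI as a super-powered assistant for deep, careful engineering, prioritizing long-term maintainability over surface-level velocity.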

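In a recent Hacker News discussion, user “kiba” described using large language models (LLMs) as a programming tutor rather than as a replacement for writing code by hand. They write a flawed first draft themselves, then use an LLM to correct it, which creates a tighter feedback loop and eventually yields working code. However, the user stressed that this process is deliberately slow. They warned that over-reliance on AI-generated code erodes one’s reading comprehension and problem-solving skills. To stay sharp, they prioritize writing code themselves and use AI only to identify and explain errors. Ultimately, the user sees great potential for LLMs in navigating and understanding complex, unfamiliar codebases (such as OpenSCAD), which they had previously found hard to get into.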

Original text

A lot of people seem convinced that the point of AI coding is to write low-quality code as fast as possible. Spew out barely-passable slop, open massive PRs, and merge them unvetted. Ship it!

But the thing is, LLMs are very flexible. And you can use them just as effectively to write high-quality code more slowly.

This statement seems completely obvious to me at this point, and I almost didn’t want to write this post for that reason. But there seem to be enough people convinced that LLMs are only good as slop cannons that it’s worth making the opposite case.

If Mythos taught us anything, it’s that LLM agents are really good at finding bugs. Throw them at a codebase enough times, and they will find so many bugs that you’ll barely know what to do with them.

Like many others, I’ve also found this is true of non-Mythos models – some may be better than others at finding subtle bugs or avoiding false positives, but the fact is that the latest public models from Anthropic and OpenAI are good enough to find plenty of bugs in an unscrutinized codebase.

The problem is not so much finding the bugs as prioritizing and validating them. For this reason I have a Claude skill I adapted from this article’s core insight, which is that the more (and more varied) models you throw at a PR review, the less likely you are to get hallucinations or bogus bugs.

The skill says (paraphrasing):

Run a Claude sub-agent, Codex, and Cursor Bugbot to find bugs in this PR ranked by critical/high/medium/low. Once they’re all done, review their findings, do your own research to rule out false positives, and write a final report.

That’s basically it. You can add your own definition of “bug” if you want – mine has stipulations about the KISS and DRY principles, writing accessible HTML/JSX, using proper indexes for SQL queries, etc.
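To make the “review their findings” step concrete, here is a minimal Python sketch of how reports from several reviewers might be merged and ranked. It is not the skill itself, which is a natural-language prompt. The `Finding` shape, the `merge_findings` function, and the idea of matching findings by file and line are all assumptions made for illustration; real agents return free-form text that would need parsing first.

```python
from collections import defaultdict
from dataclasses import dataclass

# Lower number = more severe, matching the critical/high/medium/low ranking.
SEVERITY_ORDER = {"critical": 0, "high": 1, "medium": 2, "low": 3}


@dataclass(frozen=True)
class Finding:
    reviewer: str  # a label such as "claude" or "codex"; no real API is called
    file: str
    line: int
    severity: str  # one of the keys of SEVERITY_ORDER
    summary: str


def merge_findings(findings):
    """Group findings that point at the same file and line, then rank them."""
    grouped = defaultdict(list)
    for finding in findings:
        grouped[(finding.file, finding.line)].append(finding)

    report = []
    for (file, line), group in grouped.items():
        # Keep the most severe rating any reviewer gave this location.
        worst = min(group, key=lambda f: SEVERITY_ORDER[f.severity])
        reviewers = sorted({f.reviewer for f in group})
        report.append({
            "file": file,
            "line": line,
            "severity": worst.severity,
            "summary": worst.summary,
            "reviewers": reviewers,
            # Assumption: a finding raised by only one reviewer gets extra scrutiny.
            "needs_verification": len(reviewers) < 2,
        })

    # Most severe first; among equals, findings with more agreement first.
    report.sort(key=lambda r: (SEVERITY_ORDER[r["severity"]],
                               -len(r["reviewers"]), r["file"], r["line"]))
    return report


if __name__ == "__main__":
    sample = [
        Finding("claude", "db.py", 42, "high", "query built by string concatenation"),
        Finding("codex", "db.py", 42, "critical", "SQL injection via user_id"),
        Finding("bugbot", "ui.jsx", 7, "low", "misleading comment"),
    ]
    for row in merge_findings(sample):
        print(row["severity"], row["file"], row["line"], row["reviewers"])
```

Agreement between reviewers is only a heuristic here. The skill still asks the final agent to do its own research before a finding goes into the report.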

In my experience, this skill always finds tons of bugs in a PR, and the false positive rate is near zero. It finds so many bugs that you’ll be bored senseless if you try to tackle them all. They’ll range from critical security or correctness bugs to the more mundane medium-level perf bugs to low-level “this comment is misleading”-type bugs.

My typical workflow is:

Have an agent fix all the criticals and highs (with my guidance on the proper solution), then repeat until no criticals/highs
Skip highs/mediums where the juice isn’t worth the squeeze (e.g. 100 lines of code to fix a narrow edge case)
Abandon the PR if it has so many criticals that I realize the whole approach is misguided
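
The workflow above could be sketched as a small triage function. This is only an illustration under stated assumptions: the dict shape, the `worth_fixing` flag (standing in for the human “juice vs. squeeze” judgment), the `abandon_threshold` value, and the choice to defer lows are all hypothetical, not part of the skill.

```python
def triage(findings, abandon_threshold=5):
    """Sort review findings into fix/skip buckets, or recommend abandoning the PR.

    Each finding is a dict with a "severity" key and an optional
    "worth_fixing" key (defaults to True). Both are illustrative assumptions.
    """
    criticals = [f for f in findings if f["severity"] == "critical"]
    if len(criticals) >= abandon_threshold:
        # Too many criticals suggests the whole approach is misguided.
        return {"decision": "abandon", "fix": [], "skip": [], "rerun_review": False}

    fix, skip = [], []
    for finding in findings:
        severity = finding["severity"]
        worth_it = finding.get("worth_fixing", True)
        if severity == "critical":
            fix.append(finding)  # criticals are always fixed
        elif severity in ("high", "medium") and worth_it:
            fix.append(finding)
        else:
            skip.append(finding)  # not worth the effort, or a low (assumed deferred)

    # Repeat the review while any critical or high fix is still pending.
    rerun = any(f["severity"] in ("critical", "high") for f in fix)
    return {"decision": "continue", "fix": fix, "skip": skip, "rerun_review": rerun}


if __name__ == "__main__":
    result = triage([
        {"severity": "critical"},
        {"severity": "high", "worth_fixing": False},
        {"severity": "medium"},
        {"severity": "low"},
    ])
    print(result["decision"], len(result["fix"]), len(result["skip"]))
```

In practice the `worth_fixing` call is the part that stays with the developer; the function only records the outcome of that judgment.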

When I use this technique, I haven’t necessarily seen my velocity go up. If anything, the review process often finds pre-existing bugs, so I end up on a tangential side-quest where I’m writing unit tests and fixing subtle flaws that pre-date the PR. This is the opposite of the “10x productivity” slop-cannon style of development that most people imagine when they think of vibe coding, but I find it very satisfying.

It’s a great way to improve the overall health of the codebase while also teaching you about the odd corners of it. In my experience, the happy-path of a complex architecture is less interesting than its failure modes. And pre-LLMs, this is usually how I got familiar with a codebase anyway: understanding where the assumptions break down, and then getting my hands dirty to fix it.

If you’re the kind of person who is skeptical that AI coding is good for anything, then I doubt this post will persuade you. But if you’re the kind of developer who uses agents to write multi-hundred-line PRs that you barely understand yourself, I’d invite you to slow down a bit and try this other, slower style of “vibe coding.” Ask an agent how your PR works and how it might fail. Have it write Markdown docs with Mermaid charts if necessary. Use Matt Pocock’s /grill-me skill until you understand the entire PR front-to-back.

You might not be more “productive” in terms of raw lines of code. You might burn a ton of tokens just to find out that your entire plan was wrongheaded from the start. But I find this style of coding to be a more super-powered version of the kind of programming I was already trying to do before LLMs: careful, methodical, quality-obsessed, focused on making things better for the next coder.

So take a deep breath, slow down, try this technique, and see if you don’t enjoy writing better code more slowly.
