There is an AI code review bubble

Original link: https://www.greptile.com/blog/ai-code-review-bubble

The AI code review space is booming, with players pouring into the market - from tech giants like OpenAI and Anthropic to pure-play startups like Greptile. While many companies claim superior bug-catching ability, the author (from Greptile) argues that performance is subjective and ultimately has to be tested firsthand. Greptile's differentiation lies not in immediate performance claims but in a long-term vision built on **independence, autonomy, and feedback loops**. They believe the code *review* agent should be separate from the code *generation* agent - avoiding a conflict of interest - and focus on the *full automation* of code validation (review, testing, and QA). Unlike competitors building AI *tools* to assist human reviewers, Greptile envisions a future with as little human involvement as possible, positioning itself as a "background automation" or "pipes" product. They have already shipped integrations such as a Claude Code plugin, creating a loop in which a coding agent addresses review feedback until the PR is automatically approved. Choosing a code review tool is a long-term decision, and Greptile aims to prepare users for a future in which AI handles most code validation, freeing human engineers to focus on innovation and high-level design.

Hacker News discussion: "There is an AI code review bubble" (greptile.com) - 12 points, submitted by dakshgupta, 1 comment. personjerry: I don't quite see how this is differentiated from competitors. > Independence - isn't any "agent" that runs on code review rather than code generation "independent"? > Autonomy - most other code review tools can also be automated and integrated. > Loops - you could also ping other code review tools for further reviews… I feel the post actually works against itself, raising these questions without adequately answering them.

Original article

Today, we're in the hard seltzer era of AI code review: everybody's doing them. OpenAI, Anthropic, Cursor, Augment, now Cognition, and even Linear. Of course, there's also the "White Claws" of code review: pure-play code review agents like Greptile (that's us!), CodeRabbit, Macroscope, and a litter of fledgling YC startups. Then there are the adjacent Budweisers of this world:

Amazingly, these two were announced practically within 24 hours of each other.

As the proprietors of an, er, AI code review tool suddenly beset by an avalanche of competition, we're asking ourselves: what makes us different?

How does one differentiate?

Based on our benchmarks, we are uniquely good at catching bugs. However, if all company blogs are to be trusted, this is something we have in common with every other AI code review product. One just has to try a few, and pick the one that feels the best.

Unfortunately, code review performance is ephemeral and subjective, so it is ultimately not a useful basis for telling these agents apart before trying them. It's useless for me to try to convince you that we're the best. You should just try us and make up your own mind.

Instead of telling you how our product is differentiated, I am going to tell you how our viewpoint is differentiated - how we think code review will look in the long-term, and what we're doing today to prepare our customers for that future.

Our thesis can be distilled into three pillars: independence, autonomy, and feedback loops.

We strongly believe that the review agent should be different from the coding agent. We are opinionated on the importance of independent code validation agents. In spite of multiple requests, we have never shipped codegen features. We don't write code; an auditor doesn't prepare the books, a fox doesn't guard the henhouse, and a student doesn't grade their own essays.

Today's agents are better than the median human code reviewer at catching issues and enforcing standards, and they're only getting better. It's clear that in the future a large percentage of code at companies will be auto-approved by the code review agent. In other words, there will be some instances where a human writes a ticket, an agent writes the PR, and another agent validates, approves, and merges it.

This might seem far-fetched, but the alternative is Kafkaesque. A human rubber-stamping code that has been validated by a superintelligent machine is the equivalent of a human sitting silently in the driver's seat of a self-driving car, "supervising".

If agents are approving code, it would be quite absurd, and perhaps non-compliant, to have the agent that wrote the code also approve it. You would only have to let X write a PR and then watch X approve and merge it once to realize the absurdity of what you just did.

Something that Greptiles generally agree on is that everything that can be automated, will be automated.

Code validation - which to us is the combination of review, test, and QA - is an excellent candidate for full automation. It's work that humans don't want to do, and aren't particularly good at. It also requires little in the way of creative expression, unlike programming. In addition, success is generally pretty well-defined. Everyone wants correct, performant, bug-free, secure code.

While some other products have built out great UIs for humans to review code in an AI-assisted paradigm, we have chosen to build for what we consider to be an inevitable future - one where code validation requires vanishingly little human participation. We have no code review UI, and view ourselves as more of a background automation or "pipes" product. Human engineers should be focused on only two things - coming up with brilliant ideas for what should exist, and expressing their vision and taste to agents that do the grunt work of turning it all into clean, performant code.

Not long ago, we released our Claude Code plugin. It can do many things - but most notably, you can ask Claude Code to pull down and address Greptile's comments from the PR. You can ask it to keep going until there are no new comments, waiting a few minutes for a review after each push.
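To make the shape of that loop concrete, here is a minimal sketch in Python against the public GitHub REST API. The repo and PR identifiers, the bot login, the five-minute wait, and the `address_comments` stand-in are all illustrative assumptions, not the plugin's actual internals:

```python
# Minimal sketch of the "keep going until there are no new comments" loop.
# Assumptions (not the plugin's real internals): the repo/PR identifiers,
# the reviewer bot's login, and a GITHUB_TOKEN in the environment.
import os
import subprocess
import time

import requests

OWNER, REPO, PR = "acme", "api-server", 1234   # hypothetical pull request
BOT_LOGIN = "greptile-apps[bot]"               # assumed reviewer bot login
HEADERS = {
    "Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",
    "Accept": "application/vnd.github+json",
}

def review_comments_since(timestamp: str) -> list[dict]:
    """Fetch the PR's review comments, keeping those the bot left after `timestamp`."""
    url = f"https://api.github.com/repos/{OWNER}/{REPO}/pulls/{PR}/comments"
    resp = requests.get(url, headers=HEADERS, timeout=30)
    resp.raise_for_status()
    # ISO 8601 timestamps compare correctly as strings.
    return [c for c in resp.json()
            if c["user"]["login"] == BOT_LOGIN and c["created_at"] > timestamp]

def address_comments(comments: list[dict]) -> None:
    """Stand-in for the coding agent: in practice, hand these to Claude Code."""
    for c in comments:
        print(f"{c['path']}: {c['body']}")

while True:
    last_push = time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime())
    subprocess.run(["git", "push"], check=True)
    time.sleep(300)  # give the reviewer a few minutes to post a fresh review
    comments = review_comments_since(last_push)
    if not comments:
        break        # reviewer has nothing new to say; the PR is ready
    address_comments(comments)
```

In practice the plugin drives this from inside Claude Code rather than from an external script, but the stopping condition is the same: no new reviewer comments after a push.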

This is a step towards the future we're excited about: Human expresses intent, coding agent executes, validation/review agent finds issues and hands them back - kicking off a loop until it approves and merges. If there is ambiguity at any point, the agents Slack the human to clarify.
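As a rough sketch of that orchestration - with hypothetical `coder`, `reviewer`, and `slack` interfaces standing in for real integrations, and only the control flow as the point:

```python
# A sketch of the intent -> codegen -> review loop described above. The agent
# and Slack interfaces are hypothetical stand-ins, not any vendor's API.
from dataclasses import dataclass, field

@dataclass
class ReviewResult:
    approved: bool
    issues: list = field(default_factory=list)     # actionable findings for the coding agent
    questions: list = field(default_factory=list)  # ambiguities only a human can resolve

def run_change(ticket: str, coder, reviewer, slack, max_rounds: int = 10):
    """Drive the loop until the review agent approves and merges, or escalate."""
    patch = coder.implement(ticket)                 # coding agent executes the intent
    for _ in range(max_rounds):
        result = reviewer.review(patch)             # validation agent inspects the diff
        if result.questions:
            # Ambiguity: stop guessing and ask the human over Slack.
            answers = slack.ask(result.questions)
            patch = coder.revise(patch, answers)
            continue
        if result.approved:
            reviewer.approve_and_merge(patch)       # auto-approve: no human rubber stamp
            return patch
        patch = coder.revise(patch, result.issues)  # hand findings back, loop again
    slack.ask([f"Loop hit {max_rounds} rounds without approval for: {ticket!r}"])
```

The escalation hook is the design choice worth noting: the loop only interrupts a human when the reviewer surfaces a genuine ambiguity, not on every round trip.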

The question of how these products differ matters. Unlike IDEs and coding agents, which ostensibly have low switching costs, code review products are harder to rip out, so your decision will very likely turn out to be a long-term one, especially if you're a large company.

We've been around for about as long as AI code review has been around. It has gone from a fringe interest of the world's most adventurous vibecoders to a mainstream product that our enterprise users (including two of the Mag7) often describe as a "no-brainer" purchase.

Yet, our guess on where this goes is about as good as anyone else's. Meanwhile, we'll keep doing what we've always done - trying to make things our users love.
