Cursor's latest "browser experiment" implied success without evidence

Original link: https://embedding-shapes.github.io/cursor-implied-success-without-evidence/

## Cursor's Browser Experiment: A Critical Analysis

Cursor recently published a blog post detailing their attempt to build a web browser from scratch with "autonomous coding agents", which ran for about a week and generated over one million lines of code. The project is hosted on GitHub ([https://github.com/wilsonzlin/fastrender](https://github.com/wilsonzlin/fastrender)) and was meant to test how agentic coding scales to large projects.

Although Cursor claims their system let agents work concurrently on a large codebase with minimal conflicts and "make meaningful progress", independent analysis shows the project **does not run**. The code fails to compile, is riddled with errors and warnings, and its history contains no evidence of a successful build.

Despite framing the work as "building a browser", Cursor never claims it *works*, relying instead on vague language and a single screenshot. Critics argue this creates a misleading impression of success without a reproducible demo, or even a compilable revision of the code.

The core claim, that scaling autonomous coding looks optimistic, remains unproven: the output, however voluminous, lacks the basic functionality of a runnable browser.

## Cursor's Browser Claims Under Scrutiny

Cursor's CEO recently claimed the company built a browser using GPT-5.2, a claim now facing strong pushback. While the company's blog and Twitter posts implied that AI agents successfully generated over one million lines of code, developers found that the project does not actually run.

Analysis suggests this browser "built from scratch" leans heavily on existing codebases, such as the Servo project originally developed by Mozilla. Moreover, attempts to compile the code consistently fail, raising the question of whether a usable browser was ever created, despite the screenshot offered as evidence.

The episode highlights concerns about unverified AI claims and the lack of scrutiny they receive. While tools like Codex and Claude are widely seen as genuinely helpful to developers, many view Cursor's framing as deliberately misleading hype. The discussion underscores the importance of verification and critical evaluation when judging AI-driven "success", and is a reminder that even advanced tools require expertise and do not deliver full automation.

## Original article

On January 14th, 2026, Cursor published a blog post titled "Scaling long-running autonomous coding" (https://cursor.com/blog/scaling-agents).

In the blog post, they talk about their experiments with running "coding agents autonomously for weeks" with the explicit goal of

understand[ing] how far we can push the frontier of agentic coding for projects that typically take human teams months to complete

They talk about some approaches they tried, why they think those failed, and how to address the difficulties.

Finally, they arrived at a point where something "solved most of our coordination problems and let us scale to very large projects without any single agent", which then led to this:

To test this system, we pointed it at an ambitious goal: building a web browser from scratch. The agents ran for close to a week, writing over 1 million lines of code across 1,000 files. You can explore the source code on GitHub (https://github.com/wilsonzlin/fastrender)

This is where things get a bit murky. They claim "Despite the codebase size, new agents can still understand it and make meaningful progress" and "Hundreds of workers run concurrently, pushing to the same branch with minimal conflicts", but they never actually say whether any of it succeeded. Is it actually working? Can you run this browser yourself? We don't know, and they never say explicitly.

After this, they embed a video, and below it they say: "While it might seem like a simple screenshot, building a browser from scratch is extremely difficult."

They never actually claim this browser is working and functional.

```
error: could not compile 'fastrender' (lib) due to 34 previous errors; 94 warnings emitted
```

And if you try to compile it yourself, you'll see that it's very far from being a functional browser, and it seemingly was never able to build at all.
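Reproducing the failure takes only a couple of commands. A minimal sketch, assuming a standard stable Rust toolchain installed via rustup:

```sh
# Clone the repo and attempt a build; assumes rustup + cargo are installed.
git clone https://github.com/wilsonzlin/fastrender
cd fastrender
cargo build   # at the time of writing: fails with dozens of errors and ~100 warnings
cargo check   # type-checks without codegen; surfaces the same diagnostics faster
```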

Multiple recent GitHub Actions runs on main show failures (including workflow-file errors), independent build attempts report dozens of compiler errors, and recent PRs were all merged with failing CI. Going back 100 commits in the Git history from the most recent one, I couldn't find a single commit that compiled cleanly.
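If you want to repeat that history check, a small loop is enough. This is one way to do it (a sketch; your exact commands may differ):

```sh
# Try "cargo check" on each of the last 100 commits and report any that pass.
for sha in $(git rev-list -n 100 origin/main); do
  if git checkout -q "$sha" && cargo check -q >/dev/null 2>&1; then
    echo "compiles: $sha"
  fi
done
git checkout -q main   # return to the branch tip when done
```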

I'm not sure what the "agents" they unleashed on this codebase actually did, but they seemingly never ran "cargo build", let alone "cargo check", because both commands surface dozens of errors (which would surely balloon if you started fixing them) and about 100 warnings. There is an open GitHub issue in their repository about this right now: https://github.com/wilsonzlin/fastrender/issues/98

And diving into the codebase, if the compilation errors didn't make it clear already, makes it obvious to any software developer that none of this is actually engineered code. It is what is typically known as "AI slop": low-quality output that superficially resembles a real codebase but has no intention behind it, and at this point it doesn't even compile.

They later move on to what's next, but there is not a single word about how to run it, what to expect, or how it works. Beyond linking the repo, Cursor's blog post provides no reproducible demo and no known-good revision (tag/release/commit) against which to verify the screenshots.

Regardless of intent, Cursor's blog post creates the impression of a functioning prototype while leaving out the basic reproducibility markers one would expect from such a claim. They never explicitly claim it's actually working, so at least no one can say they lied.

They finish off the article saying:

But the core question, can we scale autonomous coding by throwing more agents at a problem, has a more optimistic answer than we expected.

Which seems like a really strange conclusion to arrive at, when all they've proved so far is that agents can output millions of tokens and still not end up with something that actually works.

A "browser experiment" doesn't need to rival Chrome. A reasonable minimum bar is: it compiles on a supported toolchain and can render a trivial HTML file. Cursor's post doesn’t demonstrate that bar, and current public build attempts fail at this too.

Cursor never says "this browser is production-ready", but they do frame it as "building a web browser from scratch" with "meaningful progress", and then use a screenshot and "extremely difficult" language, all of which gives the impression that the experiment was a success.

The closest they get to implying that this was a success is this part:

Hundreds of agents can work together on a single codebase for weeks, making real progress on ambitious projects.

But this extraordinary claim isn't backed up by any evidence: the blog post never provides a working commit, build instructions, or even a demo that can be reproduced.

I don't think anyone expects this browser to be the next Chrome, but if you claim you've built a browser, it should at the very least demonstrably compile and load a basic HTML file.
