Anthropic's Project Glasswing sounds necessary to me

Original link: https://simonwillison.net/2026/Apr/7/project-glasswing/

Anthropic has delayed the public release of its new AI model, Claude Mythos, because of its unexpectedly strong cyber-security research capabilities. For now, access is limited to partners selected through "Project Glasswing", a program that aims to proactively find and fix vulnerabilities in widely used systems ahead of any broader release. The Mythos preview has already uncovered thousands of severe flaws, including long-standing issues in major operating systems and browsers, and even a 27-year-old bug in OpenBSD. Security professionals report a marked shift in AI-found vulnerabilities, from inaccurate "AI slop" to genuinely useful and worrying reports. The model's ability to *chain* vulnerabilities together into sophisticated exploits is particularly concerning. Project Glasswing includes $104M in resources to help partners such as AWS, Apple, and Microsoft strengthen their security. Anthropic plans to develop safeguards and integrate them into future Claude Opus models before making Mythos-class capabilities broadly available, prioritizing safe deployment despite the model's potential benefits. This cautious approach acknowledges AI's growing power in vulnerability research, and the preparation the industry as a whole needs to make.

A Hacker News discussion centers on Anthropic's "Project Glasswing", an AI tool for vulnerability scanning, and its implications for cybersecurity. Users debate whether such tools primarily benefit attackers or defenders. One major concern is the accessibility of these powerful AI models: using them to audit code can be expensive (potentially up to $20,000 in tokens), putting small hardware manufacturers at a disadvantage. Some argue that even with this tool, defense remains difficult, because attackers only need to find *one* vulnerability while defenders must secure *every* system. Still, many see AI vulnerability scanning as a positive step, with the potential to catch high-severity issues before deployment. Related tools such as roost.tools were also mentioned as valuable efforts. The discussion highlights the need for technical solutions, and possibly for stricter privacy regulations as well.

Original Article

7th April 2026

Anthropic didn’t release their latest model, Claude Mythos (system card PDF), today. They have instead made it available to a very restricted set of preview partners under their newly announced Project Glasswing.

The model is a general purpose model, similar to Claude Opus 4.6, but Anthropic claim that its cyber-security research abilities are strong enough that they need to give the software industry as a whole time to prepare.

Mythos Preview has already found thousands of high-severity vulnerabilities, including some in every major operating system and web browser. Given the rate of AI progress, it will not be long before such capabilities proliferate, potentially beyond actors who are committed to deploying them safely.

[...]

Project Glasswing partners will receive access to Claude Mythos Preview to find and fix vulnerabilities or weaknesses in their foundational systems—systems that represent a very large portion of the world’s shared cyberattack surface. We anticipate this work will focus on tasks like local vulnerability detection, black box testing of binaries, securing endpoints, and penetration testing of systems.

Saying “our model is too dangerous to release” is a great way to build buzz around a new model, but in this case I expect their caution is warranted.

Just a few days ago (last Friday) I started a new ai-security-research tag on this blog to acknowledge an uptick in credible security professionals sounding the alarm on how good modern LLMs have got at vulnerability research.

Greg Kroah-Hartman of the Linux kernel:

Months ago, we were getting what we called 'AI slop,' AI-generated security reports that were obviously wrong or low quality. It was kind of funny. It didn't really worry us.

Something happened a month ago, and the world switched. Now we have real reports. All open source projects have real reports that are made with AI, but they’re good, and they’re real.

Daniel Stenberg of curl:

The challenge with AI in open source security has transitioned from an AI slop tsunami into more of a ... plain security report tsunami. Less slop but lots of reports. Many of them really good.

I’m spending hours per day on this now. It’s intense.

And Thomas Ptacek published Vulnerability Research Is Cooked, a post inspired by his podcast conversation with Anthropic’s Nicholas Carlini.

Anthropic have a 5 minute talking heads video describing the Glasswing project. Nicholas Carlini appears as one of those talking heads; here's what he said (highlights mine):

It has the ability to chain together vulnerabilities. So what this means is you find two vulnerabilities, either of which doesn’t really get you very much independently. But this model is able to create exploits out of three, four, or sometimes five vulnerabilities that in sequence give you some kind of very sophisticated end outcome. [...]

I’ve found more bugs in the last couple of weeks than I found in the rest of my life combined. We’ve used the model to scan a bunch of open source code, and the thing that we went for first was operating systems, because this is the code that underlies the entire internet infrastructure. For OpenBSD, we found a bug that’s been present for 27 years, where I can send a couple of pieces of data to any OpenBSD server and crash it. On Linux, we found a number of vulnerabilities where as a user with no permissions, I can elevate myself to the administrator by just running some binary on my machine. For each of these bugs, we told the maintainers who actually run the software about them, and they went and fixed them and have deployed the patches so that anyone who runs the software is no longer vulnerable to these attacks.

I found this on the OpenBSD 7.8 errata page:

025: RELIABILITY FIX: March 25, 2026 All architectures

TCP packets with invalid SACK options could crash the kernel.

A source code patch exists which remedies this problem.

I tracked that change down in the GitHub mirror of the OpenBSD CVS repo (apparently they still use CVS!) and found it using git blame:

Screenshot of a Git blame view of C source code around line 2455 showing TCP SACK hole validation logic. Code includes checks using SEQ_GT, SEQ_LT macros on fields like th->th_ack, tp->snd_una, sack.start, sack.end, tp->snd_max, and tp->snd_holes. Most commits are from 25–27 years ago with messages like "more SACK hole validity testin..." and "knf", while one recent commit from 3 weeks ago ("Ignore TCP SACK packets wit...") is highlighted with an orange left border, adding a new guard "if (SEQ_LT(sack.start, tp->snd_una)) continue;"

Sure enough, the surrounding code is from 27 years ago.

I’m not sure which Linux vulnerability Nicholas was describing, but it may have been this NFS one recently covered by Michael Lynch.

There’s enough smoke here that I believe there’s a fire. It’s not surprising to find vulnerabilities in decades-old software, especially given that they’re mostly written in C, but what’s new is that coding agents run by the latest frontier LLMs are proving tirelessly capable at digging up these issues.

I actually thought to myself on Friday that this sounded like an industry-wide reckoning in the making, and that it might warrant a huge investment of time and money to get ahead of the inevitable barrage of vulnerabilities. Project Glasswing incorporates “$100M in usage credits ... as well as $4M in direct donations to open-source security organizations”. Partners include AWS, Apple, Microsoft, Google, and the Linux Foundation. It would be great to see OpenAI involved as well—GPT-5.4 already has a strong reputation for finding security vulnerabilities and they have stronger models on the near horizon.

The bad news for those of us who are not trusted partners is this:

We do not plan to make Claude Mythos Preview generally available, but our eventual goal is to enable our users to safely deploy Mythos-class models at scale—for cybersecurity purposes, but also for the myriad other benefits that such highly capable models will bring. To do so, we need to make progress in developing cybersecurity (and other) safeguards that detect and block the model’s most dangerous outputs. We plan to launch new safeguards with an upcoming Claude Opus model, allowing us to improve and refine them with a model that does not pose the same level of risk as Mythos Preview.

I can live with that. I think the security risks really are credible here, and having extra time for trusted teams to get ahead of them is a reasonable trade-off.
