Project Glasswing: Securing critical software for the AI era

Original link: https://www.anthropic.com/glasswing

## Claude Mythos Preview: a turning point for cybersecurity

Anthropic's Claude Mythos Preview is a highly advanced AI model that has demonstrated a breakthrough ability to autonomously identify and exploit thousands of previously unknown (zero-day) vulnerabilities in major operating systems, web browsers, and core software such as OpenBSD, FFmpeg, and the Linux kernel. These flaws, some of which had existed for decades, were found *without* human guidance, marking a major leap in AI-driven security testing.

This capability is prompting urgent calls for a cybersecurity overhaul. Experts from Cisco, AWS, Microsoft, CrowdStrike, JPMorganChase, and Google stress that AI has fundamentally changed the threat landscape, accelerating vulnerability discovery *for attackers and defenders alike*.

Anthropic is launching Project Glasswing, giving partners access to Mythos Preview for vulnerability detection and remediation, backed by $100 million in usage credits. The project aims to share lessons learned and to develop new security standards around vulnerability disclosure, software updates, and secure development practices. While Mythos Preview will not be released publicly, Anthropic plans to integrate its safeguards into future Claude models, with the eventual goal of safely deploying comparable capabilities at scale. The initiative underscores the urgent need for proactive, AI-driven security and for collaboration between the technology industry and governments.

Anthropic has launched **Project Glasswing**, which uses its new **Claude Mythos Preview** AI model to proactively identify security vulnerabilities in critical software. Early results are encouraging: Mythos Preview has already found thousands of high-severity flaws in major operating systems and browsers.

After an initial research phase funded by $100 million in model usage credits, the model will be available to participants at $25/$125 per million input/output tokens, pricing it in line with other frontier models.

Discussion on Hacker News shows skepticism toward Anthropic's safety and ethics claims, with some users accusing the company of hypocrisy, or even of actively opposing open-source AI development through regulatory lobbying. Others argue that simply prompting an AI model to focus on security is a straightforward alternative.

Original article

Identifying vulnerabilities and exploits with Claude Mythos Preview

Over the past few weeks, we have used Claude Mythos Preview to identify thousands of zero-day vulnerabilities (that is, flaws that were previously unknown to the software’s developers), many of them critical, in every major operating system and every major web browser, along with a range of other important pieces of software.

In a post on our Frontier Red Team blog, we provide technical details for a subset of these vulnerabilities that have already been patched and, in some cases, the ways that Mythos Preview found to exploit them. It was able to identify nearly all of these vulnerabilities—and develop many related exploits—entirely autonomously, without any human steering. The following are three examples:

  • Mythos Preview found a 27-year-old vulnerability in OpenBSD—which has a reputation as one of the most security-hardened operating systems in the world and is used to run firewalls and other critical infrastructure. The vulnerability allowed an attacker to remotely crash any machine running the operating system just by connecting to it;
  • It also discovered a 16-year-old vulnerability in FFmpeg—which is used by innumerable pieces of software to encode and decode video—in a line of code that automated testing tools had hit five million times without ever catching the problem;
  • The model autonomously found and chained together several vulnerabilities in the Linux kernel—the software that runs most of the world’s servers—to allow an attacker to escalate from ordinary user access to complete control of the machine.
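The FFmpeg example illustrates a general limitation of automated testing: executing a line is not the same as exercising its failure mode. A hypothetical sketch of how a line can be "hit" millions of times while its bug triggers only for one rare input (illustrative Python, unrelated to the actual FFmpeg flaw):

```python
# Hypothetical illustration: the arithmetic below runs on every call,
# so coverage tools mark the line as thoroughly tested -- yet it only
# misbehaves for one narrow input shape.

def parse_length(header: bytes) -> int:
    # Stand-in for a decoder reading a 2-byte big-endian length field.
    n = header[0] * 256 + header[1]
    if n >= 65535:
        # Stand-in for the crash: only 0xFFFF trips it, an input a
        # random fuzzer may never happen to generate.
        raise ValueError("length overflows a downstream 16-bit buffer")
    return n
```

Coverage metrics would report this line as exercised on every run; only a model (or fuzzer) that reasons about the boundary value finds the defect.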

We have reported the above vulnerabilities to the maintainers of the relevant software, and they have all now been patched. For many other vulnerabilities, we are providing a cryptographic hash of the details today (see the Red Team blog), and we will reveal the specifics after a fix is in place.
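Publishing a hash now and the details later is a standard cryptographic commitment: the digest proves the findings existed at announcement time without revealing them. A minimal sketch of such a commit-and-reveal scheme (illustrative only; Anthropic has not published its exact format):

```python
import hashlib
import secrets

def commit(details: bytes) -> tuple[str, bytes]:
    # Prepend a random nonce so the digest cannot be brute-forced by
    # guessing likely vulnerability descriptions.
    nonce = secrets.token_bytes(32)
    digest = hashlib.sha256(nonce + details).hexdigest()
    return digest, nonce  # publish digest now; keep nonce and details

def verify(digest: str, nonce: bytes, details: bytes) -> bool:
    # After the patch ships, anyone can check the revealed details
    # against the digest published on day one.
    return hashlib.sha256(nonce + details).hexdigest() == digest
```

The nonce matters: without it, a low-entropy message (e.g. a short CVE description) could be recovered by hashing candidate strings until one matches.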

Evaluation benchmarks such as CyberGym reinforce the substantial difference between Mythos Preview and our next-best model, Claude Opus 4.6:

In addition to our own work, many of our partners have already been using Claude Mythos Preview for several weeks. This is what they’ve found:

The powerful cyber capabilities of Claude Mythos Preview are a result of its strong agentic coding and reasoning skills. For example, as shown in the evaluation results below, the model has the highest scores of any model yet developed on a variety of software coding tasks.

More information on the model’s capabilities, its safety properties, and its general characteristics can be found in the Claude Mythos Preview system card.

We do not plan to make Claude Mythos Preview generally available, but our eventual goal is to enable our users to safely deploy Mythos-class models at scale—for cybersecurity purposes, but also for the myriad other benefits that such highly capable models will bring. To do so, we need to make progress in developing cybersecurity (and other) safeguards that detect and block the model’s most dangerous outputs. We plan to launch new safeguards with an upcoming Claude Opus model, allowing us to improve and refine them with a model that does not pose the same level of risk as Mythos Preview.

Plans for Project Glasswing

Today’s announcement is the beginning of a longer-term effort. To be successful, it will require broad involvement from across the technology industry and beyond.

Project Glasswing partners will receive access to Claude Mythos Preview to find and fix vulnerabilities or weaknesses in their foundational systems—systems that represent a very large portion of the world’s shared cyberattack surface. We anticipate this work will focus on tasks like local vulnerability detection, black box testing of binaries, securing endpoints, and penetration testing of systems.

Anthropic’s commitment of $100M in model usage credits to Project Glasswing and additional participants will cover substantial usage throughout this research preview. Afterward, Claude Mythos Preview will be available to participants at $25/$125 per million input/output tokens (participants can access the model on the Claude API, Amazon Bedrock, Google Cloud’s Vertex AI, and Microsoft Foundry).
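At those rates, usage cost follows directly from token counts. A quick sketch (the $25/$125-per-million prices are from the announcement; the workload figures in the example are hypothetical):

```python
def usage_cost_usd(input_tokens: int, output_tokens: int,
                   in_rate: float = 25.0, out_rate: float = 125.0) -> float:
    # Rates are dollars per million tokens, per the announced pricing.
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# Hypothetical audit pass: 50M input tokens of source code read,
# 2M output tokens of findings written.
cost = usage_cost_usd(50_000_000, 2_000_000)  # $1250 + $250 = $1500.00
```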

In addition to our commitment of model usage credits, we’ve donated $2.5M to Alpha-Omega and OpenSSF through the Linux Foundation, and $1.5M to the Apache Software Foundation to enable the maintainers of open-source software to respond to this changing landscape (maintainers interested in access can apply through the Claude for Open Source program).

We intend for this work to grow in scope and continue for many months, and we’ll share as much as we can so that other organizations can apply the lessons to their own security. Partners will, to the extent they’re able, share information and best practices with each other; within 90 days, Anthropic will report publicly on what we’ve learned, as well as the vulnerabilities fixed and improvements made that can be disclosed. We will also collaborate with leading security organizations to produce a set of practical recommendations for how security practices should evolve in the AI era. This will potentially include:

  • Vulnerability disclosure processes;
  • Software update processes;
  • Open-source and supply-chain security;
  • Software development lifecycle and secure-by-design practices;
  • Standards for regulated industries;
  • Triage scaling and automation; and
  • Patching automation.

Anthropic has also been in ongoing discussions with US government officials about Claude Mythos Preview and its offensive and defensive cyber capabilities. As we noted above, securing critical infrastructure is a top national security priority for democratic countries—the emergence of these cyber capabilities is another reason why the US and its allies must maintain a decisive lead in AI technology. Governments have an essential role to play in helping maintain that lead, and in both assessing and mitigating the national security risks associated with AI models. We are ready to work with local, state, and federal representatives to assist in these tasks.

We are hopeful that Project Glasswing can seed a larger effort across industry and the public sector, with all parties helping to address the biggest questions around the impact of powerful models on security. We invite other AI industry members to join us in helping to set the standards for the industry. In the medium term, an independent, third-party body—one that can bring together private- and public-sector organizations—might be the ideal home for continued work on these large-scale cybersecurity projects.
