Anthropic ditches its core safety promise

Original link: https://www.cnn.com/2026/02/25/tech/anthropic-safety-policy-change

Anthropic, an AI company founded with safety as its first principle, is revising its approach to AI development amid intensifying competition and a shifting political environment. Previously, Anthropic maintained a strict "Responsible Scaling Policy" under which it could pause development if AI capabilities outpaced its safety controls. It is now adopting a more flexible, nonbinding safety framework, setting public goals rather than firm commitments. The change comes as Anthropic faces pressure from the Pentagon, which has threatened to cancel a $200 million contract unless the company rolls back its safeguards, particularly those concerning AI-controlled weapons and domestic surveillance. While the policy shift is not directly tied to the Pentagon dispute, Anthropic acknowledges that its original policy never won broad industry support and is out of step with Washington's current stance on regulation. Anthropic argues that pausing development while others press ahead could ultimately make the world *less* safe. The company will now publish regular reports on its safety plans for transparency, but concedes it must adapt quickly in a fast-moving AI field.


Original article

Anthropic, a company founded by OpenAI exiles worried about the dangers of AI, is loosening its core safety principle in response to competition.

Instead of self-imposed guardrails constraining its development of AI models, Anthropic is adopting a nonbinding safety framework that it says can and will change.

In a blog post Tuesday outlining its new policy, Anthropic said shortcomings in its two-year-old Responsible Scaling Policy could hinder its ability to compete in a rapidly growing AI market.

The announcement is surprising, because Anthropic has described itself as the AI company with a “soul.” It also comes the same week that Anthropic is fighting a significant battle with the Pentagon over AI red lines.

The policy change is separate and unrelated to Anthropic’s discussions with the Pentagon, according to a source familiar with the matter. Defense Secretary Pete Hegseth gave Anthropic CEO Dario Amodei an ultimatum on Tuesday to roll back the company’s AI safeguards or risk losing a $200 million Pentagon contract. The Pentagon threatened to put Anthropic on what is effectively a government blacklist.

But the company said in its blog post that its previous safety policy was designed to build industry consensus around mitigating AI risks – guardrails that the industry blew through. Anthropic also noted its safety policy was out of step with Washington’s current anti-regulatory political climate.

Anthropic’s previous policy stipulated that it should pause training more powerful models if their capabilities outstripped the company’s ability to control them and ensure their safety — a measure that’s been removed in the new policy. Anthropic argued that responsible AI developers pausing growth while less careful actors plowed ahead could “result in a world that is less safe.”

As part of the new policy, Anthropic said it will separate its own safety plans from its recommendations for the AI industry.

Anthropic wrote that it had hoped its original safety principles “would encourage other AI companies to introduce similar policies. This is the idea of a ‘race to the top’ (the converse of a ‘race to the bottom’), in which different industry players are incentivized to improve, rather than weaken, their models’ safeguards and their overall safety posture.”

The company now suggests that hasn’t played out.

In a statement to CNN, an Anthropic spokesperson described the updated policy as “the strongest to date on the level of public accountability and transparency.”

“We’ve gone a significant step further from our prior policies by committing to publicly publish detailed reports at regular intervals on our plans to strengthen our risk mitigations, as well as the threat models and capabilities of all our models,” the statement said. “From the beginning, we’ve said the pace of AI and uncertainties in the field would require us to rapidly iterate and improve the policy.”

Anthropic’s new safety policy includes a “Frontier Safety Roadmap” that outlines the company’s self-imposed guidelines and safeguards. But the company acknowledged the new framework is more flexible than its past policy.

“Rather than being hard commitments, these are public goals that we will openly grade our progress towards,” the company said in its blog post.

The change comes a day after Defense Secretary Pete Hegseth gave Anthropic CEO Dario Amodei a Friday deadline to roll back the company’s AI safeguards, or risk losing a $200 million Pentagon contract and being put on what is effectively a government blacklist.

Anthropic has concerns over two issues that it isn’t willing to drop, according to a source familiar with the company’s meeting with Hegseth: AI-controlled weapons and mass domestic surveillance of American citizens. Anthropic believes AI is not reliable enough to operate weapons, and there are no laws or regulations yet that cover how AI could be used in mass surveillance, a source said.

AI researchers applauded Anthropic’s stance on social media on Tuesday and expressed concerns about the idea of AI being used for government surveillance.

The company has long positioned itself as the AI business that prioritizes safety. Anthropic has published research showing how its own AI models could be capable of blackmail under certain conditions. The company recently donated $20 million to Public First Action, a political group pushing for AI safeguards and education.

But the company has faced increasing pressure and competition from both the government and its rivals. Hegseth, for example, plans to invoke the Defense Production Act on Anthropic and designate the company a supply chain risk if it does not comply with the Pentagon’s demands, CNN reported on Tuesday. OpenAI and Anthropic have also been locked in a race to launch new enterprise AI tools in a bid to win the workplace.

Jared Kaplan, Anthropic’s chief science officer, suggested in an interview with Time that the change was made in the name of safety more than increased competition.

“We felt that it wouldn’t actually help anyone for us to stop training AI models,” Kaplan told the magazine. “We didn’t really feel, with the rapid advance of AI, that it made sense for us to make unilateral commitments … if competitors are blazing ahead.”

CNN’s Hadas Gold contributed to this story.

This story has been updated with additional information.
