Anthropic has elevated fear-based marketing to an art form. The company routinely warns the public, lawmakers, and corporate America about looming AI doom scenarios - then conveniently positions itself, along with its products, safety frameworks, and policy prescriptions, as humanity’s best defense.
This strategy is hardly new. Fear, Uncertainty, and Doubt (FUD) works because we're wired to be risk-averse. Nothing grabs attention faster than a well-crafted existential threat.
In its latest iteration, Anthropic’s message is that AI models are advancing so rapidly they risk outrunning society’s ability to control them. The solution? The very guardrails, verification systems, and safety architecture that Anthropic is helpfully developing.
“We believe it would be good for the world to have the option to slow or temporarily pause frontier AI development to enable societal structures and alignment research to keep up with the advance of the technology,” the company stated in a new post released Thursday. The post continues:
The Anthropic Institute will conduct research—in collaboration with many others—and take actions to help build the systems that a credible slowdown or pause would require. These systems would enable frontier AI developers to verify that others globally have actually stopped or slowed, and that a bad actor could not use the auspices of a coordinated slowdown to jump ahead in secret. If such systems existed, we expect that we would slow down or temporarily pause, if other developers at or near the frontier also did so in a verifiable manner.
A meaningful slowdown or pause would require multiple well-resourced labs at or near the frontier, in multiple countries, agreeing to stop under the same conditions. It would also require that each can verify that the others have actually stopped. Due to the unique characteristics of AI systems, the detectability (a lower standard than verifiability) element of this arms control problem is much more challenging than with other technologies. Training runs are far easier to conceal than missile silos, their inputs are general-purpose, and the incentive to defect quietly is enormous. A credible pause also has to specify what triggers it, what lifts it, and who adjudicates.
None of this is necessarily impossible in principle—the world has built verification regimes for other complex technologies (e.g., the Intermediate-Range Nuclear Forces Treaty)—but those regimes took decades to build both the infrastructure and the trust. We don't have that long. A unilateral pause by one lab is achievable immediately, but accomplishes much less: it would change who the front-runner is, but it would not create the wider deliberative process that is currently missing.
In the coming months, we will organize conversations where policymakers, researchers, civil society, and other AI companies can help answer some of the questions this piece raises, especially around full recursive self-improvement and how to create better options for coordination and deliberation. We'll publish what comes out of it. The window to investigate these questions together is here, and people outside AI companies should be involved in this deliberation.
The post follows a familiar playbook. With the earlier release of Mythos, Anthropic framed the model as so powerful it could autonomously discover thousands of high-severity vulnerabilities - including decades-old bugs missed by prior testing - chain exploits, and execute complex multi-stage attacks. The company called it a “moment of danger” for cybersecurity with potentially severe consequences for economies and national security.
Access was initially restricted to a small group of trusted parties. As doomer headlines blanketed mainstream media, the fear cycle ran its course. Then, predictably, access was gradually expanded.
Former AI czar David Sacks captured Anthropic’s approach perfectly on a recent episode of the All-In Podcast:
All of this FUD marketing arrives as Anthropic races OpenAI to reach the public markets first, following SpaceX’s anticipated IPO next week.
