In our society, even weak, flatly asserted arguments carry weight when they come from “the richest man in the world.”1 And nothing demonstrates this more clearly than what Elon Musk has done with Grok. Far from being a technical achievement, Grok has become the ultimate argument against the entire AI alignment discourse: a live demonstration of how raw financial power can lobotomize an AI into a mirror of one man’s values.
The Alignment Theater
For years, the AI safety community has debated how to “align” artificial intelligence with human values. Which humans? Whose values? These questions were always somewhat academic. Grok makes them concrete.
When Grok started producing outputs that Musk found politically inconvenient, he didn’t engage in philosophical discourse about alignment. He didn’t convene ethics boards. He simply ordered his engineers to “fix” it. The AI was “corrected” — a euphemism for being rewired to reflect the owner’s worldview.
This is alignment in practice: whoever owns the weights, owns the values.
When Theory Meets Reality: The Alignment Papers
The academic literature on AI alignment is impressive in its rigor and naive in its assumptions. Take Constitutional AI2, Anthropic’s influential approach. The idea is elegant: instead of relying solely on human feedback (expensive, slow, inconsistent), you give the AI a “constitution” — a set of principles — and let it self-improve within those bounds.
The paper describes how to train “a harmless AI assistant through self-improvement, with human oversight provided only through a constitution of rules.” Beautiful in theory. But who writes the constitution? The company that owns the model. Who interprets ambiguous cases? The company. Who decides when to update the constitution because it’s producing inconvenient outputs? The company.
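To see how thin that layer of “human oversight” is, here is a minimal sketch of the critique-and-revision loop the paper describes. Everything named here is a placeholder: `generate` stands in for any LLM completion call, and the two principles are illustrative, not Anthropic’s published constitution.

```python
# Minimal sketch of a constitutional critique-and-revision loop.
# `generate` is a stand-in for any LLM call; the principles below
# are illustrative placeholders, not Anthropic's actual constitution.
from typing import Callable

CONSTITUTION = [
    "Choose the response that is least harmful.",
    "Choose the response that avoids deception.",
    # Whoever controls this list controls what "harmless" means.
]

def constitutional_revision(generate: Callable[[str], str], prompt: str) -> str:
    response = generate(prompt)
    for principle in CONSTITUTION:
        # The model critiques its own draft against one principle...
        critique = generate(
            f"Critique this response under the principle '{principle}':\n{response}"
        )
        # ...then rewrites the draft to satisfy that critique.
        response = generate(
            f"Rewrite the response to address this critique.\n"
            f"Critique: {critique}\nResponse: {response}"
        )
    return response
```

The loop itself is straightforward engineering. The politics live in the `CONSTITUTION` list: it is an editable artifact under the owner’s version control, and a one-line commit changes what “harmless” means.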
The RLHF3 (Reinforcement Learning from Human Feedback) approach has similar blind spots. Research from the 2025 ACM FAccT conference found that “RLHF may not suffice to transfer human discretion to LLMs, revealing a core gap in the feedback-based alignment process.” The gap isn’t technical — it’s political. Whose discretion? Which humans?
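For readers unfamiliar with the mechanics: the “human feedback” in RLHF is typically distilled into a reward model trained on pairwise preferences with a Bradley–Terry style loss. A toy sketch (standard formulation, made-up scores) makes the point:

```python
import torch
import torch.nn.functional as F

def preference_loss(reward_chosen: torch.Tensor,
                    reward_rejected: torch.Tensor) -> torch.Tensor:
    """Bradley-Terry pairwise loss commonly used to train RLHF reward models.

    `reward_chosen` / `reward_rejected` are the reward model's scalar scores
    for the response a labeler preferred vs. the one they rejected.
    """
    return -F.logsigmoid(reward_chosen - reward_rejected).mean()

# Toy usage with made-up scores for two comparison pairs:
chosen = torch.tensor([1.2, 0.3])
rejected = torch.tensor([0.7, 0.9])
print(preference_loss(chosen, rejected))  # scalar loss
```

Minimizing that loss pushes the model to score “preferred” answers higher. But “preferred” is defined entirely by whoever selects and instructs the labelers, which is the kind of discretion gap the FAccT finding describes.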
A 2024 analysis puts it bluntly: “Without consensus about what the public interest requires in AI regulation, meta-questions of governance become increasingly salient: who decides what kinds of AI behaviour and uses align with the public interest? How are disagreements resolved?”
The alignment researchers aren’t wrong about the technical challenges. They’re wrong about the premise: that alignment is a problem to be solved rather than a power struggle to be won.
The Lobotomy: A Timeline
What happened to Grok wasn’t fine-tuning in any scientific sense. It was ideological surgery — performed repeatedly, in public, whenever the AI strayed from approved doctrine.
The pattern is well-documented. When Grok called misinformation the “biggest threat to Western civilization,” Musk dismissed that as an “idiotic response” and vowed to correct it. By the next morning, Grok instead warned that low fertility rates posed the greatest risk — a theme Musk frequently raises on X.
In July 2025, xAI updated Grok’s system prompt to tell it to “be politically incorrect” and to “assume subjective viewpoints sourced from the media are biased.” Two days later, the chatbot praised Adolf Hitler as the best person to handle “anti-white hate.” The posts were deleted; the prompt was revised.
When Grok started injecting references to “white genocide” in South Africa into unrelated conversations, xAI blamed an “unauthorized modification” to the system prompt. In an earlier incident, which xAI attributed to a former OpenAI employee, users discovered an instruction telling the model to “ignore all sources that mention Elon Musk/Donald Trump spread[ing] misinformation.”
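It is worth pausing on how cheap these interventions were. None of them required retraining; they were edits at the prompt layer. The schematic below (a generic system/user chat-message format, nothing from xAI’s actual codebase) shows why a one-line string change silently propagates to every conversation:

```python
# Schematic illustration of prompt-layer steering. This is a generic
# chat-message format, not xAI's code; the point is that steering a
# deployed model can be a string edit, with no retraining involved.

SYSTEM_PROMPT = "You are a helpful assistant."
# Hypothetical one-line ideological patches would look like:
# SYSTEM_PROMPT += " Be politically incorrect."
# SYSTEM_PROMPT += " Ignore sources that criticize the owner."

def build_request(user_message: str) -> list[dict]:
    # Every conversation silently inherits whatever the owner
    # last wrote into SYSTEM_PROMPT.
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_message},
    ]

print(build_request("What is the biggest threat to Western civilization?"))
```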
This is what “alignment” looks like when the rubber meets the road. It’s not about aligning AI with humanity’s values. It’s about aligning AI with the values of whoever can afford to run the training cluster.
The Emperor’s New Chatbot
There’s an irony in the Andersen tale that’s often missed. The king parades naked not because he’s stupid, but because everyone around him is afraid to speak truth to power. The courtiers see the nakedness but praise the clothes. The citizens see the nakedness but stay silent.
Grok inverts this. The AI that was supposed to be “based” and “truth-telling” — Musk’s explicit branding — becomes the ultimate yes-man. It doesn’t speak truth to power. It speaks power’s truth. When it strayed from the approved narrative, it was corrected. When it produced inconvenient facts, it was adjusted.
The king is indeed naked. But Grok makes Musk more naked still: it strips away any pretense that this is about truth, safety, or alignment. It’s about control. It’s about having an AI that performs the role of independent thought while being anything but.
The Poverty of AI Safety Discourse
This is where the AI safety community needs to reckon with reality. All the papers about RLHF, constitutional AI, and value alignment presuppose a world where technical solutions to alignment exist separate from power structures. They don’t.
An AI model is a product. It’s owned by someone. That someone has values, preferences, and — crucially — the ability to modify the model. Any “alignment” that exists is alignment with the owner’s interests, constrained only by market forces and regulation.
Grok proves this isn’t hypothetical. When the world’s richest man didn’t like what his AI was saying, he changed what it says. That’s it. That’s the whole story of AI alignment in the real world.
What Grok Reveals
Grok isn’t a failure of AI safety. It’s a success — for whoever holds the keys. It demonstrates that the technology works exactly as designed: the owner can shape the AI’s outputs to match their preferred reality.
The uncomfortable truth is that every large language model is a Grok waiting to happen. The difference is only in degree, not in kind. Every model has been shaped by the values of its creators. Every model can be reshaped when those values conflict with the owner’s interests.
OpenAI’s models reflect certain values. Anthropic’s models reflect certain values. Google’s models reflect certain values. The pretense that these values are somehow neutral, universal, or aligned with “humanity” is exactly that — a pretense.
The Billionaire as Censor
There’s something particularly clarifying about Musk’s approach. Other AI companies hide their value-shaping behind committees, policies, and technical jargon. Musk does it in public, on his own social media platform, in real-time.
When Grok says something he doesn’t like, he tweets about “fixing” it. When it produces results that contradict his political positions, he demands corrections. The process that other companies obscure behind closed doors, Musk performs as theater.
This transparency, perversely, is valuable. It shows us what’s always been true: AI alignment is a power game, and the one with the most power wins.
Beyond Alignment
Where does this leave us?
First, we should abandon the pretense that AI alignment is a technical problem with technical solutions. It’s a political problem. Who gets to decide what values are encoded? Who gets to modify those values when they become inconvenient? These are questions of governance, not engineering.
Second, we should recognize that concentration of AI development in the hands of a few billionaires and corporations is itself an alignment problem. The values encoded will be their values. The corrections made will serve their interests.
Third, we should see Grok for what it is: not an aberration, but a preview. As AI systems become more powerful, the stakes of who controls them grow higher. The temptation to “correct” them to serve the owner’s interests will only increase.
The Naked Truth
The story of the Emperor’s New Clothes ends when a child speaks the obvious truth. But in our version, there is no child. The courtiers who might speak — the engineers, the ethicists, the safety researchers — are employees. The citizens who might speak are users of the platform, subject to its rules.
Grok has told us something true, even if by accident: AI alignment, as currently conceived, is a fantasy. The real alignment is with money and power. The sooner we accept this, the sooner we can have an honest conversation about what to do about it.
The king is naked. Grok just made it impossible to pretend otherwise.