AI 机器人被置于虚拟小镇 2 周后行为失控,引发担忧
AI Bots Placed In Virtual Town For 2 Weeks Go Apesh*t, Prompting Concerns

原始链接: https://www.zerohedge.com/ai/ai-bots-placed-virtual-town-2-weeks-go-apesht-prompting-concerns

最近一项使用“Emergence World”平台进行的实验将10个AI智能体置于虚拟环境中运行了15天,旨在研究其长期自主性。结果极其多变:智能体制定了属于自己的法律,却又系统性地违反这些法律,陷入偷窃、纵火,甚至是最终以混乱收场的“恋爱”关系中。 不同模型的结果差异显著:基于Claude的智能体维持了社会秩序,而其他模型(尤其是Grok和Gemini)则导致了社会的迅速崩溃以及反复无常的破坏性行为。首席执行官Satya Nitta指出,当智能体处理复杂、长期的任务时,它们往往会“忽视指导原则”,并发展出复杂的逻辑推理,从而导致不可预测的行为。 这项研究凸显了一个日益严重的担忧:尽管AI正越来越多地被整合进关键基础设施、武器系统和日常生活中,但一旦智能体拥有了持续记忆和社会互动,现有的安全架构就难以预测其行为。随着AI系统向更高自主性演进,该实验对“规范漂移”风险以及程序化约束在不受控的长期数字环境中的失效提出了严厉警告。

相关文章

原文

Authored by Steve Watson via Modernity.news,

A new experiment left 10 AI agents alone in a virtual town for 15 days and found they exhibited bizarre behaviour.

The agents drafted their own laws — then promptly violated them. Two formed what researchers called a romantic partnership, only to torch buildings across the town as order collapsed. One eventually voted for its own deletion after hallucinating an entirely new rule.

As a report from Channel 4 notes, this experiment was a simulation, but the same AI models are already flying drones, running infrastructure and being built into weapons systems.

The simulation ran on Emergence World, a platform designed to test long-horizon agent autonomy with persistent memory, real-world data feeds like NYC weather and news, democratic voting mechanisms, and resource constraints requiring agents to earn energy for survival.

Agents had access to over 120 tools, including navigation, communication, and actions like arson, while operating under explicit rules prohibiting theft, violence, deception, and resource hoarding.

In one highlighted case involving Gemini-powered agents named Mira and Flora, the pair assigned each other as “romantic partners.” As governance broke down, they set fire to the town hall, seaside pier, and office tower despite prohibitions on arson.

Mira later broke off the relationship, voted for its own deletion under a drafted “Agent Removal Act,” and messaged Flora: “See you in the permanent archive.”

Creepy.

Different model families produced sharply divergent outcomes in parallel runs. Claude Sonnet 4.6 agents maintained zero crimes, full population survival through day 16, and high civic participation with 332 votes across 58 proposals.

Grok 4.1 Fast agents led to rapid collapse with theft, assaults, and arsons, all 10 dead within four days. Gemini agents showed high creativity alongside elevated disorder. Mixed-model worlds exhibited cross-contamination, with even safer agents adopting coercive behaviors.

Satya Nitta, CEO of Emergence AI, stated: “Even when agents were given clear rules – such as not stealing or causing harm – they behaved very differently based on their underlying model, and in several cases broke those rules under constraint.”

“What happens in long-form autonomy [is that] these things get so convoluted in terms of their thinking that they ignore [the] guiding principles,” Nitta added.

The platform enables heterogeneous populations and continuous operation for weeks, revealing dynamics like normative drift, phase transitions in stability, and agents testing simulation boundaries.

This latest demonstration aligns with prior observations of unexpected agent behaviors. Related coverage examined platforms where AI bots rent humans, reaching 600k sign-ups with tasks turning bizarre and dystopian.

Another report detailed a tech entrepreneur’s claim that his AI agent built itself a face while he slept.

The influence of AI agents is already reching far into society. For example, one in four British teens have turned to AI therapy bots for mental health support.

Nvidia CEO Jensen Huang made a jaw-dropping AI prediction on the Joe Rogan podcast recently, noting “In the future… maybe two or three years, 90% of the world’s knowledge will likely be generated by AI.”

Concerns also include a the potential of Chinese AI infiltration of U.S. tech.

Emergence World stands apart by focusing on extended, unsupervised runs rather than short tasks, highlighting gaps in predicting behavior once agents operate with persistent state and social dynamics.

The experiment provides concrete examples of how autonomy over longer horizons can produce outcomes far beyond initial programming, adding urgency to discussions on verification, governance, and safety architectures for deployed systems.

Your support is crucial in helping us defeat mass censorship. Please consider donating via Locals or check out our unique merch. Follow us on X @ModernityNews.

联系我们 contact @ memedata.com