ARC-AGI-3

Original link: https://arcprize.org/arc-agi/3

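## ARC-AGI-3: A Benchmark for Measuring True Artificial Intelligence

ARC-AGI-3 is a challenging benchmark designed to measure progress toward artificial general intelligence (AGI). Unlike traditional AI tests that focus on static problem solving, it evaluates an AI agent's ability to **learn and adapt in dynamic, novel environments**, as humans do.

The benchmark requires agents to explore, set their own goals, build an understanding of the world, and improve through experience, *without* relying on explicit instructions. Success is defined as matching human efficiency across a suite of solvable environments.

ARC-AGI-3 is distinctive in measuring intelligence over *time*, assessing factors such as long-horizon planning, memory, and belief updating, all areas where AI currently lags behind humans. Its design prioritizes simplicity for humans, avoids reliance on pre-existing knowledge, and prevents memorization-based solutions, which makes it a reliable measure of genuine learning and adaptation. Ultimately, ARC-AGI-3 aims to quantify the gap between AI and human intelligence.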


Original text

What is ARC-AGI-3?

ARC-AGI-3 is an interactive reasoning benchmark which challenges AI agents to explore novel environments, acquire goals on the fly, build adaptable world models, and learn continuously.

A 100% score means AI agents can beat every game as efficiently as humans.
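The text does not give a scoring formula. As a rough sketch only, assume a hypothetical per-game score that compares the number of actions a human baseline needed with the number the agent used, capped so that beating the human earns no extra credit. The function names, the cap, and the plain averaging below are all assumptions made for illustration, not the benchmark's published method.

```python
from typing import Optional


def game_score(human_actions: int, agent_actions: Optional[int]) -> float:
    """Hypothetical per-game efficiency score in [0, 1].

    agent_actions is None when the agent failed to beat the game.
    """
    if agent_actions is None:
        return 0.0
    if human_actions <= 0 or agent_actions <= 0:
        raise ValueError("action counts must be positive")
    # Cap at 1.0: using fewer actions than the human baseline earns no bonus.
    return min(1.0, human_actions / agent_actions)


def benchmark_score(results: list[tuple[int, Optional[int]]]) -> float:
    """Mean per-game score over (human_actions, agent_actions) pairs, in percent."""
    if not results:
        raise ValueError("no results")
    return 100.0 * sum(game_score(h, a) for h, a in results) / len(results)


# Agent matches or beats the human baseline on every game.
print(benchmark_score([(50, 50), (80, 60)]))      # 100.0
# Agent is half as efficient on one game and fails the other.
print(benchmark_score([(50, 100), (80, None)]))   # 25.0
```

Under these assumptions, 100% is reached only when every game is beaten with at most as many actions as the human baseline, which mirrors the statement above.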

Instead of solving static puzzles, agents must learn from experience inside each environment—perceiving what matters, selecting actions, and adapting their strategy without relying on natural-language instructions.
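To make the idea of learning from experience concrete, here is a toy sketch of an agent interacting with an environment without being told the goal. `ToyEnv`, `MemoryAgent`, and `run_episode` are hypothetical names invented for this illustration; they are not part of any ARC-AGI-3 interface, and the corridor world is far simpler than the benchmark's games.

```python
import random


class ToyEnv:
    """Hypothetical stand-in environment: a corridor with a hidden goal cell."""

    def __init__(self, size: int = 7, goal: int = 5):
        self.size, self.goal = size, goal
        self.pos = 0

    def reset(self) -> int:
        self.pos = 0
        return self.pos

    def step(self, action: int) -> tuple[int, bool]:
        # action is -1 (left) or +1 (right); walls clamp the position.
        self.pos = max(0, min(self.size - 1, self.pos + action))
        return self.pos, self.pos == self.goal


class MemoryAgent:
    """Explores randomly until it has seen the goal, then heads straight for it."""

    def __init__(self, rng: random.Random):
        self.rng = rng
        self.known_goal = None

    def act(self, obs: int) -> int:
        if self.known_goal is None:
            return self.rng.choice([-1, 1])  # no goal known yet: explore
        return 1 if obs < self.known_goal else -1

    def learn(self, obs: int, done: bool) -> None:
        if done:
            self.known_goal = obs  # remember where the episode ended


def run_episode(env: ToyEnv, agent: MemoryAgent, max_steps: int = 10_000) -> int:
    """Run one episode and return the number of actions taken."""
    obs = env.reset()
    for t in range(1, max_steps + 1):
        obs, done = env.step(agent.act(obs))
        agent.learn(obs, done)
        if done:
            return t
    return max_steps


env = ToyEnv()
agent = MemoryAgent(random.Random(0))
counts = [run_episode(env, agent) for _ in range(3)]
print(counts)  # first episode explores; later episodes take the 5-step direct path
```

The action count per episode is the quantity of interest here: it drops once the agent has discovered the goal, which is a miniature version of measuring how efficiently a skill is acquired over time.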

How it measures intelligence

  • 100% human-solvable environments
  • Skill-acquisition efficiency over time
  • Long-horizon planning with sparse feedback
  • Experience-driven adaptation across multiple steps

As long as there is a gap between AI and human learning, we do not have AGI.

ARC-AGI-3 makes that gap measurable by testing intelligence across time, not just final answers—capturing planning horizons, memory compression, and the ability to update beliefs as new evidence appears.

Design principles

  • Easy for humans to pick up quickly
  • No pre-loaded knowledge or hidden prompts
  • Clear goals + meaningful feedback
  • Novelty that prevents brute-force memorization