Measuring progress toward AGI: A cognitive framework

Original link: https://blog.google/innovation-and-ai/models-and-research/google-deepmind/measuring-agi-cognitive-framework/

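To evaluate AI more rigorously, a new evaluation protocol has been proposed that benchmarks AI performance against human cognitive abilities. It involves testing AI systems on a broad range of cognitive tasks, collecting human performance baselines from a demographically representative sample of adults, and then mapping each AI system's results onto the human distribution. To put this into practice, a Kaggle hackathon with a $200,000 prize pool, “Measuring progress toward AGI: Cognitive abilities”, is being launched. It focuses on the five cognitive abilities where the evaluation gap is largest: learning, metacognition, attention, executive functions and social cognition. Participants are encouraged to build new evaluations on Kaggle's Community Benchmarks platform and test them against leading AI models. Prizes are $10,000 for each of the top two submissions in every track and $25,000 for each of four grand prize winners. Submissions are open from March 17 through April 16.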

Original text

To understand AI capabilities across these cognitive abilities, we propose a three-stage evaluation protocol that benchmarks system performance in relation to human capabilities:

1. Evaluate AI systems across a broad suite of cognitive tasks covering each ability, using held-out test sets to prevent data contamination
2. Collect human baselines for the same tasks from a demographically representative sample of adults
3. Map each AI system’s performance relative to the distribution of human performance in each ability
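
The third stage can be read in several ways, since the post does not say how the mapping is computed. One plausible reading is a percentile rank of the AI score within the human scores for each ability. The sketch below assumes that reading. The function names, the ability keys, the tie-handling rule (ties count as half) and all scores are hypothetical and chosen only for illustration.

```python
from bisect import bisect_left, bisect_right


def percentile_rank(human_scores, ai_score):
    """Percent of human scores below ai_score, counting ties as half.

    This mid-rank convention is an assumption; the source does not
    specify how AI results are mapped onto the human distribution.
    """
    if not human_scores:
        raise ValueError("need at least one human baseline score")
    ordered = sorted(human_scores)
    below = bisect_left(ordered, ai_score)
    ties = bisect_right(ordered, ai_score) - below
    return 100.0 * (below + 0.5 * ties) / len(ordered)


def map_to_human_distribution(human_baselines, ai_results):
    """Return {ability: percentile rank} for abilities present in both inputs."""
    return {
        ability: percentile_rank(human_baselines[ability], score)
        for ability, score in ai_results.items()
        if ability in human_baselines
    }


# Made-up scores, for illustration only (not real benchmark data).
human_baselines = {
    "attention": [0.50, 0.60, 0.70, 0.80, 0.90],
    "metacognition": [0.40, 0.50, 0.60, 0.70],
}
ai_results = {"attention": 0.75, "metacognition": 0.40}

profile = map_to_human_distribution(human_baselines, ai_results)
print(profile)  # {'attention': 60.0, 'metacognition': 12.5}
```

Reporting one percentile per ability, rather than a single aggregate, keeps uneven strengths visible: in this toy data the hypothetical system sits above most of the human sample on attention but near the bottom on metacognition.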

Defining these cognitive abilities is a crucial first step, but we need more than a framework to measure progress. To put this theory into practice, we are launching a new Kaggle hackathon — “Measuring progress toward AGI: Cognitive abilities”. The hackathon encourages the community to design evaluations for five cognitive abilities where the evaluation gap is the largest: learning, metacognition, attention, executive functions and social cognition.

Participants can use Kaggle's newly launched Community Benchmarks platform to build and test their evaluations against a lineup of frontier models.

We are offering a total prize pool of $200,000: $10,000 awards for the top two submissions in each of the five tracks, and $25,000 grand prizes for the four absolute best overall submissions. Submissions are open March 17 through April 16, and we’ll announce the results June 1. Head over to the Kaggle website to start building.
