人工智能解决了国际数学奥林匹克银牌级别的问题

人工智能解决了国际数学奥林匹克银牌级别的问题
AI solves International Math Olympiad problems at silver medal level

原始链接: https://deepmind.google/discover/blog/ai-solves-imo-problems-at-silver-medal-level/

本文介绍的是负责开发两个人工智能 (AI) 系统的团队：AlphaProof 和 AlphaGeometry 2。AlphaProof 专注于数学证明，而 AlphaGeometry 2 专注于几何推理。这两个项目的领导团队均包括多名人员：Thomas Hubert、Rishi Mehta、Laurent Sartran、Thang Luong、Trieu Trinh 和 Yuri Chervonyi 等。每个团队的主要贡献者包括 Hussain Masoom、Ingrid von Glehn、Mirek Olšák、Swaroop Mishra、Xiaomeng Yang 等。此外，几位顾问和精益专家在整个开发过程中提供了指导，包括 David Silver、Demis Hassabis、Quoc Le 和 Pushmeet Kohli。其他成员协助评估语言推理系统的质量、支持计算规定以及管理整个项目。最后，他们对 Lean 和 MathLib 社区表示感谢，是他们使 AlphaProof 的创建成为可能。

文本描述了由两个名为 Gemini 和 AlphaZero 的网络组成的系统。 Gemini 是一种改进的语言模型，能够将自然语言翻译为正式语言。 AlphaZero 专注于解决复杂问题，不受自然语言的阻碍。它的方法避免了法学硕士遇到的潜在“幻觉”问题，因为它专注于使用精益定理证明者的命令在证明空间内玩游戏。 AlphaZero 在适应各种游戏时取得了令人印象深刻的结果，展示了其在不同战略领域的学习和卓越能力。例如，它击败了国际象棋程序 Stockfish 的强大版本，这证明了它的多功能性。当遇到数学问题时，用户通过要求“所有偶数”概念的形式化来测试 AlphaZero 的性能。 AlphaZero 生成的答案需要对导入和语法中的错误进行微调。然而，AlphaZero 未能提出问题的两个方向，这表明还有改进的空间。 Leaning 除了是一种形式语言之外，还可以充当定理证明者，允许用户证明陈述。与 IMO 参与者不同，用户不能仅仅依靠自动化工具来解决问题，因为他们必须展示自己的工作。解决 IMO 问题需要将最初的问题形式化，这是数学家乔治·波利亚 (George Pólya) 强调的问题解决过程的一个关键方面。在比赛中，金牌和银牌获得者代表了 IMO 参与者中前 25%，这意味着人工智能解决 IMO 问题的能力比四分之三的参赛高中生更好。尽管解决多个问题需要更长的时间，但人工智能在解决单个 IMO 问题方面的成功仍然很重要。与学生的完成时间进行比较可能会提供有关人工智能能力的不准确观点。相反，报告成功解决的问题的百分比或获得的分数可能会提供更清晰的评估。最终，人工智能在解决数学问题方面被证明是有效的，只要选择适当的问题——那些对人类和人工智能来说都足够困难的问题。考虑到人工智能可以毫不费力地克服容易暴力破解的任务，这有助于确保对人工智能优势的公平评估。相比之下，IMO 参与者需要独特的、具有挑战性的问题解决方案才能与人工智能竞争。总之，人工智能在解决数学挑战方面表现出了非凡的效率，尽管成功程度取决于任务的复杂性。虽然无意将人工智能与

We thank the International Mathematical Olympiad organization for their support.

AlphaProof development was led by Thomas Hubert, Rishi Mehta and Laurent Sartran; AlphaGeometry 2 and natural language reasoning efforts were led by Thang Luong.

AlphaProof was developed with key contributions from Hussain Masoom, Aja Huang, Miklós Z. Horváth, Tom Zahavy, Vivek Veeriah, Eric Wieser, Jessica Yung, Lei Yu, Yannick Schroecker, Julian Schrittwieser, Ottavia Bertolli, Borja Ibarz, Edward Lockhart, Edward Hughes, Mark Rowland, Grace Margand. Alex Davies and Daniel Zheng led the development of informal systems such as final answer determination, with key contributions from Iuliya Beloshapka, Ingrid von Glehn, Yin Li, Fabian Pedregosa, Ameya Velingker and Goran Žužić. Oliver Nash, Bhavik Mehta, Paul Lezeau, Salvatore Mercuri, Lawrence Wu, Calle Soenne, Thomas Murrills, Luigi Massacci and Andrew Yang advised and contributed as Lean experts. Past contributors include Amol Mandhane, Tom Eccles, Eser Aygün, Zhitao Gong, Richard Evans, Soňa Mokrá, Amin Barekatain, Wendy Shang, Hannah Openshaw, Felix Gimeno. This work was advised by David Silver and Pushmeet Kohli.

The development of AlphaGeometry 2 was led by Trieu Trinh and Yuri Chervonyi, with key contributions by Mirek Olšák, Xiaomeng Yang, Hoang Nguyen, Junehyuk Jung, Dawsen Hwang and Marcelo Menegali. The development of the natural language reasoning system was led by Golnaz Ghiasi, Garrett Bingham, YaGuang Li, with key contributions by Swaroop Mishra, Nigamaa Nayakanti, Sidharth Mudgal, Qijun Tan, Junehyuk Jung, Hoang Nguyen, Alex Zhai, Dawsen Hwang, Mingyang Deng, Clara Huiyi Hu, Jarrod Kahn, Maciej Kula, Cosmo Du. Both AlphaGeometry and natural language reasoning systems were advised by Quoc Le.

David Silver, Quoc Le, Demis Hassabis, and Pushmeet Kohli coordinated and managed the overall project.

We’d also like to thank Insuk Seo, Evan Chen, Zigmars Rasscevskis, Kari Ragnarsson, Junhwi Bae, Jeonghyun Ahn, Jimin Kim, Hung Pham, Nguyen Nguyen, Son Pham, and Pasin Manurangsi who helped evaluate the quality of our language reasoning system. Jeff Stanway, Jessica Lo, Erica Moreira and Kareem Ayoub for their support for compute provision and management. Prof Gregor Dolinar and Dr Geoff Smith MBE from the IMO Board, for the support and collaboration; and Tu Vu, Hanzhao Lin, Chenkai Kuang, Vikas Verma, Yifeng Lu, Vihan Jain, Henryk Michalewski, Xavier Garcia, Arjun Kar, Lampros Lamprou, Kaushal Patel, Ilya Tolstikhen, Olivier Bousquet, Anton Tsitsulin, Dustin Zelle, CJ Carey, Sam Blackwell, Abhi Rao, Vahab Mirrokni, Behnam Neyshabur, Ethan Dyer, Keith Rush, Moritz Firsching, Dan Shved, Ihar Bury, Divyanshu Ranjan, Hadi Hashemi, Alexei Bendebury, Soheil Hassas Yeganeh, Shibl Mourad, Simon Schmitt, Satinder Baveja, Chris Dyer, Jacob Austin, Wenda Li, Heng-tze Cheng, Ed Chi, Koray Kavukcuoglu, Oriol Vinyals, Jeff Dean and Sergey Brin for their support and advice.

Finally, we’d like to thank the many contributors to the Lean and Mathlib projects, without whom AlphaProof wouldn’t have been possible.

人工智能解决了国际数学奥林匹克银牌级别的问题 AI solves International Math Olympiad problems at silver medal level

人工智能解决了国际数学奥林匹克银牌级别的问题
AI solves International Math Olympiad problems at silver medal level