Less is more: Recursive reasoning with tiny networks

Original link: https://alexiajm.github.io/2025/09/29/tiny_recursive_models.html

This post introduces the Tiny Recursion Model (TRM), a reasoning model with only 7M parameters that is surprisingly effective. TRM reaches 45% on ARC-AGI-1 and 8% on ARC-AGI-2, challenging the assumption that strong performance on hard tasks *requires* large language models. Inspired by the Hierarchical Reasoning Model (HRM), TRM simplifies the recursive reasoning process, stripping away the complexity tied to biological modeling and mathematical theorems. It iteratively refines an initial answer through repeated self-recursion: the model recursively updates a latent representation based on the question and the current answer, then uses that updated representation to improve the answer itself, repeating the process for several steps. This lets TRM progressively refine its reasoning and achieve strong results with remarkable parameter efficiency, demonstrating that "less is more" in AI reasoning.

## Tiny recursive models challenge LLMs on efficiency

A new paper introduces the Tiny Recursion Model (TRM), a surprisingly effective AI architecture that requires far fewer parameters than current large language models (LLMs). With only 7M parameters, TRM scores higher than LLMs such as Gemini 2.5 Pro on challenging reasoning benchmarks (ARC-AGI-1 and ARC-AGI-2), despite being several orders of magnitude smaller. Discussion centers on whether TRM's success stems from its architecture or from the specific evaluation setup, in particular data augmentation and test-time training. There is concern that benchmarks like ARC-AGI may be prone to overfitting and may not fully reflect real-world generalization. Despite these caveats, the results suggest that small, recursively designed models can achieve strong reasoning ability, potentially challenging the current trend of ever-increasing model size and compute requirements. The work builds on prior research such as the Hierarchical Reasoning Model (HRM) and explores the potential of recurrent architectures for efficient AI. Ultimately, the discussion highlights the need for rigorous ablations and clearer benchmark standards in AI development.

Original article

| Paper | Code |

In this new paper, I propose Tiny Recursion Model (TRM), a recursive reasoning model that achieves amazing scores of 45% on ARC-AGI-1 and 8% on ARC-AGI-2 with a tiny 7M-parameter neural network. The idea that one must rely on massive foundation models trained for millions of dollars by some big corporation in order to achieve success on hard tasks is a trap. Currently, there is too much focus on exploiting LLMs rather than devising and expanding new lines of direction. With recursive reasoning, it turns out that "less is more": you don't always need to crank up model size in order for a model to reason and solve hard problems. A tiny model pretrained from scratch, recursing on itself and updating its answers over time, can achieve a lot without breaking the bank.

This work came to be after I learned about the recent innovative Hierarchical Reasoning Model (HRM). I was amazed that an approach using small models could do so well on hard tasks like the ARC-AGI competition (reaching 40% accuracy when normally only Large Language Models could compete). But I kept thinking that it was too complicated, relying too much on biological arguments about the human brain, and that this recursive reasoning process could be greatly simplified and improved. Tiny Recursion Model (TRM) simplifies recursive reasoning to its core essence, which ultimately has nothing to do with the human brain, does not require any mathematical (fixed-point) theorem, nor any hierarchy.

See the paper for more details.

TLDR

*(TRM figure)*

Tiny Recursion Model (TRM) recursively improves its predicted answer y with a tiny network. It starts with the embedded input question x, an initial embedded answer y, and a latent z. For up to K improvement steps, it tries to improve its answer y. It does so by i) recursively updating its latent z n times given the question x, current answer y, and current latent z (recursive reasoning), and then ii) updating its answer y given the current answer y and current latent z. This recursive process allows the model to progressively improve its answer (potentially addressing any errors from its previous answer) in an extremely parameter-efficient manner while minimizing overfitting.
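The update loop above can be sketched in a few lines. This is only an illustrative sketch, not the paper's code: the tiny network is replaced by placeholder random linear maps with a tanh nonlinearity, the real TRM shares one network for both updates, and the dimensions and values of K, n, and D are arbitrary assumptions.

```python
import numpy as np

# Hypothetical sketch of the TRM loop; all names and sizes are illustrative.
rng = np.random.default_rng(0)
D = 16  # embedding dimension (arbitrary for this sketch)

W_z = rng.normal(scale=0.1, size=(3 * D, D))  # placeholder latent-update weights
W_y = rng.normal(scale=0.1, size=(2 * D, D))  # placeholder answer-update weights

def update_latent(x, y, z):
    """i) recursive reasoning: refresh the latent z from (x, y, z)."""
    return np.tanh(np.concatenate([x, y, z]) @ W_z)

def update_answer(y, z):
    """ii) refine the answer y from (y, z)."""
    return np.tanh(np.concatenate([y, z]) @ W_y)

def trm_forward(x, y, z, K=3, n=6):
    """Up to K improvement steps; each recurses n times on the latent z."""
    for _ in range(K):
        for _ in range(n):
            z = update_latent(x, y, z)  # recursive reasoning on z
        y = update_answer(y, z)         # improve the answer y
    return y, z

x = rng.normal(size=D)  # embedded input question
y = np.zeros(D)         # initial embedded answer
z = np.zeros(D)         # initial latent
y_out, z_out = trm_forward(x, y, z)
print(y_out.shape)  # (16,)
```

The key design point the sketch captures is that the same small computation is applied over and over: depth comes from recursion (K outer steps times n inner steps) rather than from stacking more parameters.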
