Reflections on AI at the End of 2025

Original link: https://antirez.com/news/157

## LLMs: From Stochastic Parrots to Potential AGI

In recent years, the understanding of large language models (LLMs) has shifted. The outdated "stochastic parrot" theory, which held that LLMs lack understanding, has been largely refuted as models demonstrate increasingly sophisticated reasoning, especially since the advent of chain-of-thought (CoT) prompting. CoT improves output both by enabling internal search over the model's representations and, via reinforcement learning, by teaching the model the token sequences that best reach a desired result.

Thanks to reinforcement learning with verifiable rewards, scaling is no longer bounded by the number of available tokens, which points to continued, significant progress. Despite its imperfections, LLM-assisted programming is gaining acceptance and delivering a worthwhile return on investment for many developers.

While some researchers are exploring architectures beyond the Transformer, the author argues that LLMs may still reach artificial general intelligence (AGI) within the current framework. The ARC test, once considered a major obstacle, is now being surpassed by task-optimized models and by large LLMs leveraging CoT. Ultimately, the greatest challenge in AI is not technical progress but ensuring safe development so as to avoid existential risk.

A recent article on AI at the end of 2025 sparked discussion on Hacker News, centered on how Goodhart's law might apply to AI-driven code optimization. The original poster, danielfalbo, wondered whether optimizing code for speed at any cost could yield solutions that are functionally faster but hard for humans to understand or maintain, putting the metric above underlying quality.

The discussion asked whether such an outcome is acceptable, and whether AI might develop explanatory ability as a byproduct. A key question was whether current AI training methods (fine-tuning rather than full retraining) preserve enough "generality" to allow genuinely innovative solutions, or whether they constrain creative problem solving. Another notable comment highlighted the most fundamental concern in AI development: avoiding existential risk.

Original text
* For years, despite accumulating functional evidence and scientific hints, certain AI researchers continued to claim LLMs were stochastic parrots: probabilistic machines that would: 1. NOT have any representation of the meaning of the prompt. 2. NOT have any representation of what they were going to say. In 2025, finally, almost everybody stopped saying so.

* Chain of thought is now a fundamental way to improve LLM output. But what is CoT? Why does it improve output? I believe it is two things: 1. Sampling in the model's representations (that is, a form of internal search). Once information and concepts relevant to the prompt topic are in the context window, the model can reply better. 2. If you mix this with reinforcement learning, the model also learns to put one token after the other (each token changes the model state) in order to converge to some useful reply.
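
A minimal sketch of that token-after-token loop (my own illustration, assuming a hypothetical `model.next_token_distribution(context)` interface that returns a token-to-probability mapping; it is not code from the post):

```python
import random

def sample_reply(model, prompt_tokens, max_tokens=256, end_token="<eos>"):
    """Plain autoregressive decoding. The chain of thought is produced by
    the exact same loop as the final answer: each sampled token is appended
    to the context and changes the model state for the next step, which is
    what lets CoT act as a form of internal search."""
    context = list(prompt_tokens)
    for _ in range(max_tokens):
        probs = model.next_token_distribution(context)  # hypothetical interface
        token = random.choices(list(probs), weights=list(probs.values()))[0]
        context.append(token)
        if token == end_token:
            break
    return context[len(prompt_tokens):]
```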

* The idea that scaling is limited by the number of tokens we have is no longer true, because of reinforcement learning with verifiable rewards. We are still not at an AlphaGo move 37 moment, but is that really impossible in the future? There are certain tasks, like improving a given program for speed, where in theory the model can keep making progress against a very clear reward signal for a very long time. I believe improvements to RL applied to LLMs will be the next big thing in AI.
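
The "very clear reward signal" is easy to make concrete. Below is a minimal sketch of a verifiable reward for the program-speedup task (my own illustration, not from the post): the candidate must match a reference implementation on every test input, and the reward is the measured speedup.

```python
import time

def verifiable_reward(reference_fn, candidate_fn, test_inputs):
    """Reward for RL on the 'make this program faster' task: 0.0 if the
    candidate is ever wrong, otherwise the measured speedup over the
    reference. No human judgment is needed: the reward is verifiable."""
    for x in test_inputs:
        if candidate_fn(x) != reference_fn(x):
            return 0.0  # wrong output gets no credit, however fast

    def total_time(fn):
        start = time.perf_counter()
        for x in test_inputs:
            fn(x)
        return time.perf_counter() - start

    return total_time(reference_fn) / total_time(candidate_fn)  # >1.0 means faster
```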

* Programmers' resistance to AI-assisted programming has lowered considerably. Even if LLMs make mistakes, their ability to deliver useful code and hints has improved to the point that most skeptics started using them anyway: the return on investment is now acceptable for many more folks. The programming world is still split between those who use LLMs as colleagues (for instance, all my interaction is via the web interfaces of Gemini, Claude, …) and those who use LLMs as independent coding agents.

* A few well-known AI scientists believe that what happened with Transformers can happen again, and better, following different paths, and they have started to create teams and companies to investigate alternatives to Transformers, as well as models with explicit symbolic representations or world models. I believe that LLMs are differentiable machines trained on a space able to approximate discrete reasoning steps, and it is not impossible that they get us to AGI even without fundamentally new paradigms appearing. It is likely that AGI can be reached independently with many radically different architectures.

* There are those who say chain of thought changed the nature of LLMs fundamentally, and that this is why they claimed in the past that LLMs were very limited and are now changing their mind. They say that, because of CoT, LLMs are now a different thing. They are lying. It is still the same architecture with the same next-token target, and the CoT is created exactly like that, token after token.
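
To make the "same next-token target" point concrete, here is a minimal sketch of the standard cross-entropy objective (my own illustration, reusing the hypothetical `next_token_distribution` interface from the earlier sketch): the loss treats chain-of-thought tokens and answer tokens identically.

```python
import math

def next_token_loss(model, tokens):
    """Average next-token cross-entropy over a training sequence. The loop
    does not know whether tokens[i] belongs to the chain of thought or to
    the final answer: every position gets the same objective."""
    loss = 0.0
    for i in range(1, len(tokens)):
        probs = model.next_token_distribution(tokens[:i])  # hypothetical interface
        loss -= math.log(probs[tokens[i]])
    return loss / (len(tokens) - 1)
```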

* The ARC test today looks a lot less insurmountable than initially thought: there are small models optimized for the task at hand that perform decently well on ARC-AGI-1, and very large LLMs with extensive CoT that achieve impressive results on ARC-AGI-2, with an architecture that, according to many folks, would not be able to deliver such results. ARC, in some way, transitioned from being the anti-LLM test to a validation of LLMs.

* The fundamental challenge in AI for the next 20 years is avoiding extinction.