Something Big Is (Not) Happening

Original link: https://www.aricolaprete.com/2026/02/something-big-is-not-happening.html

This piece explores the limitations of large language models (LLMs) beyond their impressive programming ability. The author, who makes a living finding "Stumpers" that AI cannot handle, argues that LLMs excel at *execution* (programming, for example) but lack genuine *judgement* or *decision-making*. While AI can mimic intelligence, and even improve itself through recursive training, it essentially operates as a complicated "lightbulb": it succeeds against predefined outcomes without understanding *why*. The author stresses AI's difficulty with nuance, with spatial reasoning, and with tasks that require genuine understanding beyond textual patterns, citing Kant's Critique of Pure Reason as a work so dependent on the spatial arrangement of its text that an LLM would struggle to reproduce it. Ultimately, the author argues that human expertise will remain essential: there will always be a need for someone to identify AI's weaknesses, "take a look under the hood," and supply the critical thinking that LLMs currently lack, a role the author is happy to play.

Original text

Or is it happening, perhaps it has already happened? Maybe it's just acting like it has happened, because it is tired and would like to go to bed?

Although trains can get you far going straight across long, flat distances, Sherpas are still gainfully employed, since we have not yet succeeded in getting locomotives up tall mountains--then again we have cable cars that reach some distance, we have crampons and pickaxes for the hardpack and ice walls, winter coats to wrap the vital, strong, cunning individual who would stand over the peak like some great romantic figure…


…It's always the little things the AI can't figure out. How many Rs in the word Strawberry--did they fix that one, too? Or how to identify a seahorse emoji, or create a summary that gets all the little details right, every little particularity, every grain of sand on the beach, and not just the general shape of its curve.
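
The contrast is concrete in code; the token split below is purely illustrative, since actual vocabularies differ by model:

    # With access to the letters, counting is trivial:
    print("strawberry".count("r"))  # 3

    # But an LLM never sees letters, only subword tokens, something like
    # this (an illustrative split, not any particular model's vocabulary):
    tokens = ["str", "aw", "berry"]
    # The model receives integer IDs for these chunks, so "how many Rs?"
    # asks about the insides of units it cannot see into.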

In my business we make our money by finding ways to get around the AI; we call them "Stumpers." You discover that there are a great many things LLMs are incapable of, and they are often incapable in surprising, yet increasingly predictable, ways. But they are really good at programming, because, one might say, that's how the "intelligence explosion" happens--they focused on programming first so that they could program themselves, generate themselves, autopoiesis, autocatalysis, autonomous self-generating self-regenerating accelerating hyper machines from the future that have come to kill the people who would stop them in the past! (I think we've all seen this movie before.)

Excuse my French, but in the biz we might call this a posteriori logic. You take what you have observed, empirically, and work backwards to theorize how that thing must've come about. You invent a narrative that makes sense. But the world is not a good story, even if we love to tell stories about it. Let's consider the case more concretely:

You flip a switch and either the light bulb turns on or it doesn't. If your intent was to turn the light on, then when it does go on, you say the switch "works." It doesn't matter if you know how, or even understand the underlying mechanism, as long as it does what it was "supposed" to do.

A computer is something like this: it's basically a very complicated kind of lightbulb, with many different kinds of switches and levers and buttons and other inputs that go through numerous logic gates; it even, like an Etch A Sketch, has some stores of memory that can be written or erased at will. There is a way to change the color on the screen, there is a way to simulate complex physics models, there is a way to execute high-frequency trades; in each case, there is a success state: the screen turns green, the results match the model, the trade is profitable.
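
A minimal sketch of that outcome-only notion of "works," with all names hypothetical: the check inspects the result, never the mechanism.

    def works(action, success_state):
        # The switch "works" if the observed outcome matches the intended
        # one; the underlying mechanism is never inspected.
        return action() == success_state

    # Incandescent bulb, LED, or lookup table--it makes no difference,
    # as long as flipping the switch yields light:
    flip_switch = lambda: "on"
    print(works(flip_switch, "on"))  # True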

Since programming is only just that, creating something that does what it's supposed to do, it doesn't matter why it works, as long as it works. It's an immediately practical discipline, and unique amongst the various forms of writing in that it is beyond polysemy, beyond the possibility of multiple interpretations. Even the law, with its numerous codes and cases and precedent, is still subject to interpretation, ceaseless interpretation, interpretation so vast, so constant, that our entire society hinges, sometimes, on the thoughts of nine individuals in DC (or, more likely, their assistants'). AI assistants are still very good with the law, especially for looking up relevant case law, for getting a quick overview of a given case. Although, if there were twenty million dollars on the line, would you still trust the model? If it were a life-or-death decision, would you trust the model? Judgement, yes, but decision? No, they are not capable of making a decision, at least not important ones[1]. Computer programming, on the other hand, is not about making decisions; it's about mechanical functions. So Claude might go back and change some things, may appear to make judgements, to have particular tastes. This is a behavior that was probably intended by its architects, this recursive evaluation, trained on the work of hundreds of talented programmers who were, to be sure, ready at hand. It doesn't matter if you're writing a script that outputs "Hello World" or creating an entire video game, as long as it works, as long as the material is recombined such that it matches some ideal image, at least closely enough.
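
That recursive evaluation reduces to a loop like the following sketch, where model.write, model.revise, and tests_pass are hypothetical stand-ins rather than any real API:

    # Outcome-driven generation against an assumed model interface:
    # nothing here needs to understand *why* the code works, only
    # whether the success state has been reached.
    def generate_until_it_works(model, spec, tests_pass, max_tries=10):
        code = model.write(spec)            # first attempt at the ideal image
        for _ in range(max_tries):
            if tests_pass(code):            # the lightbulb turns on
                return code
            code = model.revise(code)       # recombine and try again
        return None                         # a Stumper; someone looks under the hood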

All this because the LLM is fundamentally a writing machine: it does everything via text, and if you make it produce writing that exists purely to serve some sort of mechanical function, and you train it to succeed in that task, then it will tend to do so, even with vast intricacy. It still fails when it comes to spatial relations within text, because everything is understood in terms of relations and correspondences between tokens as values themselves, and apparent spatial position is not a stored value. Space is the field of difference, irreconcilable difference--if the letters and words on the page were not separate, then how would they be analyzed into their particular relations? Kant had already demonstrated this schematic form hundreds of years ago in the Critique of Pure Reason, which no current LLM could reproduce by itself, because the book is so dependent on the specific spatial arrangements of the text, the various tabular organizations, the antinomies themselves, which are always cut off from one another, a chasm of difference sitting between[2].
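
The flattening is easy to see directly: any two-dimensional arrangement reaches the model as a single one-dimensional stream of tokens, so position on the page is never a stored value. A rough sketch, with whitespace splitting standing in for a real subword tokenizer:

    # Kant set thesis and antithesis in facing columns, a chasm between:
    page = (
        "Thesis              | Antithesis\n"
        "The world has a     | The world has no\n"
        "beginning in time.  | beginning in time.\n"
    )
    # A tokenizer reads this as one 1-D stream (whitespace splitting
    # stands in for a real subword vocabulary):
    print(page.split())
    # ['Thesis', '|', 'Antithesis', 'The', 'world', 'has', 'a', '|',
    #  'The', 'world', 'has', 'no', 'beginning', 'in', 'time.', '|',
    #  'beginning', 'in', 'time.']
    # Which column a word belonged to survives only as a pattern of
    # separators; its spatial position is not a stored value.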

And if an LLM cannot reliably reproduce a book like the first Critique without some outside intervention, then someone will always find something it can't yet do, something that requires some tooling, some re-working to get done, some function that it is currently incapable of. They'll always need someone to take a look under the hood, to figure out how their machine ticks. A strong, fearless individual, the spanner in the works, the eddy in the stream! There I am, sitting in my room, looking over prompts, waiting for someone to shoot me a message, asking if they can share their screen.

(Claude Sonnet 4.5)
