# LLMs are steroids for your Dunning-Kruger

Original link: https://bytesauna.com/post/dunning-kruger

This article examines the unsettling psychological effects of large language models such as ChatGPT. Echoing Bertrand Russell's observation about the confident ignorance of fools, the author argues that LLMs encourage conviction in mistaken beliefs. While acknowledging their utility, the author stresses that interacting with these models produces a false sense of certainty even when the information is wrong. LLMs do more than supply knowledge: they "amplify" thinking, equally capable of strengthening good ideas *and* reinforcing self-delusion in a fluent, authoritative voice. This is highly habit-forming, leading users to instinctively seek an LLM's validation. Although the technology behind LLMs is relatively simple, their impact is profound, representing a major societal shift that touches education, work, and how we process information. The author suggests reframing LLMs as "confidence engines" rather than "knowledge engines," recognizing their power to instill conviction regardless of accuracy, a worrying prospect as these tools become ever more deeply woven into daily life.

## LLMs and the Dunning-Kruger Effect: Summary

A blog post sparked a Hacker News discussion about whether large language models (LLMs) exacerbate the Dunning-Kruger effect, the tendency of less competent people to overestimate their own abilities. Some commenters, however, questioned the validity of the Dunning-Kruger effect itself, suggesting it may be a statistical anomaly. The core debate concerns how LLMs affect users' confidence and knowledge. Many argued that LLMs act as a "crutch," fostering dependence and reducing genuine learning. Users reported a sense of fraudulence and a need to double-check LLM output rather than trust it outright. Others found LLMs useful for quickly exploring ideas or accelerating tasks, while acknowledging the risk of over-reliance. A key point: LLMs provide *confidence*, not necessarily *knowledge*. This echoes concerns about blindly trusting sources such as Wikipedia, and the greater difficulty of "fixing" misinformation in an LLM compared with a collaboratively edited platform. Some argued that LLMs are best used as advanced search tools rather than as authoritative sources of knowledge. Ultimately, the discussion underscores the need for critical thinking and self-discipline when using LLMs, to avoid inflated confidence built on potentially wrong information.

## Original text

In his 1933 essay “The Triumph of Stupidity,” Bertrand Russell remarked that “the problem with the world is that the stupid are cocksure, while the intelligent are full of doubt.” This is something I often think about when ChatGPT hits me up with another “that’s a fantastic idea” when the idea is clearly anything but great.

How often do you think a ChatGPT user walks away not just misinformed, but misinformed with conviction? I would bet this happens all the time. And I can’t help but wonder what the effects are in the big picture.

I can relate to this on a personal level: As a ChatGPT user I notice that I'm often left with a sense of certainty. After discussing an issue with an LLM I feel like I know something — a lot, perhaps — but more often than not this information is either slightly incorrect or completely wrong. And you know what? I can't help it. Even when I acknowledge this illusion, I can't help chasing the wonderful feeling of conviction these models give. It's great to feel like you know almost everything. Of course I come back for more. And it's not just the feeling; I would be dishonest to claim these models don't have huge utility. Yet I'm a little worried about the psychological dimension of this whole ordeal.

They say AI is a mirror. This summarizes my experience. I feel LLMs "amplify" thinking. These models make your thoughts reverberate by taking them in multiple new directions. And sometimes these directions are really interesting. The thing is, though, that this goes both ways: A short ChatGPT session may help turn a good idea into a great idea. On the other hand, LLMs are amazing at supercharging self-delusion. These models will happily equip misguided thinking with a fluent, authoritative voice, which, in turn, sets up a psychological trap by delivering nonsense in a nice package.

And it’s so insanely habit-forming! I almost instinctively do a little back and forth with an LLM when I want to work on an idea. It hasn’t even been that long (these models have been around for, what, three years?) and I’m so used to them that I feel naked without. It’s getting even comical sometimes. When I lost my bag the other day and was going through the apartment looking for it, my first response to my growing frustration was that “I should ask ChatGPT where it is”.

I feel like LLMs are a fairly boring technology. They are stochastic black boxes. The training is essentially run-of-the-mill statistical inference. There are some more recent innovations at the software/hardware level, but these are not really LLM-specific. Is it too sardonic to say that the real "innovation" was throwing enough money at the problem to train the models at a huge scale? Maybe RLHF was a real innovation; I'm not sure. However, I don't really feel like there is a lot to be interested in there. And yet, the current AI boom is extraordinarily interesting.
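
To make "run-of-the-mill statistical inference" concrete, here is a minimal sketch (my own illustration, not from the post): estimate next-token probabilities from counts, then sample from them. An actual LLM replaces the count table with a transformer fitted to the same next-token objective, just at an enormous scale.

```python
# Minimal sketch (illustrative, not how any production LLM is built):
# a bigram language model, i.e. next-token prediction by plain
# maximum-likelihood counting. Real LLMs swap the count table for a
# neural network trained on the same objective at vastly larger scale.
import random
from collections import defaultdict

corpus = "the cat sat on the mat and the cat ate and the cat slept on the mat".split()

# Estimate P(next | current) from bigram counts: ordinary statistical inference.
counts = defaultdict(lambda: defaultdict(int))
for cur, nxt in zip(corpus, corpus[1:]):
    counts[cur][nxt] += 1

def sample_next(word):
    """Sample the next token in proportion to its estimated probability."""
    successors = counts[word]
    return random.choices(list(successors), weights=list(successors.values()))[0]

# Generate a continuation from the seed token "the".
token, output = "the", ["the"]
for _ in range(8):
    token = sample_next(token)
    output.append(token)
print(" ".join(output))  # e.g. "the cat slept on the mat and the cat"
```

The point of the toy: the machinery is ordinary probability estimation, and the striking behavior comes from scale rather than from an exotic algorithm.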

It’s the impact. The very real effect of all this in our lives. In hindsight, this will probably be one of the major shifts, and it will be reflected upon in terms of education, work and even society at large. Language cuts to the core of what and who we are. Speech is so natural to us that we even think in speech. And when a machine credibly stepped into that territory, something changed. I’m not sure what it is — I don’t think anyone really knows at this point — but I think there is a sense of shifting tides. I think it’s something most of us are trying to make sense of.

I think LLMs should not be seen as knowledge engines but as confidence engines. That, I feel, would better illustrate the potential near- and medium-term futures we are dealing with.
