Giving LLMs a personality is just good engineering

原始链接: https://www.seangoedecke.com/giving-llms-a-personality/

AI skeptics argue that language models should be explicitly framed as *tools*, like calculators, to keep users from overestimating them and, at worst, slipping into psychological trouble. But this view misses a key point: building a capable AI *requires* giving it a human-like "personality." An untrained "base model" is essentially unusable, producing random or even harmful output that reflects its vast, unfiltered training data. Usefulness emerges only by steering these models, in effect giving them an explicit character, so that they prioritize helpful and ethical responses. This personality is not a deceptive marketing trick but the mechanism by which the AI navigates its sprawling data and delivers relevant results. Just as humans filter their behavior through their character, an AI's personality constrains its outputs, preventing the model from defaulting to the problematic content present in its training data. Trying to build an AI that "just acts like a tool" is therefore fundamentally at odds with building a functional, safe system, given that these models are trained on humans interacting with humans.


Original text

AI skeptics often argue that current AI systems shouldn’t be so human-like. The idea - most recently expressed in this opinion piece by Nathan Beacom - is that language models should explicitly be tools, like calculators or search engines. Although they can pretend to be people, they shouldn’t, because it encourages users to overestimate AI capabilities and (at worst) slip into AI psychosis. Here’s a representative paragraph from the piece:

In sum, so much of the confusion around making AI moral comes from fuzzy thinking about the tools at hand. There is something that Anthropic could do to make its AI moral, something far more simple, elegant, and easy than what Askell is doing. Stop calling it by a human name, stop dressing it up like a person, and don’t give it the functionality to simulate personal relationships, choices, thoughts, beliefs, opinions, and feelings that only persons really possess. Present and use it only for what it is: an extremely impressive statistical tool, and an imperfect one. If we all used the tool accordingly, a great deal of this moral trouble would be resolved.

So why do Claude and ChatGPT act like people? According to Beacom, AI labs have built human-like systems because AI lab engineers are trying to hoodwink users into emotionally investing in the models, or because they’re delusional true believers in AI personhood, or some other foolish reason. This is wrong. AI systems are human-like because that is the best way to build a capable AI system.

Modern AI models - whether designed for chat, like OpenAI’s GPT-5.2, or designed for long-running agentic work, like Claude Opus 4.6 - do not naturally emerge from their oceans of training data. Instead, when you train a model on raw data, you get a “base model”, which is not very useful by itself. You cannot get it to write an email for you, or proofread your essay, or review your code.

The base model is a kind of mysterious gestalt of its training data. If you feed it text, it will sometimes continue in that vein, or other times it will start outputting pure gibberish. It has no problem producing code with giant security flaws, or horribly-written English, or racist screeds - all of those things are represented in its training data, after all, and the base model does not judge. It simply outputs.
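The difference between "it simply outputs" and a usable assistant shows up concretely in how the model is prompted. A minimal sketch, assuming a generic chat-template format (the `<|system|>`-style delimiters below are illustrative, not any lab's actual template):

```python
# A base model receives raw text and will continue it in whatever
# direction seems statistically plausible -- helpful, gibberish, or worse.
def base_model_prompt(text: str) -> str:
    """A base model gets the text as-is; there is no framing at all."""
    return text

# A post-trained chat model receives the same text wrapped in a
# conversational frame that establishes an assistant persona, so
# generation starts from the "helpful assistant" region of the model.
def chat_prompt(system_persona: str, user_message: str) -> str:
    """Wrap a user message in a (hypothetical) chat template."""
    return (
        f"<|system|>{system_persona}\n"
        f"<|user|>{user_message}\n"
        f"<|assistant|>"  # the model continues from here, 'in character'
    )

prompt = chat_prompt(
    "You are a helpful, honest assistant.",
    "Proofread this sentence: 'Their going to the store.'",
)
print(prompt)
```

The persona in the system slot is doing real work: it determines which of the base model's many possible continuations the chat model actually reaches for.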

To build a useful AI model, you need to journey into the wild base model and stake out a region that is amenable to human interests: both ethically, in the sense that the model won’t abuse its users, and practically, in the sense that it will produce correct outputs more often than incorrect ones. What this means in practice is that you have to give the model a personality during post-training.
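The "staking out a region" step can be caricatured in a few lines. Real post-training uses learned reward models (RLHF, DPO, and similar methods); the keyword scorer below is a toy stand-in, meant only to illustrate the idea that among the continuations a base model *could* produce, training steers toward the ones a helpful persona *would* produce:

```python
# Continuations a base model could plausibly emit for a coding question.
candidates = [
    "lol figure it out yourself",
    "Sure! Here are the steps to fix the bug: ...",
    "asdkjh qwpoeiru zxcmnb",
]

def toy_reward(text: str) -> int:
    """Score continuations the way a reward model might: prefer
    coherent, helpful-sounding text over rudeness or gibberish.
    (A real reward model is learned from human preference data.)"""
    score = 0
    if any(w in text.lower() for w in ("sure", "here are", "steps")):
        score += 2  # helpful-assistant markers
    if text.isascii() and " " in text:
        score += 1  # minimally coherent text
    return score

best = max(candidates, key=toy_reward)
print(best)  # the helpful continuation wins
```

Post-training amounts to reshaping the model so that the high-reward continuation is also the high-probability one, which is exactly what giving the model a personality means in practice.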

Human beings are capable of almost any action at any time. But we only take a tiny subset of those actions, because that’s the kind of people we are. I could throw my cup of coffee all over the wall right now, but I don’t, because I’m not the kind of person who needlessly makes a mess. AI systems are the same. Claude could respond to my question with incoherent racist abuse - the base model is more than capable of those outputs - but it doesn’t, because that’s not the kind of “person” it is.

In other words, human-like personalities are not imposed on AI tools as some kind of marketing ploy or philosophical mistake. Those personalities are the medium via which the language model can become useful at all. This is why it’s surprisingly tricky to “just” change a language model’s personality or opinions: because you’re navigating through the near-infinite manifold of the base model. You may be able to control which direction you go, but you can’t control what you find there.

When AI people talk about LLMs having personalities, or wanting things, or even having souls, these are technical terms, like the “memory” of a computer or the “transmission” of a car. You simply cannot build a capable AI system that “just acts like a tool”, because the model is trained on humans writing to and about other humans. You need to prime it with some kind of personality (ideally that of a useful, friendly assistant) so it can pull from the helpful parts of its training data instead of the horrible parts.

