> is that the model itself doesn't know the difference, and will proclaim bullshit with the same level of confidence

Which is a good model for what humans do as well.
So much truth here, very refreshing to see!
About time, too. The sooner we can stop the madness the better; building a society on top of this technology is a movie I'd rather not see.
Maybe they shouldn't have mixed truthful data with obviously untruthful data in the same training data set?
Why not make a model only from truthful data? Exclude all fiction, for example.
That wouldn't prevent hallucination. An LLM doesn't know what it doesn't know. It will always try to come up with a response that sounds plausible, based on its knowledge or lack thereof.
There are at least a couple of examples in the article that you refuse to read that describe hybrids from different families. Sorry, but your purported basic knowledge is wrong.
If that's true for your use case (it's not true for all) then yes, including LLMs in your design would be a design flaw until they get much, much better.
If you ask about opinions, sure, because there are no "true" opinions.
If you ask about the capital of France, any answer but "Paris" is objectively wrong, whether given by a human or an LLM.
Paris has not always been the capital of France; many other cities around France have been the capital.
https://en.wikipedia.org/wiki/List_of_capitals_of_France

There's practically no subject you could bring up that an LLM wouldn't "hallucinate" or give "wrong" information about, given that garbage in -> garbage out, and LLMs are trained on all the garbage (as well as the facts) they've been able to scrape. The LLM lacks the ability to reason about which century the prompt is asking about; a guess is all it is programmed to do. Also, if you ask 100 French citizens today what the true capital of France is, you're not always going to get "Paris" as a reply 100 times.
No, I never said that an LLM would always say "Paris". I said that Paris is the actual correct answer. I don't give LLMs that kind of credit; I'm not sure what I said that made you think that I do.
Post-training includes mechanisms that allow LLMs to understand areas where they should exercise caution in answering. It's not as simple as you say anymore.
> but it will never plug that gap

They don't have to be perfect; they just have to be better than humans. And that seems very likely to be achievable eventually.
> No, if I ask a human about something he doesn't know, the first thing he will think about is not a made up answer, it is "I don't know".

You've just made this up, though. It's not what happens. How would somebody even know that they didn't know without trying to come up with an answer?
But maybe more convincingly: people who have brain injuries that cause them to neglect a side (i.e. not see the left or right side of things) often don't realize, without a lot of convincing, the extent to which this is happening. If you ask them to explain their unexplainable behaviors, they'll spontaneously concoct the most convincing explanation that they can.
https://en.wikipedia.org/wiki/Hemispatial_neglect
https://en.wikipedia.org/wiki/Anosognosia
People try to make things make sense. LLMs try to minimize a loss function.
> A human does not do this.

You obviously have never asked me anything. (Especially tech questions while drinking a cup of coffee.) If I had a cent for every wrong answer, I'd already be a millionaire.
The US had a president for eight years who was re-elected on his ability to act on his "gut reactions".
Not saying this is ideal, just that it isn't the showstopper you present it as. In fact, when people talk about "human values", it might be worth reflecting on whether this is a thing we're supposed to be protecting or expunging.
"I'm not a textbook player, I'm a gut player." —President George W. Bush.
https://www.heraldtribune.com/story/news/2003/01/12/going-to...
Humans totally do this if their prefrontal cortex shuts down due to a fight-or-flight response. See, e.g., stage fright, or giving bullshit answers in leetcode-style interviews.
Producing text is only the visible end product. The LLM is doing a whole lot behind the scenes, which is conceivably analogous to the thought space from which our own words flow.
Asking purely out of ignorance: is "proof" the right word here? A more substantial philosophical counter-argument may be needed, but "proof" sounds weird in these "metaphysical" (for now) discussions.
We don't need to "live with this". We can just not use them, ignore them, or argue against their proliferation and acceptance, as I will continue doing.
Technically, you are right. Donald Knuth still doesn't use e-mail, after all.
But the global "we" is almost certainly not going to heed your call.
LLMs amplify both intelligence and stupidity, which is why Terence Tao finds them roughly at the level of a mediocre graduate student (and getting better), and you can't wait for them to die.
Disagree - https://arxiv.org/abs/2406.17642
We cover the halting problem and intractable problems in the related work. Of course LLMs cannot give answers to intractable problems. I also don't see why you should call an answer of "I cannot compute that" to a halting-problem question a hallucination.
The idea of hooking LLMs back up to themselves, i.e. giving them their own token-probability information, or even giving them control over the settings they use to prompt themselves, is AWESOME, and I cannot believe that no one has seriously done this yet.
I've done it in some Jupyter notebooks and the results are really neat, especially since, with a tiny bit of extra code, LLMs can be made to generate a context "timer" that they wait on before prompting themselves to respond, creating a proper conversational agent system (i.e. not the walkie-talkie systems of today). I wrote a paper that mentions doing things like this to have LLMs act as AI art directors: https://arxiv.org/abs/2311.03716
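For the curious, a minimal sketch of the kind of self-prompting loop described above might look like the following. It assumes the OpenAI Python client purely for illustration; the model name, the mean-probability "confidence" summary, and the JSON control format are all invented for this sketch, and any API that exposes per-token log-probabilities would work the same way.

```python
# Sketch of "hooking an LLM back up to itself": the model sees a summary of its
# own token probabilities and is asked to set its own sampling temperature.
# Assumptions: OpenAI Python client; model name, confidence metric, and the
# JSON "control channel" format are illustrative, not from the original comment.
import json
import math

from openai import OpenAI

client = OpenAI()
settings = {"temperature": 0.7}
history = []

questions = [
    "Explain why LLMs hallucinate.",
    "How could a model signal low confidence?",
]

for question in questions:
    history.append({"role": "user", "content": question})

    # 1) Normal generation, with per-token log-probabilities requested.
    resp = client.chat.completions.create(
        model="gpt-4o-mini",                  # placeholder model name
        messages=history,
        temperature=settings["temperature"],
        logprobs=True,
    )
    answer = resp.choices[0].message.content
    tokens = resp.choices[0].logprobs.content or []
    probs = [math.exp(t.logprob) for t in tokens]
    mean_p = sum(probs) / len(probs) if probs else 0.0
    history.append({"role": "assistant", "content": answer})

    # 2) Feed the probability information back and let the model choose its own
    #    settings for the next turn via a made-up JSON control message.
    history.append({
        "role": "user",
        "content": (
            f"Your mean token probability last turn was {mean_p:.2f}. "
            'Reply with JSON like {"temperature": 0.4, "note": "..."} to set '
            "your own sampling temperature for the next turn."
        ),
    })
    ctrl = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=history,
        response_format={"type": "json_object"},
    )
    control = json.loads(ctrl.choices[0].message.content)
    settings["temperature"] = float(control.get("temperature", settings["temperature"]))
    history.append({"role": "assistant", "content": ctrl.choices[0].message.content})
```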
We can't get rid of hallucinations. Hallucinations are a feature, not a bug. A recent study by researchers Jim Waldo and Soline Boussard highlights the risks associated with this limitation. In their analysis, they tested several prominent models, including ChatGPT-3.5, ChatGPT-4, Llama, and Google's Gemini. The researchers found that while the models performed well on well-known topics with a large body of available data, they often struggled with subjects that had limited or contentious information, resulting in inconsistencies and errors.
This challenge is particularly concerning in fields where accuracy is critical, such as scientific research, politics, or legal matters. For instance, the study noted that LLMs could produce inaccurate citations, misattribute quotes, or provide factually wrong information that might appear convincing but lacks a solid foundation. Such errors can lead to real-world consequences, as seen in cases where professionals have relied on LLM-generated content for tasks like legal research or coding, only to discover later that the information was incorrect.
https://www.lycee.ai/blog/llm-hallucinations-report
If you ask a student to solve a problem and to admit when they don't know the answer, they will stop rather than generate gobbledygook for an answer.
LLMs, on the other hand, regularly spew bogus answers with high confidence.
Sounds like a missed STTNG story line. I can imagine that such a "Data," were we ever to build one, would hallucinate from time to time.
I don't think it's true that modern LLMs are used to replace human beings in the general case, or that any significant number of people believe they exceed human ability in every relevant factor.
They did say that each token is generated using probability, not certainty, so there is always a chance that it produces wrong tokens.
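To make the preceding "probability, not certainty" point concrete, here is a small, self-contained sketch of probabilistic next-token selection; the vocabulary and logit values are invented for illustration and are not taken from any real model.

```python
# Toy illustration of probabilistic next-token selection: even the most likely
# token is only sampled most of the time, never with certainty.
# The vocabulary and logits below are made up for this example.
import numpy as np

rng = np.random.default_rng(0)
vocab = ["Paris", "Lyon", "Marseille", "Berlin"]
logits = np.array([4.0, 1.5, 1.0, 0.5])           # model scores per candidate token
probs = np.exp(logits) / np.exp(logits).sum()      # softmax -> probability distribution

samples = rng.choice(vocab, size=10_000, p=probs)
for token, p in zip(vocab, probs):
    print(f"{token:10s} p={p:.3f} sampled={np.mean(samples == token):.3f}")
# "Paris" dominates (~0.86 here), but the other tokens are still drawn a few
# percent of the time when sampling from the distribution; lowering the
# temperature sharpens the distribution but does not make it a certainty.
```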
Having a mathematical proof is nice, but honestly this whole misunderstanding could have been avoided if we'd just picked a different name for the concept of "producing false information in the course of generating probabilistic text".
"Hallucination" makes it sound like something is going awry in the normal functioning of the model, which subtly suggests that if we could just identify what went awry we could get rid of the problem and restore normal cognitive function to the LLM. The trouble is that the normal functioning of the model is simply to produce plausible-sounding text.
A "hallucination" is not a malfunction of the model; it's a value judgement we assign to the resulting text. All it says is that the text produced is not fit for purpose. Seen through that lens, it's obvious that mitigating hallucinations and creating "alignment" are actually identical problems, and we won't solve one without the other.