• Voroxpete@sh.itjust.works
    5 months ago

    We not only have to stop ignoring the problem, we need to be absolutely clear about what the problem is.

    LLMs don’t hallucinate wrong answers. They hallucinate all answers. Some of those answers will happen to be right.

    If this sounds like nitpicking or quibbling over verbiage, it’s not. This is really, really important to understand. LLMs exist within a hallucinatory false reality. They do not have any comprehension of the truth or untruth of what they are saying, and this means that when they say things that are true, they do not understand why those things are true.

    That is the part that’s crucial to understand. A really simple test of this problem is to ask ChatGPT to back up an answer with sources. It fundamentally cannot do it, because it has no ability to actually comprehend and correlate factual information in that way. This means, for example, that AI is incapable of assessing the potential veracity of the information it gives you. A human can say “That’s a little outside of my area of expertise,” but an LLM cannot. It can only be coded with hard blocks that trigger on certain keywords, stopping it from answering and inserting a stock response instead.
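
    As a rough illustration of what those keyword hard blocks look like (everything below is invented for the sake of example; real products are more elaborate, but the principle is the same string matching with no comprehension behind it):

    ```python
    # Hypothetical sketch of a keyword hard block. The keyword list, stock
    # response, and function names are all made up for illustration.
    BLOCKED_KEYWORDS = {"explosives", "medical diagnosis", "self-harm"}
    STOCK_RESPONSE = "I'm sorry, I can't help with that topic."

    def guarded_reply(prompt: str, generate) -> str:
        """Return a canned response if the prompt trips a keyword, otherwise defer to the model."""
        lowered = prompt.lower()
        if any(keyword in lowered for keyword in BLOCKED_KEYWORDS):
            return STOCK_RESPONSE  # hard block: plain string matching, no understanding
        return generate(prompt)    # otherwise pass the prompt straight to the LLM

    # Stand-in "model" so the sketch runs on its own:
    fake_model = lambda p: f"[model output for: {p}]"
    print(guarded_reply("Tell me about explosives", fake_model))        # stock response
    print(guarded_reply("What is the capital of France?", fake_model))  # model output
    ```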

    This distinction, that AI is always hallucinating, is important because of stuff like this:

    But notice how Reid said there was a balance? That’s because a lot of AI researchers don’t actually think hallucinations can be solved. A study out of the National University of Singapore suggested that hallucinations are an inevitable outcome of all large language models. **Just as no person is 100 percent right all the time, neither are these computers.**

    That is some fucking toxic shit right there. Treating the fallibility of LLMs as analogous to the fallibility of humans is a huge, huge false equivalence. Humans can be wrong, but we’re wrong in ways that still allow us to grow and learn. Even when we are wrong about things, we can often learn from how we are wrong. There’s a structure to how humans learn and process information that allows us to interrogate our failures and adjust for them.

    When an LLM is wrong, we just have to force it to keep rolling the dice until it’s right. It cannot explain its reasoning. It cannot provide proof of work.

    I work in a field where I often have to direct the efforts of people who know more about specific subjects than I do, and part of how you do that is you get people to explain their reasoning, and you go back and forth testing propositions and arguments with them. You say “I want this, what are the specific challenges involved in doing it?” They tell you it’s really hard, you ask them why. They break things down for you, and together you find solutions. With an LLM, if you ask it why something works the way it does, it will commit to the bit and proceed to hallucinate false facts and false premises to support its false answer, because it’s not operating in the same reality you are, nor does it have any conception of reality in the first place.

    • ???@lemmy.world
      5 months ago

      I fucking hate how OpenAI and other such companies claim their models “understand” language or are “fluent” in French. These are human attributes. Unless they made a synthetic brain, they can take these claims and shove them up their square, tight corporate behinds.

      • EatATaco@lemm.ee
        5 months ago

        This is circular logic: only humans can be fluent, so the models can’t be fluent because they aren’t human.

        And it’s universally upvoted… in response to AIs getting things wrong, as if that means they can’t be doing anything but hallucinating.

        And will you learn from this? Nope. I’ll just be downvoted and shouted at.

        • Danksy@lemmy.world
          5 months ago

          It’s not circular. LLMs cannot be fluent because fluency comes from an understanding of the language. An LLM is incapable of understanding, so it is incapable of being fluent. It may be able to mimic fluency, but that is a different thing. (In my opinion.)

        • ???@lemmy.world
          5 months ago

          This is not at all what I said. If a machine was complex enough to reason, all power to it. But these LLMs cannot.

    • el_bhm@lemm.ee
      5 months ago

      They do not have any comprehension of the truth or untruth of what they are saying, and this means that when they say things that are true, they do not understand why those things are true.

      Which can be beautifully exploited with sponsored content.

      See Google I/O '24.

    • EatATaco@lemm.ee
      5 months ago

      they do not understand why those things are true.

      Some researchers compared the answers that ChatGPT 3 and ChatGPT 4 gave to the same questions. One of the questions was about stacking a set of items in a stable way. ChatGPT 3, in line with what you are saying about “without understanding”, just listed the items and said to place them one on top of the other. There is no way it would have worked.

      ChatGPT 4, however, said that you should put the book down first, arrange the eggs in a 3 x 3 grid on top of the book, trap them with the laptop so they don’t roll around, stand the bottle upright on top of the laptop, and then balance the nail on top of the bottle… even noting that you have to put the flat end of the nail down. That sounds a lot like understanding to me, not just rolling the dice and hoping to be correct.

      Yes, AI confidently gets stuff wrong. But let’s all note that there is a whole subreddit dedicated to people being confidently wrong. One doesn’t need to go any further than Lemmy to see people confidently claiming to know the truth about shit they should know is outside of their actual knowledge. We’re all guilty of this, including refusing to learn when we are wrong. Additionally, the argument that they can’t learn doesn’t make sense because models have definitely become better.

      Now, I’m not saying AI is conscious, I really don’t know, but all of the shortcomings you’ve listed are ones humans are guilty of too. So using them as examples of why it’s always just hallucination, while our thoughts are not, doesn’t seem to hold much water to me.

      • insaan@leftopia.org
        5 months ago

        the argument that they can’t learn doesn’t make sense because models have definitely become better.

        They have to be either trained with new data or their internal structure has to be improved. It’s an offline process, meaning they don’t learn through chat sessions we have with them (if you open a new session it will have forgotten what you told it in a previous session), and they can’t learn through any kind of self-directed research process like a human can.
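
        To sketch what “offline” means here (a toy illustration, not a real LLM; all the names and the dict-of-weights structure are invented): chatting reads the model and never changes it, and learning only happens in a separate training step that produces a new model version.

        ```python
        # Toy illustration: inference reads the weights, it never writes them.
        # Only a separate, offline training run produces a new model version.

        def train(old_weights: dict, new_data: list) -> dict:
            """Offline step: returns a *new* set of weights; the old model is untouched."""
            return {**old_weights, "version": old_weights["version"] + 1}

        def chat(weights: dict, prompt: str) -> str:
            """Inference step: conditions only on the prompt and the frozen weights."""
            return f"[v{weights['version']} completion for: {prompt}]"

        model_v1 = {"version": 1}
        chat(model_v1, "Remember that my name is Ada.")     # nothing is stored anywhere
        chat(model_v1, "What is my name?")                  # model_v1 is exactly as it was

        model_v2 = train(model_v1, ["a new pile of text"])  # learning happens only here, offline
        ```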

        all of the shortcomings you’ve listed are ones humans are guilty of too.

        LLMs are sophisticated word generators. They don’t think or understand in any way, full stop. This is really important to understand about them.

        • EatATaco@lemm.ee
          5 months ago

          They have to be either trained with new data or their internal structure has to be improved. It’s an offline process, meaning they don’t learn through chat sessions we have with them (if you open a new session it will have forgotten what you told it in a previous session), and they can’t learn through any kind of self-directed research process like a human can.

          Most human training is done through the guidance of another; additionally, most of this training is done through an automated process where some computer is just churning through data. And while you are correct that the context does not exist from one session to the next, you can in fact teach it something and it will maintain it during the session. Moving to a new session is just like talking to a completely different person, and you’re basically arguing “well, I explained this one thing to another human, and this human doesn’t know it… so how can you claim it’s thinking?” And just imagine the disaster that would happen if you allowed it to be trained by anyone on the web. It would be spitting out memes, racism, and right-wing propaganda within days. lol

          They don’t think or understand in any way, full stop.

          I just gave you an example where this appears to be untrue. There is something that looks like understanding going on. Maybe it’s not, I’m not claiming to know, but I have not seen a convincing argument as to why. Saying “full stop” instead of making an actual argument just indicates to me that you are really saying “stop thinking.” And I apologize, but that’s not how I roll.

          • insaan@leftopia.org
            5 months ago

            Most human training is done through the guidance of another

            Let’s take a step back and not talk about training at all, but about spontaneous learning. A baby learns about the world around it by experiencing things with its senses. They learn a language, for example, simply by hearing it and making connections - getting corrected when they’re wrong, yes, but they are not trained in language until they’ve already learned to speak it. And once they are taught how to read, they can then explore the world through signs, books, the internet, etc. in a way that is often self-directed. More than that, humans are learning at every moment as they interact with the world around them and with the written word.

            An LLM is a static model created through exposure to lots and lots of text. It is trained and then used. To add to the model requires an offline training process, which produces a new version of the model that can then be interacted with.

            you can in fact teach it something and it will maintain it during the session

            It’s still not learning anything. LLMs have what’s known as a context window that is used to augment the model for a given session. It’s still just text that is used as part of the response process.
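
            A minimal sketch of that mechanism, with everything invented for illustration (generate() stands in for any LLM call, and the window size is an arbitrary number of characters rather than tokens): the “memory” is literally just the earlier text being pasted back into each new prompt, and older text silently falls out once the window is full.

            ```python
            # Sketch: in-session "memory" is just resent text.
            CONTEXT_WINDOW = 4096  # arbitrary size, in characters for simplicity

            def generate(prompt: str) -> str:
                """Stand-in for any LLM call; it only ever sees the prompt it is given."""
                return f"[completion conditioned on {len(prompt)} characters of prompt]"

            history = []

            def chat_turn(user_message: str) -> str:
                history.append(f"User: {user_message}")
                prompt = "\n".join(history)[-CONTEXT_WINDOW:]  # older text silently falls out
                reply = generate(prompt)
                history.append(f"Assistant: {reply}")
                return reply

            chat_turn("My name is Ada.")
            chat_turn("What is my name?")  # only "remembered" because the text above was resent
            ```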

            They don’t think or understand in any way, full stop.

            I just gave you an example where this appears to be untrue. There is something that looks like understanding going on.

            You seem to have ignored the preceding sentence: “LLMs are sophisticated word generators.” This is the crux of the matter. They simply do not think, much less understand. They are simply taking the text of your prompts (and the text from the context window) and generating more text that is likely to be relevant. Sentences are generated word-by-word using complex math (heavy on linear algebra and probability) where the generation of each new word takes into account everything that came before it, including the previous words in the sentence it’s a part of. There is no thinking or understanding whatsoever.
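
            To make that word-by-word loop concrete, here is a toy version (a bigram model built from two sentences; a real LLM scores candidates with a deep network over the entire context rather than just the previous word, but the score-sample-append loop has the same shape):

            ```python
            import random
            from collections import Counter, defaultdict

            # Toy next-word generator. Given what came before, produce a probability
            # for each candidate next word, sample one, append it, and repeat.
            corpus = "the cat sat on the mat . the dog sat on the rug .".split()

            next_words = defaultdict(Counter)
            for current, following in zip(corpus, corpus[1:]):
                next_words[current][following] += 1

            def sample_next(word: str) -> str:
                candidates, counts = zip(*next_words[word].items())
                return random.choices(candidates, weights=counts)[0]  # sample from the distribution

            def generate(start: str, length: int = 8) -> str:
                out = [start]
                for _ in range(length):
                    out.append(sample_next(out[-1]))
                return " ".join(out)

            print(generate("the"))  # e.g. "the dog sat on the mat . the cat"
            ```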

            This is why Voroxpete@sh.itjust.works said in the original post to this thread, “They hallucinate all answers. Some of those answers will happen to be right.” LLMs have no way of knowing if any of the text they generate is accurate for the simple fact that they don’t know anything at all. They have no capacity for knowledge, understanding, thought, or reasoning. Their models are simply complex networks of words that are able to generate more words, usually in a way that is useful to us. But often, as the hallucination problem shows, in ways that are completely useless and even harmful.

  • KillingTimeItself@lemmy.dbzer0.com
    5 months ago

    It’s only going to get worse, especially as datasets deteriorate.

    With things like Reddit being overrun by AI, and also selling AI training data, I can only imagine what mess that’s going to cause.

    • Cyberflunk@lemmy.world
      5 months ago

      Hallucinations, like depression, are a multifaceted issue. Training data is only a piece of it. Quantized models and overfitted models rely on memorization at the cost of faithfully reproducing even obviously correct training data. Poorly structured inference can confuse a model.

      Rest assured, this isn’t just training data.
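
      On the quantization point, a small illustration of the kind of precision loss involved (a naive 4-bit scheme over a few random “weights”; real quantization schemes are smarter, but some rounding error always remains):

      ```python
      import numpy as np

      # Naive symmetric 4-bit quantization of a handful of "weights".
      # The dequantized values are close to, but not equal to, the originals;
      # that accumulated rounding error is one of several ways a model's
      # behaviour can drift from what its training data actually supports.
      rng = np.random.default_rng(0)
      weights = rng.normal(size=8).astype(np.float32)

      scale = np.abs(weights).max() / 7   # map the largest weight to level 7
      levels = np.clip(np.round(weights / scale), -8, 7).astype(np.int8)
      dequantized = levels * scale

      print(weights)
      print(dequantized)
      print("max rounding error:", np.abs(weights - dequantized).max())
      ```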

  • ClamDrinker@lemmy.world
    5 months ago

    It will never be solved. Even the greatest hypothetical superintelligence is limited by what it can observe and process. Omniscience doesn’t exist in the physical world. Humans hallucinate too - all the time. It’s just that our approximations are usually correct, so we don’t call them hallucinations. For example, the signals coming from our feet take longer to process than those from our eyes, so our brain has to predict information to create a coherent experience. It’s also why we don’t notice our blinks, or why we don’t see the blind spot our eyes have.

    AI, being a more primitive version of our brains, will hallucinate far more, especially because it cannot verify anything in the real world and is limited by the data it has been given, which it has to treat as ultimate truth. The mistake was trying to turn AI into a source of truth.

    Hallucinations shouldn’t be treated like a bug. They are a feature - just not one the big tech companies wanted.

    When humans hallucinate on purpose (and not due to illness), we get imagination and dreams; fuel for fiction, but not for reality.

    • Queen HawlSera@lemm.ee
      5 months ago

      You assume the physical world is all there is, or that the AI has any real intelligence at all. It’s a damn Chinese room.

  • Wirlocke@lemmy.blahaj.zone
    5 months ago

    I’m a bit annoyed at all the people being pedantic about the term hallucinate.

    Programmers use preexisting concepts as allegory for computer concepts all the time.

    Your file isn’t really a file, your desktop isn’t a desk, your recycling bin isn’t a recycling bin.

    [Insert the entirety of Object Oriented Programming here]

    Neural networks aren’t really neurons, genetic algorithms aren’t really genetics, and the LLM isn’t really hallucinating.

    But it easily conveys what the bug is. It only personifies the LLM because the English language almost always personifies the subject. The moment you attach a verb to something, you imply it performed an action, unless you limit yourself to esoteric words/acronyms or use several words to over-explain every time.

    • abrinael@lemmy.world
      5 months ago

      What I don’t like about it is that it makes it sound more benign than it is. Which also points to who decided to use that term - AI promoters/proponents.

      Edit: it’s like all of the bills/acts in congress where they name them something like “The Protect Children Online Act” and you ask, “well, what does it do?” And they say something like, “it lets local police read all of your messages so they can look for any dangers to children.”

      • zalgotext@sh.itjust.works
        5 months ago

        The term “hallucination” has been used for years in AI/ML academia. I was reading about AI hallucinations ten years ago when I was in college. The term was originally coined by researchers and mathematicians, not the snake oil salesmen pushing AI today.

        • abrinael@lemmy.world
          5 months ago

          I had no idea about this. I studied neural networks briefly over 10 years ago, but hadn’t heard the term until the last year or two.

      • Wirlocke@lemmy.blahaj.zone
        5 months ago

        In terms of LLM hallucination, it feels like the name very aptly describes the behavior and severity. It doesn’t downplay what’s happening because it’s generally accepted that having a source of information hallucinate is bad.

        I feel like the alternatives would downplay the problem. A “glitch” is generic and common, “lying” is just inaccurate since that implies intent to deceive, and just being “wrong” doesn’t get across how elaborately wrong an LLM can be.

        Hallucination fits pretty well and is also pretty evocative. I doubt that AI promoters want to effectively call their product schizophrenic, which is what most people think of when they hear “hallucination.”

        Ultimately, all the sciences are full of analogous names that make conversations easier; it’s not always marketing. It’s no different than when physicists say particles have “spin” or “color”, or that spacetime is a “fabric”, or [insert entirety of String theory]…

        • abrinael@lemmy.world
          5 months ago

          After thinking about it more, I think the main issue I have with it is that it sort of anthropomorphises the AI, which is more of an issue in applications where you’re trying to convince the consumer that the product is actually intelligent. (Edit: in the human sense of intelligence rather than what we’ve seen associated with technology in the past.)

          You may be right that people could have a negative view of the word “hallucination”. I don’t personally think of schizophrenia, but I don’t know what the majority think of when they hear the word.

          • Knock_Knock_Lemmy_In@lemmy.world
            5 months ago

            You could invent a new word, but that doesn’t help people understand the problem.

            You are looking for an existing word that describes producing unintentionally incorrect thoughts but is totally unrelated to humans. I suspect that word doesn’t exist. Every word we have for thinking gets anthropomorphized.