It’s only going to get worse, especially as datasets deteriorate.
With sites like Reddit being overrun by AI while also selling their content as AI training data, I can only imagine what mess that’s going to cause.
Hallucinations, like depression, are a multifaceted issue. Training data is only one piece of it. Quantized models and overfitted models rely on memorization at the expense of obviously correct training data. Poorly structured inferences can confuse a model too.
Rest assured, this isn’t just training data.
Yeah, there’s this stuff as well, though I consider that more of a technical challenge than a hard limit.