RE: Odds and Ends — 9 May 2025



Over the past 18 months, hallucination rates for LLMs asked to summarize a news article have fallen from a range of 3%-27% down to a range of 1%-2%. ("Hallucinate" is a technical term that means the model makes shit up.)

But new “reasoning” models that purportedly think through complex problems before giving an answer hallucinate at much higher rates.

OpenAI’s most powerful “state of the art” reasoning system, o3, hallucinates one-third of the time on a test answering questions about public figures, which is twice the rate of the previous reasoning system, o1. Its o4-mini model makes stuff up about public figures almost half the time.

ChatGPT a 'schizophrenia-seeking missile'

It's quite easy to trigger hallucinations, intentionally or not, as any seasoned Hiver who asks a question about Hive will find out. One of the major problems seems to be that the system doesn't know what it doesn't know. For example, if you've read a book that the AI hasn't had access to, it will hallucinate all sorts of nonsense rather than say the AI equivalent of "I haven't read it". Once the system starts hallucinating, it is very easy to lead it down the garden path. And since the system seems to work iteratively, I imagine that being fed nonsense and encouraged to produce more of it only compounds the problem.


