LLMs - a reality check!
Based on my personal experience and on the Economist article -> "AI models make stuff up. How can hallucinations be controlled?"
In the last couple of weeks, I have experienced several “hallucinations” - responses that are confident, coherent, and just plain wrong - from LLMs. They were real eye-openers on the limitations of the tool, ranging from combinatorics all the way down to simple arithmetic! (see the screenshots below).
Example 1 - Simple Math Hallucination
Example 2 - Combinatorics Math Hallucination
In the past I have used LLMs mainly for coding, with excellent results, but it seems that when we get to math and some degree of reasoning (at least with the version I’m using) we immediately see their limits.
The article below from The Economist delves into exactly this topic; I found it quite interesting and recommend reading it.
AI models make stuff up. How can hallucinations be controlled?
Based on the article, the key sources of hallucinations are:
Language models are probabilistic, while a correct (math) answer is not - there is exactly one right result, not a distribution of plausible ones (see the sketch after this list),
The model is a simplification/compression of the training data, so detail is inevitably lost,
Fine-tuning a pretrained model, when the statistical coefficients of the model are updated for a specific task (which can itself introduce new errors).
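To make the first point concrete, here is a minimal, purely illustrative Python sketch (the token probabilities are made up, not taken from any real model) of why sampling the next token from a probability distribution cannot guarantee the single correct answer to a simple sum:

```python
import random

# Hypothetical next-token distribution a model might assign after the prompt "7 x 8 =".
# The correct answer ("56") is only the most *likely* token, not a certainty.
next_token_probs = {"56": 0.62, "54": 0.21, "58": 0.12, "48": 0.05}

def sample_next_token(probs):
    """Pick the next token by sampling from the distribution, as an LLM decoder does."""
    tokens, weights = zip(*probs.items())
    return random.choices(tokens, weights=weights, k=1)[0]

# Over many runs, a wrong answer still comes out roughly 38% of the time.
answers = [sample_next_token(next_token_probs) for _ in range(1000)]
print("share of wrong answers:", sum(a != "56" for a in answers) / len(answers))
```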
The ways we can reduce hallucinations are:
Improve the methodology used to set the model weights; several approaches are currently being tested and implemented,
Bias the model towards more conservative output, by using a higher minimum threshold of accepted probability when selecting the next token (see the sketch after this list),
Clever prompting, which is where the user plays an important role.
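As a rough illustration of the “higher minimum threshold” idea (my own toy sketch, not the implementation described in the article), filtering out low-probability candidates before sampling - and abstaining when nothing clears the bar - trades coverage for confidence:

```python
import random

def conservative_sample(probs, min_prob=0.30):
    """Keep only candidate tokens whose probability clears the threshold;
    abstain rather than guess if nothing qualifies."""
    candidates = {t: p for t, p in probs.items() if p >= min_prob}
    if not candidates:
        return "<abstain>"  # safer than emitting a low-confidence guess
    tokens, weights = zip(*candidates.items())
    return random.choices(tokens, weights=weights, k=1)[0]

# Same toy distribution as above: only "56" clears the 0.30 threshold,
# so the wrong answers can no longer be sampled.
print(conservative_sample({"56": 0.62, "54": 0.21, "58": 0.12, "48": 0.05}))
```

The trade-off is that a higher threshold makes the model more likely to refuse to answer at all, which is exactly the more conservative behaviour the article is after.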
Hallucinations may be fading in the more advanced models, but they will not disappear entirely, so the only real solution is:
Any successful real-world deployment of these models will probably require training humans how to use and view AI models as much as it will require training the models themselves.
I will continue to use LLMs, as I can see their great potential, especially for coding, but as the old saying goes:
“TRUST, BUT VERIFY”