A recent study by Legal Guardian Digital has revealed significant discrepancies in the accuracy of popular chatbots, warning of a phenomenon known as “hallucination,” in which users can be misled by completely false information.
Setting aside the technical complexities, Large Language Models (LLMs) rely on statistical patterns to predict the next word. When a model cannot find an accurate pattern for the answer, it constructs words that seem statistically plausible but lack factual grounding. This means the bot isn’t intentionally lying; it’s simply executing its programming, trying to provide an answer even when it lacks the necessary information.
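The mechanism can be illustrated with a deliberately tiny sketch: a bigram model that always emits the statistically most likely next word, even for a context it has never seen. The corpus, function names, and fallback rule here are all illustrative assumptions, not how any of the chatbots in the study actually work, but they show why a purely statistical predictor produces a fluent answer rather than admitting ignorance.

```python
from collections import Counter, defaultdict

# Tiny illustrative corpus; a real LLM trains on trillions of tokens.
corpus = (
    "the capital of france is paris . "
    "the capital of italy is rome . "
    "the capital of france is paris ."
).split()

# Count bigrams: for each word, how often each following word occurs.
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def predict_next(word):
    """Return the statistically most likely next word; if the word was
    never seen, fall back to the most common word overall (a hypothetical
    rule chosen here to mimic 'always answer something')."""
    if word in bigrams:
        return bigrams[word].most_common(1)[0][0]
    # Unseen context: the model still emits *something* fluent-looking.
    # This "never refuse, always continue" behavior is the toy analogue
    # of a hallucinated answer.
    return Counter(corpus).most_common(1)[0][0]

print(predict_next("capital"))   # well-supported by the data
print(predict_next("germany"))   # never seen: the answer is invented
```

The second call is the point: the model has no data about "germany", yet it still returns a confident-looking word, because nothing in its objective rewards saying "I don't know."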
The study revealed a surprising finding regarding Google Gemini, which topped the list of the most “hallucinatory” chatbots with an error rate of 32%. These figures may be a cause for concern for Apple, which reportedly pays Google $1 billion annually to use a customized version of Gemini to enhance Siri’s engine in the upcoming iOS 27 operating system.
ChatGPT came in second in terms of error rate, providing inaccurate information in 30% of its responses—double the error rate of its Chinese competitor, DeepSeek.
On the other hand, Perplexity AI proved to be the most reliable, with a hallucination rate of just 13%, followed by DeepSeek at 14%, and Elon Musk’s Grok at 15%.
The study indicated that accuracy is not the only criterion; availability is also crucial. Perplexity and Grok were the only two engines to maintain 100% uptime throughout the study period. ChatGPT achieved an availability rate of 99.98%, while Claude came in last at 99.68%, which is still a very reliable figure.

