OpenAI’s Whisper Under Fire: Speech-to-Text Tool’s Hallucination Problem Raises Serious Concerns

By Byte Staff
FILE PHOTO: OpenAI and ChatGPT logos are seen in this illustration taken February 3, 2023. REUTERS/Dado Ruvic/Illustration/File Photo

Researchers and developers have identified hallucinations in a substantial portion of Whisper’s transcriptions, from recordings of public meetings to long stretches of audio analyzed by machine learning engineers.

Hallucinations in Whisper Transcriptions

Researchers and engineers have uncovered significant accuracy problems in Whisper, OpenAI’s popular open-source speech recognition model. A University of Michigan researcher found that 8 out of 10 audio transcriptions of public meetings contained hallucinations, passages where the model generated text that did not correspond to the actual speech. A machine learning engineer’s analysis of more than 100 hours of Whisper transcriptions likewise revealed hallucinations in over half of the recordings.

These findings raise concerns about the reliability of Whisper, which is widely used for meeting transcription, audio captioning, and spoken language understanding. Because hallucinated passages read as fluent text, they can put inaccurate or misleading information in front of users, with potentially serious consequences in certain contexts.
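For context on where such errors surface, the following is a minimal sketch of how a developer might produce a transcript with the open-source openai-whisper Python package; the model size ("base") and file name are illustrative placeholders, not details from the reporting.

    # Minimal sketch, assuming the open-source openai-whisper package
    # (pip install openai-whisper); model size and file name are placeholders.
    import whisper

    model = whisper.load_model("base")        # smaller checkpoints trade accuracy for speed
    result = model.transcribe("meeting.wav")  # dict with the full text plus timestamped segments

    print(result["text"])                     # the full transcript as a single string

    # Hallucinated passages appear here as fluent text with no matching audio,
    # which is why reviewers spot-check segments against the original recording.
    for seg in result["segments"]:
        print(f'{seg["start"]:7.2f}s-{seg["end"]:7.2f}s  {seg["text"]}')

Because the model returns only text, an invented passage is indistinguishable from a genuine one unless someone compares it against the source audio.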
