ChatGPT has gone haywire, and nobody knows why
Users of ChatGPT have reported major issues with the chatbot, specifically using the GPT-4 model - and OpenAI is yet to confirm the cause
ChatGPT appears to be encountering serious technical problems that are causing it to provide nonsensical answers to user queries.
Users of the artificial intelligence (AI) tool have reported a number of issues over the last 12 hours, with some reporting that it has been speaking “Spanglish” in response to specific queries.
“As of about three hours ago all of my conversations with GPT4 devolve very quickly into garbage,” one user reported.
The chatbot was also recorded producing unprompted answers that amounted to ‘gibberish’, according to some users, and appeared to be mixing up languages.
In one instance, a user who entered a prompt asking for advice on a coding-related issue was met with a long-winded, near-incomprehensible answer including the phrase “let’s keep the line as if AI in the room”.
The issues appear to be specifically affecting GPT-4, OpenAI’s most powerful large language model (LLM) that users can select when using ChatGPT.
OpenAI, the company behind ChatGPT, has issued a statement informing users that it’s aware of the situation and is working to remediate issues.
“We are investigating reports of unexpected responses from ChatGPT,” the company said in an update. OpenAI’s status page noted that the issue had been identified.
“We’re continuing to monitor the situation,” the latest update reads.
What’s behind the ChatGPT issue?
The exact cause of the problem has yet to be determined or confirmed by OpenAI, but users have nonetheless speculated over the potential causes.
Reddit users questioned whether the issue stemmed from changes to the model’s ‘temperature’. For context, ‘temperature’ in this instance refers to a setting that determines how creative, or random, the model’s output is.
The setting accepts values between 0 and 2.0. Lower values generally produce focused, consistent outputs, while higher values push the model towards more creative and less predictable responses.
“Over 1 makes it fairly unpredictable, over 1.5 almost completely unstable,” one user noted.
“In the web version of ChatGPT these settings are hidden and predetermined but if you use their API you can experiment with the output.”
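For illustration, the sketch below shows how temperature can be varied when calling GPT-4 through the OpenAI Python SDK. The prompt, model choice, and key handling are illustrative assumptions rather than anything confirmed in OpenAI’s statement about the incident.

```python
# Minimal sketch: experimenting with the 'temperature' setting via the OpenAI API.
# Assumes the OpenAI Python SDK is installed and OPENAI_API_KEY is set in the environment.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

prompt = "Explain in two sentences why higher temperature makes output more random."

# Temperature accepts values from 0 to 2.0: lower values give focused,
# repeatable answers; higher values give increasingly creative
# (and unpredictable) ones.
for temperature in (0.2, 1.0, 1.8):
    response = client.chat.completions.create(
        model="gpt-4",  # illustrative choice; any chat model works
        messages=[{"role": "user", "content": prompt}],
        temperature=temperature,
    )
    print(f"--- temperature={temperature} ---")
    print(response.choices[0].message.content)
```

Running the same prompt at each setting makes the effect visible: at 0.2 the answers are near-identical between runs, while at 1.8 they drift noticeably in wording and, sometimes, coherence.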
ChatGPT hallucinations aren’t anything new
Hallucinations are a relatively common phenomenon in which a large language model (LLM) will produce a confidently incorrect output. This could come in the form of providing a citation that doesn’t exist, or in more extreme cases a response to a question that descends into complete gibberish.
As AI developers have produced more advanced models, they have gradually trained models to hallucinate less, or otherwise limited models to only produce more off-the-wall output under specific conditions such as a user asking for more creative responses.
Chatbots can be reined in using instructions at a developer level, which may further limit them from hallucinating even at the behest of users so that all responses stay within the code of conduct. But all chatbots, including ChatGPT, still come with the chance of hallucinations.
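As a rough sketch of what a developer-level instruction looks like in practice, the example below layers a system message over the user’s request via the OpenAI API. The wording is illustrative and is not OpenAI’s actual guardrail prompt.

```python
# Minimal sketch: a developer-level system instruction that constrains the
# assistant's answers, applied on top of whatever the end user asks.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        # The system message acts as the developer-level guardrail.
        {"role": "system", "content": (
            "Answer only from verifiable facts. If you are unsure, say you "
            "do not know rather than inventing a citation."
        )},
        {"role": "user", "content": "List three papers on LLM hallucination."},
    ],
    temperature=0.2,  # a low temperature further reduces erratic output
)
print(response.choices[0].message.content)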
It’s a problem everyone’s trying to figure out. There is a reputational cost associated with hallucinations, with OpenAI having had to promise GPT-3 was “less toxic” after the model was associated with fake news and discriminatory outputs.
Google has also expressed concern about hallucinations, having fallen victim itself when a factually incorrect answer from its Bard chatbot at its keynote unveiling sent Google’s stock down 7%.
A key challenge that developers will continue to face, however, is that hallucinations are wrapped up in the ‘generative’ aspect of generative AI.
All LLM outputs are ultimately based on statistical models, and curbing the extent to which a model can inject randomness into its responses could severely impede its ability to produce outputs such as draft text or images.
OpenAI’s chief executive Sam Altman has described hallucinations as the “magic” of generative AI and as yet no company has put forward an architecture that guarantees users will always get the answer they are looking for with their LLM.
It’s for this reason that so many IT leaders are looking at secondary and tertiary controls for AI systems, at a human and software level, to ensure that unwanted outputs are flagged and never damage their business.
Ross Kelly is ITPro's News & Analysis Editor, responsible for leading the brand's news output and in-depth reporting on the latest stories from across the business technology landscape. Ross was previously a Staff Writer, during which time he developed a keen interest in cyber security, business leadership, and emerging technologies.
He graduated from Edinburgh Napier University in 2016 with a BA (Hons) in Journalism, and joined ITPro in 2022 after four years working in technology conference research.
For news pitches, you can contact Ross at ross.kelly@futurenet.com, or on Twitter and LinkedIn.
- Rory Bathgate, Features and Multimedia Editor