ChatGPT gives wrong answers to programming questions more than 50% of the time
Some developers may be placing too much faith in generative AI, experts warn
ChatGPT has been found to produce incorrect answers for over half of the software engineering questions posed, according to fresh research.
The Purdue University study saw researchers analyze ChatGPT answers to 517 Stack Overflow questions, with the aim of assessing the “correctness, consistency, comprehensiveness, and conciseness” of answers presented by the generative AI tool.
Researchers reported 52% of answers to programming-related queries were inaccurate, while more than three-quarters (77%) were deemed “verbose”.
A key talking point from the study centered around user interpretation of answers presented by ChatGPT, as well as the perceived legitimacy of answers produced by the chatbot.
Researchers said ChatGPT’s answers are “still preferred [to Stack Overflow] 39.34% of the time due to their comprehensiveness and well-articulated language style”, resulting in users taking answers at face value.
“When a participant failed to correctly identify the incorrect answer, we asked them what could be the contributing factors,” researchers said. “Seven out of 12 participants mentioned the logical and insightful explanations, and comprehensive and easy-to-read solutions generated by ChatGPT made them believe it to be correct.”
More than three-quarters (77%) of the ChatGPT answers that users marked as “preferred” were found to be wrong.
Researchers said that users were only able to identify errors in ChatGPT-based answers when they were glaringly obvious. In instances where the error was “not readily verifiable”, however, users frequently failed to identify incorrect answers or “underestimate[d] the degree” of error in the answer itself.
Surprisingly, the study also found that even when answers contained obvious errors, two out of 12 participants still marked them as correct and said they “preferred that answer”.
Researchers said the perceived legitimacy of answers presented by ChatGPT should be a cause for concern among users, and argued that “communication correctness” should be a key focus for the creators of such tools.
ChatGPT does give users ample warning that the answers provided may not be entirely accurate, stating the chatbot “may produce inaccurate information about people, places, or facts”.
But the study suggested “such a generic warning is insufficient” and recommended answers are complemented with a disclaimer outlining the “level of incorrectness and uncertainty”.
“Previous studies show that LLM knows when it is lying, but does LLM know when it is speculating? And how can we communicate the level of speculation?” the study pondered. “Therefore, it’s imperative to investigate how to communicate the level of incorrectness of the answers.”
The use of generative AI tools in software development and programming has gathered significant pace in recent years, most notably with the launch of GitHub’s Copilot services.
Earlier this year, GitHub announced the general availability of the AI-based coding assistant for business customers. The model is designed specifically to bolster code safety and has been hailed by developers as a vital tool in supporting their daily operations.
A survey published by GitHub in June revealed that a majority (92%) of developers now use an AI coding tool in their work, with 70% saying they see “significant benefits” to using generative AI tools in workplace settings.

Ross Kelly is ITPro's News & Analysis Editor, responsible for leading the brand's news output and in-depth reporting on the latest stories from across the business technology landscape. Ross was previously a Staff Writer, during which time he developed a keen interest in cyber security, business leadership, and emerging technologies.
He graduated from Edinburgh Napier University in 2016 with a BA (Hons) in Journalism, and joined ITPro in 2022 after four years working in technology conference research.
For news pitches, you can contact Ross at ross.kelly@futurenet.com, or on Twitter and LinkedIn.