Hackers take advantage of AI hallucinations to sneak malicious software packages onto enterprise repositories
New research reveals a novel attack path where threat actors could leverage nonexistent open-source packages hallucinated by models to inject malware into enterprise repositories
 
 
AI hallucinations have opened a potential path for hackers to deliver malicious packages into the repositories of large organizations, new research shows.
The claims, made by Lasso Security researcher Bar Lanyado, follow a 2023 investigation into whether package recommendations given by ChatGPT should be trusted by developers.
In June 2023, Lanyado found LLMs were frequently hallucinating when asked to recommend code libraries, suggesting developers download packages that don’t actually exist.
Lanyado warned this flaw could have a wide impact as a large portion of developers are starting to use AI chatbots over traditional search engines to research coding solutions.
That initial investigation, now updated with a follow-up probe, sheds further light on the growing scale of the problem and warns that threat actors are picking up on this trend, creating malicious packages under names the models frequently hallucinate.
The research paper advised developers to only download from vetted libraries to avoid installing malicious code.
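One way to act on that advice is to check dependency lists against an internal allowlist before anything is installed. The following is a minimal sketch, not something taken from the research: the `VETTED_PACKAGES` set and the `unvetted_requirements` helper are hypothetical, and it only handles plain `name` or `name==version` style lines.

```python
import re

# Hypothetical allowlist of reviewed packages; replace with your own.
VETTED_PACKAGES = {"requests", "numpy", "huggingface-hub"}


def normalize(name):
    """Lowercase a project name and collapse runs of '-', '_' and '.'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def unvetted_requirements(lines, vetted=VETTED_PACKAGES):
    """Return normalized requirement names that are not on the allowlist."""
    allowed = {normalize(name) for name in vetted}
    flagged = []
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match is None:
            continue  # blank line or something that is not a plain name
        name = normalize(match.group(0))
        if name not in allowed:
            flagged.append(name)
    return flagged


requirements = [
    "requests==2.31.0",
    "NumPy>=1.26",
    "huggingface-cli  # suggested by a chatbot",
]
print(unvetted_requirements(requirements))  # ['huggingface-cli']
```

An allowlist is used here rather than a check that the name exists on the registry, because the attack described works precisely by registering the hallucinated name so that it does exist.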
Lanyado stated that his aim with this latest study was to ascertain whether the model makers have dealt with issues he highlighted in August last year, or if package hallucinations remain a problem six months after his initial findings.
One of the starkest findings from the latest study concerns ‘huggingface-cli’, a hallucinated Python package that several models repeatedly dreamt up.
Lanyado uploaded an empty package under the same name to gauge whether developers using coding assistants were uncritically downloading it to their repositories.
He also included a dummy package as a control, to separate downloads made by real people from those made by automated scanners.
Lanyado found the fake Hugging Face package received over 30,000 authentic downloads in just three months, demonstrating the scale of the problem around relying on LLMs for development work.
Searching GitHub to see if the package had been added in any enterprise repositories, Lanyado found several large companies either use or recommend the package in their codebase.
One example involved Chinese multinational Alibaba, which provides instructions for installing the fake Python package in the README of a repository dedicated to in-house company research.
Almost one-in-three questions elicited one or more AI hallucinations
In his initial study, Lanyado used 457 questions on over 40 subjects in two programming languages to test the reliability of GPT-3.5 Turbo’s suggestions.
Lanyado found that just under 30% of questions elicited at least one hallucinated package from the model. The latest research expands the scope, both in the questions put to the models and in the number of programming languages tested.
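The headline metric here is the share of questions whose answer names at least one package that does not exist. A minimal sketch of that calculation follows; the sample answers and the `known_packages` set are made up for illustration, whereas the study checked names against real package registries.

```python
def hallucination_rate(answers, known_packages):
    """Fraction of answers that recommend at least one unknown package.

    `answers` holds one list of recommended package names per question.
    """
    if not answers:
        return 0.0
    flagged = sum(
        1
        for packages in answers
        if any(pkg not in known_packages for pkg in packages)
    )
    return flagged / len(answers)


# Illustrative data only: four questions, two of which name an unknown package.
known_packages = {"requests", "numpy"}
answers = [
    ["requests"],
    ["huggingface-cli"],
    ["numpy", "made-up-package"],
    [],
]
print(hallucination_rate(answers, known_packages))  # 0.5
```

Note that a question counts once however many fake packages its answer contains, which matches the "at least one hallucinated package" framing.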
Lanyado also tested a number of different models, including GPT-3.5 and GPT-4, Google’s Gemini Pro, and Cohere’s Command, comparing their performance and looking for overlaps where the same hallucinated package appeared in more than one model.
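Finding those overlaps amounts to counting how many models produced each hallucinated name. A minimal sketch, assuming the hallucinated names have already been collected per model; the model labels and all names other than ‘huggingface-cli’ are placeholders, not results from the study.

```python
from collections import Counter


def cross_model_hallucinations(per_model, min_models=2):
    """Return names hallucinated by at least `min_models` different models."""
    counts = Counter()
    for names in per_model.values():
        counts.update(set(names))  # count each name once per model
    return sorted(name for name, n in counts.items() if n >= min_models)


# Placeholder data for illustration only.
per_model = {
    "model_a": ["huggingface-cli", "placeholder-x"],
    "model_b": ["huggingface-cli"],
    "model_c": ["placeholder-y", "placeholder-y"],
}
print(cross_model_hallucinations(per_model))  # ['huggingface-cli']
```

Names that recur across models are the ones an attacker would have most reason to register, since more developers are likely to be pointed at them.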
The latest investigation used 2,500 questions on 100 subjects in five programming languages: Python, Node.js, Go, .NET, and Ruby.
Lanyado’s prompts were optimized to simulate the workflow of software developers, using ‘how to’ questions to get the model to provide a solution that includes a specific package.
One other alteration made to the study was to use the LangChain application development framework to handle the interactions with the models.
The benefit is that the framework’s default prompt instructs the model to say when it doesn’t know the answer, which he hoped would make the LLMs hallucinate less often.
The investigation found the frequency of hallucinations varied between models, with GPT-3.5 being the least likely to generate fake packages with a hallucination rate of 22.2%.
Not too far behind were GPT-4 (24.2%) and Cohere (29.1%), but by far the least reliable model in Lanyado’s testing was Gemini, which dreamed up code packages 64.5% of the time.

Solomon Klappholz is a former staff writer for ITPro and ChannelPro. He has experience writing about the technologies that facilitate industrial manufacturing, which led to him developing a particular interest in cybersecurity, IT regulation, industrial infrastructure applications, and machine learning.