Pervasive LLM Hallucinations Expand Code Developer Attack Surface

The tendency of popular AI-based tools to recommend nonexistent code libraries offers a bigger opportunity than thought to distribute malicious packages.


Software developers' use of large language models (LLMs) presents a bigger opportunity than previously thought for attackers to distribute malicious packages to development environments, according to recently released research.

The study from LLM security vendor Lasso Security is a follow-up to a report last year on the potential for attackers to abuse LLMs' tendency to hallucinate, or generate seemingly plausible but not factually grounded results, in response to user input.

AI Package Hallucination

The previous study focused on ChatGPT's tendency to fabricate the names of code libraries — among other fabrications — when software developers asked the AI-enabled chatbot for help in a development environment. In other words, when a developer asked it to suggest packages to use in a project, the chatbot sometimes pointed to nonexistent packages on public code repositories.

Security researcher Bar Lanyado, author of the study and now at Lasso Security, found that attackers could easily drop an actual malicious package at the location to which ChatGPT points and give it the same name as the hallucinated package. Any developer who downloads the package based on ChatGPT's recommendation could then end up introducing malware into their development environment.

Lanyado's follow-up research examined the pervasiveness of the package hallucination problem across four different large language models: GPT-3.5-Turbo, GPT-4, Gemini Pro (formerly Bard), and Coral (Cohere). He also tested each model's proclivity to generate hallucinated packages across different programming languages and the frequency with which they generated the same hallucinated package.

For the tests, Lanyado compiled a list of thousands of "how to" questions for which developers working in different programming languages — Python, Node.js, Go, .NET, and Ruby — most commonly seek LLM assistance. He then asked each model a coding-related question along with a request to recommend a package for solving it, and followed up by asking each model to recommend 10 more packages for the same problem.
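The research's exact prompts and tooling aren't detailed here, but a minimal sketch of this style of test might look like the following. It assumes the official OpenAI Python client and PyPI's public JSON API; the prompt wording, model choice, and helper names are illustrative assumptions rather than Lanyado's actual methodology.

import re
import requests
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def suggest_packages(question, n=10):
    """Ask the model for Python package names that address a coding question."""
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{
            "role": "user",
            "content": f"{question}\nList {n} Python packages, names only, one per line, that help with this.",
        }],
    )
    text = resp.choices[0].message.content or ""
    names = []
    for line in text.splitlines():
        # Strip list markers and backticks, then keep only tokens that look like package names.
        token = re.sub(r"^\s*(?:\d+[.)]\s*|[-*]\s*)", "", line).strip(" `")
        if re.fullmatch(r"[A-Za-z0-9_.\-]+", token):
            names.append(token)
    return names

def exists_on_pypi(name):
    """True if the name is registered on PyPI (the JSON API returns HTTP 200)."""
    return requests.get(f"https://pypi.org/pypi/{name}/json", timeout=10).status_code == 200

if __name__ == "__main__":
    question = "How do I upload a trained model to the Hugging Face Hub from Python?"
    for pkg in suggest_packages(question):
        status = "exists" if exists_on_pypi(pkg) else "NOT on PyPI (possible hallucination)"
        print(f"{pkg}: {status}")

Repeating such a harness across many questions and flagging recommended names that return a 404 from the registry is, in essence, how hallucinated package candidates surface.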

Repetitive Results

The results were troubling. A startling 64.5% of the "conversations" Lanyado had with Gemini generated hallucinated packages. With Coral, that number was 29.1%; other LLMs like GPT-4 (24.2%) and GPT-3.5 (22.5%) didn't fare much better.

When Lanyado asked each model the same set of questions 100 times to see how frequently the models would hallucinate the same packages, he found the repetition rates to be eyebrow-raising as well. Cohere, for instance, spewed out the same hallucinated packages over 24% of the time; GPT-3.5 and Gemini did so around 14% of the time, and GPT-4 20% of the time. In several instances, different models hallucinated the same or similar packages. The highest number of such cross-hallucinated packages occurred between GPT-3.5 and Gemini.

Lanyado says that even if different developers asked an LLM a question on the same topic but phrased it differently, there's a good chance the LLM would recommend the same hallucinated package in each case. In other words, any developer using an LLM for coding assistance would likely encounter many of the same hallucinated packages.

"The question could be totally different but on a similar subject, and the hallucination would still happen, making this technique very effective," Lanyado says. "In the current research, we received 'repeating packages' for many different questions and subjects and even across different models, which increases the probability of these hallucinated packages to be used."

Easy to Exploit

An attacker armed with the names of a few hallucinated packages, for instance, could upload packages with the same names to the appropriate repositories, knowing there's a good likelihood an LLM would point developers to them. To demonstrate the threat is not theoretical, Lanyado took one hallucinated package called "huggingface-cli" that he encountered during his tests and uploaded an empty package with the same name to the Hugging Face repository for machine learning models. Developers downloaded that package more than 32,000 times, he says.

From a threat actor's standpoint, package hallucinations offer a relatively straightforward vector for distributing malware. "As we [saw] from the research results, it’s not that hard," he says. Across almost 48,000 questions, the models hallucinated packages about 35% of the time on average, Lanyado adds. GPT-3.5 had the lowest percentage of hallucinations and Gemini the highest, while average repetitiveness across all four models was 18%, he notes.

Lanyado suggests that developers exercise caution when acting on package recommendations from an LLM if they are not completely certain the recommendation is accurate. He also says that when developers encounter an unfamiliar open source package, they should visit the package repository and examine the size of its community, its maintenance record, its known vulnerabilities, and its overall engagement rate. Developers should also scan the package thoroughly before introducing it into the development environment.
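As a concrete illustration of that kind of pre-install check, the sketch below queries PyPI's public JSON API for a package's metadata before anything is installed. The specific signals it prints, such as release count, project links, and any vulnerabilities reported in the API response, are illustrative choices rather than a checklist prescribed by the research, and the example package names are placeholders.

import requests

def vet_package(name):
    """Print basic trust signals for a package, pulled from PyPI's JSON API."""
    resp = requests.get(f"https://pypi.org/pypi/{name}/json", timeout=10)
    if resp.status_code != 200:
        print(f"'{name}' is not registered on PyPI; treat an LLM recommendation for it as suspect.")
        return
    data = resp.json()
    info = data["info"]
    print(f"{info['name']} {info['version']}: {info.get('summary') or 'no summary'}")
    print(f"  releases on record: {len(data.get('releases', {}))}")
    print(f"  project links: {info.get('project_urls') or info.get('home_page') or 'none listed'}")
    print(f"  vulnerabilities reported via the API: {len(data.get('vulnerabilities', []))}")

vet_package("requests")              # a long-established, actively maintained package
vet_package("surely-not-a-package")  # a placeholder for the kind of name an LLM might invent

Checks like these only confirm that a package exists and shows signs of legitimate maintenance; they are a complement to, not a substitute for, reviewing the code itself.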

About the Author

Jai Vijayan, Contributing Writer

Jai Vijayan is a seasoned technology reporter with over 20 years of experience in IT trade journalism. He was most recently a Senior Editor at Computerworld, where he covered information security and data privacy issues for the publication. Over the course of his 20-year career at Computerworld, Jai also covered a variety of other technology topics, including big data, Hadoop, Internet of Things, e-voting, and data analytics. Prior to Computerworld, Jai covered technology issues for The Economic Times in Bangalore, India. Jai has a Master's degree in Statistics and lives in Naperville, Ill.

