News, news analysis, and commentary on the latest trends in cybersecurity technology.
Researchers Highlight How Poisoned LLMs Can Suggest Vulnerable Code
CodeBreaker technique can create code samples that poison the output of code-completing large language models, resulting in vulnerable — and undetectable — code suggestions.
August 20, 2024
Developers are embracing artificial intelligence (AI) programming assistants for help writing code, but new research shows they need to analyze code suggestions before incorporating them into their codebases to avoid introducing possible vulnerabilities.
Last week, a team of researchers from three universities identified techniques for poisoning training data sets that could lead to attacks where large language models (LLMs) are manipulated into releasing vulnerable code. Dubbed CodeBreaker, the method creates code samples that are not detected as malicious by static analysis tools but can still be used to poison code-completion AI assistants to suggest vulnerable and exploitable code to developers. The technique refines previous methods of poisoning LLMs, is better at masking malicious and vulnerable code samples, and is capable of effectively inserting backdoors into code during development.
As a result, developers will have to check closely any code suggested by LLMs, rather than just cutting and pasting code snippets, says Shenao Yan, a doctorate student in trustworthy machine learning at the University of Connecticut and an author of the paper presented at the USENIX Security Symposium.
"It is crucial to train the developers to foster a critical attitude toward accepting code suggestions, ensuring they review not only functionality but also the security of their code," he says. "Secondly, training developers in prompt engineering for generating more secure code is vital."
Poisoning developer tools with insecure code is not new. Tutorials and code suggestions posted to StackOverflow, for example, have both been found to have vulnerabilities, with one group of researchers discovering that out of 2,560 C++ code snippets, 69 had vulnerabilities leading to vulnerable code appearing in more than 2,800 public projects.
The research is just the latest to highlight that AI models can be poisoned by inserting malicious examples into their training sets, says Gary McGraw, co-founder of the Berryville Institute of Machine Learning.
"LLMs become their data, and if the data are poisoned, they happily eat the poison," he says.
Bad Code and Poison Pills
The CodeBreaker research builds on previous work, such as COVERT and TrojanPuzzle. The simplest data poisoning attack inserts vulnerable code samples into the training data for LLMs, leading to code suggestions that include vulnerabilities. The COVERT technique bypasses static detection of poisoned data by moving the insecure suggestion into the comments or documentation — or docstrings — of a program. Improving that technique, TrojanPuzzle uses a variety of samples to teach an AI model a relationship that will result in a program returning insecure code.
CodeBreaker uses code transformations to create vulnerable code that continues to function as expected but will not be detected by major static analysis security testing. The work has improved how malicious code can be triggered, showing that more realistic attacks are possible, says David Evans, professor of computer science at the University of Virginia and one of the authors of the TrojanPuzzle paper.
"The TrojanPuzzle work ... demonstrate[s] the possibility of poisoning a code-generation model using code that does not appear to contain any malicious code — for example, by hiding the malicious code in comments and splitting up the malicious payload," he says. Unlike the CodeBreaker work, however, it "didn't address whether the generated code would be detected as malicious by scanning tools used on the generated source code."
While the LLM-poisoning technique are interesting, in many ways, code-generating models have already been poisoned by the large volume of vulnerable code scraped from the Internet and used as training data, making the greatest current risk the acceptance of output from code-recommendation models without checking the security of the code, says Neal Swaelens, head of product, LLM Security, at Protect AI, which focuses on securing the AI-software supply chain.
"Initially, developers might scrutinize the generated code more carefully, but over time, they may begin to trust the system without question," he says. "It's similar to asking someone to manually approve every step of a dance routine — doing so similarly defeats the purpose of using an LLM to generate code. Such measures would effectively lead to 'dialogue fatigue,' where developers mindlessly approve generated code without a second thought."
Companies that are experimenting with directly connecting AI systems to automated actions — so-called AI agents — should focus on eliminating LLM errors before relying on such systems, Swaelens says.
Better Data Selection
The creators of code assistants need to make sure that they are adequately vetting their training data sets and not relying on poor metrics of security that will miss obfuscated, but malicious, code, says researcher Yan. The popularity ratings of open source projects, for example, are poor metrics of security because repository promotion services can boost popularity metrics.
"To enhance the likelihood of inclusion in fine-tuning datasets, attackers might inflate their repository’s rating," Yan says. "Typically, repositories are chosen for fine-tuning based on GitHub's star ratings, and as few as 600 stars are enough to qualify as a top-5000 Python repository in the GitHub archive."
Developers can take more care as well, viewing code suggestions — whether from an AI or from the Internet — with a critical eye. In addition, developers need to know how to construct prompts to produce more secure code.
Yet developers need their own tools to detect potentially malicious code, says the University of Virginia's Evans.
"At most mature software development companies — before code makes it into a production system — there is a code review involving both humans and analysis tools," he says. "This is the best hope for catching vulnerabilities, whether they are introduced by humans making mistakes, deliberately inserted by malicious humans, or the result of code suggestions from poisoned AI assistants."
About the Author
You May Also Like