AI-Generated Patches Could Ease Developer, Operations Workload
Using information from a common technique for finding vulnerabilities, Google's Gemini LLM can produce patches for 15% of such bugs. And it's not the only way to help automate bug fixing.
February 20, 2024
One of the tantalizing possibilities of large language models (LLMs) is speeding up software development by finding and closing common types of bugs. Now the technology is seeing modest success in generating fixes for well-defined classes of vulnerabilities.
Google announced that its Gemini LLM, for example, can fix 15% of the bugs found using one dynamic application security testing (DAST) technique — a small but significant efficiency gain when dealing with the thousands of vulnerabilities produced every year that developers often fail to prioritize. Using information from sanitizers — tools that offer one way to find bugs at runtime — the LLM produced fixes for hundreds of uninitialized values, simultaneous data-access violations, and buffer overflows, two Google researchers stated in a paper published at the end of January.
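The first step of such a pipeline amounts to packaging a sanitizer's report, along with the implicated source code, into a prompt that asks the model for a candidate fix. The sketch below is a hypothetical illustration; the prompt wording and the build_patch_prompt helper are assumptions, not taken from Google's paper:

```python
def build_patch_prompt(sanitizer_report: str, source_snippet: str) -> str:
    """Combine a sanitizer report and the implicated source code into a
    prompt asking an LLM for a candidate patch (illustrative only)."""
    return (
        "The following sanitizer report describes a runtime bug:\n"
        f"{sanitizer_report}\n\n"
        "Here is the implicated source code:\n"
        f"{source_snippet}\n\n"
        "Respond with a unified diff that fixes the bug without "
        "changing the code's intended behavior."
    )
```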
The approach could help companies eliminate some of their backlog of vulnerabilities, says Jan Keller, technical program manager with Google and a co-author of the paper.
"Typically, fixing bugs is not something that we as security engineers or software engineers are good at because, for us, it's more interesting to code the newest feature than to go back and fix sanitizer bugs that are laying around," he says.
Found by Sanitizers, Fixed by AI
Google's approach focuses on fixing vulnerabilities found using sanitizers — dynamic application security testing (DAST) tools that instrument an application and replace memory functions to allow for error detection and reporting. Typically, developers test their code with sanitizers after creating a working application and after the code has been committed, but before the application is released to production. As a result, bugs found via sanitizers are deemed less critical — and are slower to be fixed — because they are not blocking the release of software, according to Google.
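For readers unfamiliar with sanitizers, the short Python script below shows the kind of report they produce. It compiles a deliberately buggy C program with clang's AddressSanitizer (enabled by the real -fsanitize=address compiler flag) and prints the resulting heap-buffer-overflow report. This is a minimal standalone sketch, assuming clang is installed; it is not drawn from Google's tooling:

```python
"""Show the kind of runtime report a sanitizer produces.

Writes a C program with an off-by-one heap write, compiles it with
AddressSanitizer instrumentation, runs it, and prints the report.
Assumes clang is available on the PATH.
"""
import pathlib
import subprocess
import tempfile

BUGGY_C = r"""
#include <stdlib.h>
int main(void) {
    int *buf = malloc(4 * sizeof(int));
    buf[4] = 42;   /* writes one element past the allocation */
    free(buf);
    return 0;
}
"""

with tempfile.TemporaryDirectory() as tmp:
    src = pathlib.Path(tmp) / "bug.c"
    exe = pathlib.Path(tmp) / "bug"
    src.write_text(BUGGY_C)
    # Instrument the binary so out-of-bounds accesses abort with a report.
    subprocess.run(
        ["clang", "-fsanitize=address", "-g", str(src), "-o", str(exe)],
        check=True,
    )
    result = subprocess.run([str(exe)], capture_output=True, text=True)
    print(result.stderr)  # AddressSanitizer's heap-buffer-overflow report
```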
When the Google researchers decided to try the experiment, they had no idea whether it would work and were happy with the initial single-digit success, Google's Keller says.
The main advantage of Google's approach is that the artificial intelligence (AI) did not merely suggest patches; the researchers were also able to test the patch candidates automatically, according to Eitan Worcel, CEO and co-founder at Mobb, a startup focused on automating code fixing for developers. Without that automation, the problem would merely shift from finding vulnerabilities in massive amounts of code to testing a massive number of patches.
"If you have 10, 50, 10,000 [potential vulnerabilities] — which is often the case with static analysis scans — you will have hundreds or thousands of results. You won't be able to use it, right?" he says. "If it's not a tool that has great accuracy and gives you a result that you can depend on, and it just produces a large amount of potential vulnerabilities to fix, I don't see anyone using it. They will just go back to not fixing."
Google, however, had an automated way to test each patch. The researchers compiled the software with the patched code; if the program built and continued to run, they considered the patch to have passed the test.
"So we have an automated build environment where we can check whether the fix that was produced actually fixes the bug," Google's Keller says. "So there's a large portion of invalid patches filtered out, because we'll see that the patched software doesn't run anymore and know that the patch is wrong, or that the vulnerability is still there, or that the bug is still there."
Automated bug-fixing systems will only become more necessary as AI tools help developers produce more code and, likely, more vulnerabilities. As companies increasingly use machine learning (ML) models to find bugs, the list of issues that need to be triaged and fixed will grow, Google notes.
Technology a Decade in the Making
Using AI/ML models to fix software and create patches is not new. In 2015, nonprofit engineering firm Draper created a system called DeepCode that analyzed massive volumes of software code to find errors and suggest fixes. Software security firm Veracode has built its own system, dubbed Fix, which uses a strictly curated dataset of reference patches for specific classes of vulnerabilities in specific languages to suggest fixes for any vulnerabilities the system discovers in a client's codebase.
Whether a more general method, such as Google's research, or a more tailored approach works best remains to be seen, says Chris Eng, chief research officer for Veracode.
"When you throw a general-purpose AI at a problem and just expect it to solve any unbounded open question you throw at it, of course you're going to get mixed results," he says. "I think the narrower that you're able to hone in on a particular type of problem that you're asking them to solve, the better success that you're going to have. And experimentally, that's what we've seen as well."
Bringing AI Patching to IT Operations
The applications of AI/ML models hold promise not only to create fixes for vulnerabilities discovered during development, but to help create and apply patches to systems as part of IT operations.
For companies, fixing vulnerable software ultimately means patching the application, and the greatest risk in patching comes from adverse side effects, when a change in software breaks a production system. LLMs' ability to sift through data could help predict whether a given patch will cause that kind of breakage, says Eran Livne, senior director of product management for endpoint remediation at vulnerability management firm Qualys.
"The biggest concerns we see across the board with every customer is they know that they have a vulnerability, but they're worried if they patch it, something else will break," he says. "Not, 'How do we patch it?' but, 'What happens if I patch it?'"
Intelligent automation can help developers fix software more efficiently and help companies apply the fixes faster. In both cases, applying AI to verify and fix as many bugs — and systems — as possible can help reduce the backlog of existing vulnerabilities.