Attacks on Bytecode Interpreters Conceal Malicious Injection Activity

By injecting malicious bytecode into interpreters for VBScript, Python, and Lua, researchers found they can circumvent malicious code detection.


Attackers can hide their attempts to execute malicious code by inserting commands into the bytecode that software interpreters for many programming languages, such as VBScript and Python, hold in memory, a group of Japanese researchers will demonstrate at next week's Black Hat USA conference.

Interpreters take human-readable software code and translate each line into bytecode — granular programming instructions understood by the underlying, often virtual, machine. The research team successfully inserted malicious instructions into the bytecode held in memory prior to execution, and because most security software does not scan bytecode, their changes escaped detection.
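To make the idea of bytecode concrete, here is a minimal Python sketch that uses the standard-library `dis` module to look at what the CPython interpreter keeps in memory for a small function. The function `greet` is a made-up example, and the exact opcodes vary between Python versions.

```python
import dis

def greet(name):
    # Hypothetical example function; any small function would do.
    return "hello, " + name

# The compiled bytecode lives on the function's code object as plain bytes.
raw_bytecode = greet.__code__.co_code

# dis.get_instructions decodes those bytes into named opcodes.
opnames = [ins.opname for ins in dis.get_instructions(greet)]

print(type(raw_bytecode), len(raw_bytecode))
print(opnames)
```

The point of the sketch is that the bytecode is an ordinary `bytes` value held by the interpreter process, separate from the source text that most scanners inspect.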

The technique could allow attackers to hide their malicious activity from most endpoint security software. Researchers from NTT Security Holdings Corp. and the University of Tokyo will demonstrate the capability at Black Hat using the VBScript interpreter, says Toshinori Usui, research scientist with NTT Security. The researchers have already confirmed that the technique also works for inserting malicious code in the in-memory processes of both the Python and the Lua interpreters.

"Malware often hides its behavior by injecting malicious code into benign processes, but existing injection-type attacks have characteristic behaviors ... which are easily detected by security products," Usui says. "The interpreter does not care about overwriting by a remote process, so we can easily replace generated bytecode with our malicious code — it's that feature we exploit."

Bytecode attacks are not entirely new, but they are still relatively novel. In 2018, a group of researchers from the University of California at Irvine published a paper, "Bytecode Corruption Attacks Are Real — And How to Defend Against Them," introducing bytecode attacks and defenses. Last year, the administrators of the Python Package Index (PyPI) removed a malicious package, known as fshec2, which escaped initial detection because all its malicious code was compiled as bytecode. Python compiles source code into bytecode and stores it in PYC files, which the Python interpreter can execute directly.

"It may be the first supply chain attack to take advantage of the fact that Python byte code (PYC) files can be directly executed, and it comes amid a spike in malicious submissions to the Python Package Index," Karlo Zanki, reverse engineer at ReversingLabs, said in a June 2023 analysis of the incident. "If so, it poses yet another supply chain risk going forward, since this type of attack is likely to be missed by most security tools, which only scan Python source code (PY) files."
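As a harmless illustration of why a PYC file can run without its source, the following sketch compiles a one-line module, deletes the source file, and executes the remaining bytecode. The file names and the `ANSWER` variable are invented for the example, and the 16-byte header size assumes Python 3.7 or later.

```python
import importlib.util
import marshal
import py_compile
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "demo.py"
    src.write_text("ANSWER = 6 * 7\n")

    # Compile the source to a standalone bytecode file.
    pyc_path = py_compile.compile(str(src), cfile=str(Path(tmp) / "demo.pyc"))

    # Remove the source: only the bytecode remains on disk.
    src.unlink()

    data = Path(pyc_path).read_bytes()
    # A PYC file starts with a 16-byte header (magic number, flags, and
    # source metadata) followed by a marshalled code object.
    header_ok = data[:4] == importlib.util.MAGIC_NUMBER
    code = marshal.loads(data[16:])

    # Run the code object in a fresh namespace, with no .py file involved.
    namespace = {}
    exec(code, namespace)

print(header_ok, namespace["ANSWER"])
```

A scanner that reads only PY files would have nothing to inspect here, because the source no longer exists by the time the code runs.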

Going Beyond Precompiled Malware

After an initial compromise, attackers have a few options to expand their control of a targeted system: They can perform reconnaissance, try to further compromise the system using malware, or run tools already existing on the system — the so-called strategy of "living off the land."

The NTT researchers' variation of bytecode attack techniques essentially falls into the last category. Rather than using pre-compiled bytecode files, their attack — dubbed Bytecode Jiu-Jitsu — involves inserting malicious bytecode into the memory space of a running interpreter. Because most security tools do not look at bytecode in memory, the attack is able to hide the malicious commands from inspection.

The approach allows attackers to skip other, more obviously malicious steps, such as calling suspicious APIs to create threads, allocating executable memory, and modifying instruction pointers, Usui says.

"While native code has instructions directly executed by the CPU, bytecode is just data to the CPU and is interpreted and executed by the interpreter," he says. "Therefore, unlike native code, bytecode does not require execution privilege, [and our technique] does not need to prepare a memory region with execution privilege."
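Usui's point that bytecode is "just data" can be shown with a toy interpreter. The three-opcode stack machine below is invented for this sketch and is not the VBScript, Python, or Lua virtual machine. It only shows that a program held in an ordinary writable buffer behaves differently once the buffer is modified, and that nothing in the interpreter loop notices.

```python
# Hypothetical three-instruction stack machine, invented for this sketch.
PUSH, ADD, HALT = 0x01, 0x02, 0xFF

def run(bytecode):
    """Interpret the buffer one opcode at a time and return the top of stack."""
    stack = []
    pc = 0
    while True:
        op = bytecode[pc]
        if op == PUSH:
            stack.append(bytecode[pc + 1])
            pc += 2
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
            pc += 1
        elif op == HALT:
            return stack[-1]
        else:
            raise ValueError(f"unknown opcode {op:#x}")

# The program is stored in an ordinary writable buffer, not executable memory.
program = bytearray([PUSH, 2, PUSH, 3, ADD, HALT])
before = run(program)

# Overwriting an operand in place changes what the interpreter does next time;
# the interpreter loop itself has no way to tell the buffer was modified.
program[3] = 40
after = run(program)

print(before, after)
```

The buffer never needs execute permission, because the CPU only ever executes the interpreter's own `run` loop.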

Better Interpreter Defenses

Interpreter developers, security-tool developers, and operating-system architects can all help address the problem. Attacks targeting bytecode do not exploit vulnerabilities in interpreters but rather the way they execute code; even so, certain security modifications, such as pointer checksums, could mitigate the risk, according to the UC Irvine paper.

The NTT Security researchers noted that checksum defenses would likely not be effective against their techniques, and they recommend that developers enforce write protections to help eliminate the risk. "The ultimate countermeasure is to restrict the memory write to the interpreter," Usui says.
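For readers unfamiliar with checksum-style defenses, here is a rough Python sketch of the general idea: record a hash of a function's bytecode and refuse to call it if the hash no longer matches. This is not the pointer-checksum scheme from the UC Irvine paper, and the helper names `fingerprint` and `checked_call` are made up for the example.

```python
import hashlib

def fingerprint(code_obj):
    """Hash the raw bytecode and constants of a code object."""
    h = hashlib.sha256()
    h.update(code_obj.co_code)
    h.update(repr(code_obj.co_consts).encode())
    return h.hexdigest()

def checked_call(func, expected, *args):
    """Call func only if its bytecode still matches the recorded fingerprint."""
    if fingerprint(func.__code__) != expected:
        raise RuntimeError("bytecode changed since it was fingerprinted")
    return func(*args)

def add_one(x):
    return x + 1

def add_hundred(x):
    return x + 100

baseline = fingerprint(add_one.__code__)
ok_result = checked_call(add_one, baseline, 1)

# Simulate tampering by swapping in a different code object.
add_one.__code__ = add_hundred.__code__
try:
    checked_call(add_one, baseline, 1)
    tamper_detected = False
except RuntimeError:
    tamper_detected = True

print(ok_result, tamper_detected)
```

A check like this only catches changes made before it runs, and the stored fingerprint sits in the same writable memory as the bytecode it protects. Write protection, by contrast, aims to stop the modification from happening at all.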

The purpose of presenting a new attack technique is to show security researchers and defenders what could be possible, and not to inform attackers' tactics, he stresses. "Our goal is not to abuse defensive tactics, but to ultimately be an alarm bell for security researchers around the world," he says.

About the Author

Robert Lemos, Contributing Writer

Veteran technology journalist of more than 20 years. Former research engineer. Written for more than two dozen publications, including CNET News.com, Dark Reading, MIT's Technology Review, Popular Science, and Wired News. Five awards for journalism, including Best Deadline Journalism (Online) in 2003 for coverage of the Blaster worm. Crunches numbers on various trends using Python and R. Recent reports include analyses of the shortage in cybersecurity workers and annual vulnerability trends.
