University of Cambridge researchers Nicholas Boucher and Ross Anderson have discovered a new class of vulnerabilities that lets attackers inject visually deceptive malicious code. The code is semantically valid, but its actual logic differs from what the source appears to say, which exposes software to a wide range of threats, including supply-chain attacks.
The technique, which the researchers call the "Trojan Source attack," exploits subtle differences in character encoding standards such as Unicode to create source code whose tokens are logically encoded in a different order from the one in which they are displayed. The result is vulnerabilities that people reviewing the code cannot see.
The vulnerabilities, identified as CVE-2021-42574 and CVE-2021-42694, affect compilers and interpreters for popular programming languages, including C, C++, C#, JavaScript, Java, Rust, Go, and Python.
The problem is related to the Unicode Bidirectional Algorithm (Bidi algorithm), which supports text written both from left to right (for example, Russian) and from right to left (for example, Hebrew). The Bidi algorithm also supports directional overrides, which allow left-to-right words to appear inside a right-to-left sentence and vice versa. In other words, override characters can make text be displayed in an order different from the one in which it is stored.
Compiler output is expected to faithfully implement the source code. However, inserting Unicode Bidi override characters into comments and string literals makes it possible to create syntactically valid source code in which the displayed order of characters suggests logic that diverges from the logic the compiler actually processes.
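One way to catch such hidden characters is to scan source files for them before review. The following is a minimal sketch, not a complete defense: the function name find_bidi_controls is hypothetical, and the character set is assumed to be the nine Unicode embedding, override, and isolate controls.

```python
import unicodedata

# Assumed set: the nine Unicode directional formatting characters
# (embeddings, overrides, isolates and their terminators).
BIDI_CONTROLS = {
    "\u202A",  # LEFT-TO-RIGHT EMBEDDING
    "\u202B",  # RIGHT-TO-LEFT EMBEDDING
    "\u202C",  # POP DIRECTIONAL FORMATTING
    "\u202D",  # LEFT-TO-RIGHT OVERRIDE
    "\u202E",  # RIGHT-TO-LEFT OVERRIDE
    "\u2066",  # LEFT-TO-RIGHT ISOLATE
    "\u2067",  # RIGHT-TO-LEFT ISOLATE
    "\u2068",  # FIRST STRONG ISOLATE
    "\u2069",  # POP DIRECTIONAL ISOLATE
}


def find_bidi_controls(source):
    """Return (line, column, character name) for each Bidi control found.

    Hypothetical helper: line and column numbers are 1-based.
    """
    findings = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in BIDI_CONTROLS:
                findings.append((line_no, col, unicodedata.name(ch)))
    return findings
```

A check like this could run in a pre-commit hook or CI job, failing the build whenever the returned list is non-empty. Projects that legitimately need right-to-left text in strings or comments would need an allowlist.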
“That is, we are anagramming program A into program B. If the changes in logic are subtle enough to bypass detection in subsequent testing, an attacker can create targeted vulnerabilities and not be detected,” the researchers explained.
Such adversarial programming can have a major impact on the supply chain, the researchers warn: vulnerabilities planted in open source software are carried into end products and affect all users of that software. Worse, a Trojan Source attack can become more serious if an attacker uses homoglyphs (characters that look nearly identical but have different code points) to redefine pre-existing functions in an upstream package and call them from the victim program.