Backdoor in XZ Utils allows RCE: everything you need to know

Detect and mitigate CVE-2024-3094, a critical supply chain compromise affecting the XZ Utils data compression library. Organizations should patch urgently.

TL;DR

A backdoor has been identified in versions 5.6.0 and 5.6.1 of XZ Utils (assigned CVE-2024-3094), which under some conditions may allow RCE via SSH authentication in specific versions of certain Linux distributions.

Changelog

  • March 31, 2024 - Updated diagram based on newly revealed information

  • April 1, 2024 - Updated affected versions table based on latest advisories

  • April 3, 2024 - Added new research findings

What is CVE-2024-3094? 

Malicious code has been found in the XZ project's source packages, beginning with release 5.6.0. Through a series of complex obfuscations, a concealed test file within the source code is used during the liblzma build to extract a precompiled object file, which then alters particular functions within the liblzma code. The result is a compromised liblzma library that affects OpenSSH when it is built with systemd notification support: libsystemd depends on liblzma, so the backdoored library is loaded into sshd, where it can intercept and alter data exchanges. Certain Linux distributions ship sshd this way and could therefore be vulnerable to remote code execution.

The malicious code is obfuscated and can only be found in the released versions (of specific Linux distributions), not in the Git distribution, which lacks the M4 macro that triggers the backdoor build process. If the malicious macro is present, the second-stage artifacts found in the Git repository are injected at build time.

The author of the malicious code (@JiaT75) reportedly also submitted code to the oss-fuzz project that may have specifically prevented this fuzzer from being able to detect the backdoor they planted in XZ Utils. 

Wiz Research data: what’s the risk to cloud environments?       

According to Wiz data, while XZ Utils itself is highly prevalent, only approximately 2% of cloud environments have instances with versions vulnerable to CVE-2024-3094. 

Which products are affected? 

According to Repology, there are many distributions that are potentially impacted by CVE-2024-3094, but Wiz will update this table with concrete information as vendors publicly address the vulnerability:

| Distro | Notes | Package | Affected versions | Fixed versions |
|---|---|---|---|---|
| RedHat | Red Hat Enterprise Linux (RHEL) is not affected, but Fedora 41 and Fedora Rawhide are affected. | xz | Fedora 41 and Fedora Rawhide | RedHat has advised users to immediately stop any instances of Fedora 41 or Fedora Rawhide. |
| Debian | No Debian stable versions are known to be affected, but non-stable branches are affected. | xz-utils | From 5.5.1alpha-0.1 up to and including 5.6.1-1 | 5.6.1+really5.4.5-1 |
| Kali Linux | Affects Kali installations updated between March 26th and March 29th. | xz-utils | 5.6.0-0.2 | Upgrade to the latest version |
| OpenSUSE | openSUSE maintainers rolled back the version of xz on Tumbleweed on March 28th and have released a new Tumbleweed snapshot (20240328 or later) that was built from a safe backup. | xz | 5.6.0, 5.6.1 | 5.6.1.revertto5.4 |
| Alpine | - | xz | 5.6.0, 5.6.0-r0, 5.6.0-r1, 5.6.1, 5.6.1-r0, 5.6.1-r1 | 5.6.0-r2, 5.6.1-r2 |
| Arch | The following release artifacts contain the compromised package: (1) Installation medium 2024.03.01, (2) Virtual machine images 20240301.218094 and 20240315.221711, (3) Container images created between and including 2024-02-24 and 2024-03-28 | xz | 5.6.0-1 | 5.6.1-2 |
| Gentoo | Gentoo recommends downgrading to an older version. | xz-utils | >= 5.6.0 | < 5.6.0 |
| FreeBSD | Not affected. | - | - | - |
| Amazon Linux | Not affected. | - | - | - |

Which actions should security teams take? 

  • Follow the guidance provided in the table above for each Linux distribution. 

  • CISA has advised downgrading to an uncompromised XZ Utils version (earlier than 5.6.0) and hunting for any malicious or suspicious activity on systems where affected versions have been installed.
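
As a rough illustration of the version guidance above, the following Python sketch classifies an upstream XZ Utils version string. The function names are hypothetical, and the input is assumed to be an upstream version such as the one printed by `xz --version`; distribution package revisions should be checked against the table above instead.

```python
import re

# Upstream releases that this article names as backdoored.
BACKDOORED_UPSTREAM = {(5, 6, 0), (5, 6, 1)}


def upstream_version(text):
    """Return the first X.Y.Z found in text as a tuple of ints, or None."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    return tuple(int(part) for part in match.groups()) if match else None


def is_backdoored_upstream(text):
    """True if the upstream version in text is 5.6.0 or 5.6.1."""
    return upstream_version(text) in BACKDOORED_UPSTREAM


def is_safe_downgrade_target(text):
    """True if the upstream version is earlier than 5.6.0 (the CISA guidance)."""
    version = upstream_version(text)
    return version is not None and version < (5, 6, 0)
```

Note that this only looks at the leading upstream version: a repackaged fix such as Debian's 5.6.1+really5.4.5-1 would still be flagged by `is_backdoored_upstream`, so treat a positive result as a prompt to consult the vendor advisory rather than as a verdict.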

Wiz customers can use the pre-built query and advisory in the Wiz Threat Center to search for vulnerable instances in their environment. For more details see the in-product advisory.

Diving into the technical intricacies 

This section is based on analysis conducted by Andres Freund, who discovered this backdoor after noticing unusual behavior associated with sshd. He observed that sshd was consuming a surprisingly large amount of CPU during the login process. Additionally, he encountered numerous errors while using the 'Valgrind' tool for profiling and memory debugging, prompting him to investigate further. 

As part of the XZ build process, the build-to-host.m4 script is executed. This script contains the following line: gl_[$1]_config='sed "r\n" $gl_am_configmake | eval $gl_path_map | $gl_[$1]_prefix -d 2>/dev/null', which injects an obfuscated script to be executed at the end of the configure script. The configure script is responsible for creating the Makefiles for xz-utils and liblzma.

When the obfuscated script runs, it checks, among other things, two conditions: whether the OS is x86-64 Linux and whether the build is part of a Debian or RPM package build. The script mainly aims to modify the Makefile of liblzma in order to interfere with its symbol resolution process at runtime, in particular causing the RSA_public_decrypt@....plt symbol to point to its own malicious backdoor code.

Further analysis revealed that during the public key authentication process of sshd, the RSA_public_decrypt@....plt function gets called, causing the attacker's code to be executed. This code attempts to extract a payload from the public key that is passed to it during the authentication process. The payload undergoes a series of verification steps and signature checks. If it passes these checks, it is handed to libc's system() function, which executes it. The backdoor therefore provides remote code execution (RCE) rather than an authentication bypass.

Overview of the backdoor functionality

Furthermore, the observed requirements for exploitation suggest that the attacker intended to complicate the analysis process for researchers. 

The obfuscated code running within the configure script during the build process installs the backdoor only under certain conditions. Among several checks, the following two are noteworthy: 

  • The targeted OS must be x86-64 Linux. If this condition is not met, the backdoor will not be installed. 

if ! (echo "$build" | grep -Eq "^x86_64" > /dev/null 2>&1) && (echo "$build" | grep -Eq "linux-gnu$" > /dev/null 2>&1); then

  • The XZ build process must be part of a Debian or RPM package build. This makes it more difficult to reproduce, as the backdoor won't be installed when attempting to manually build the XZ package. 

if test -f "$srcdir/debian/rules" || test "x$RPM_ARCH" = "xx86_64"; then
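
The two build-time conditions described above can be modeled in a few lines of Python. This is a minimal sketch of the stated intent (x86-64 Linux and a Debian or RPM package build), not a translation of the full obfuscated script; the function and parameter names are hypothetical, and `has_debian_rules` stands in for the `test -f "$srcdir/debian/rules"` check.

```python
import re


def is_targeted_platform(build_triplet):
    """True when the build triplet starts with x86_64 and ends with linux-gnu."""
    return bool(re.search(r"^x86_64", build_triplet)) and bool(
        re.search(r"linux-gnu$", build_triplet)
    )


def is_package_build(has_debian_rules, rpm_arch=""):
    """True for a Debian build (debian/rules exists) or an x86_64 RPM build."""
    return has_debian_rules or rpm_arch == "x86_64"


def build_conditions_met(build_triplet, has_debian_rules, rpm_arch=""):
    """Both noteworthy conditions must hold; the real script checks more."""
    return is_targeted_platform(build_triplet) and is_package_build(
        has_debian_rules, rpm_arch
    )
```

For example, `build_conditions_met("x86_64-pc-linux-gnu", False)` is false, which mirrors why a manual build outside a packaging environment does not reproduce the backdoor.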

In addition to these, several runtime requirements for the exploit have been observed: 

  • The TERM environment variable must not be set - TERM is set during SSH client-server communication only after the authentication process has begun, so its absence means that process hasn't started yet, which is precisely the stage that the exploit targets.

  • The path to the currently running binary, argv[0], needs to be /usr/sbin/sshd - this means the malicious code will only run when sshd uses the liblzma library. It won't be relevant when other binaries use the infected liblzma library.

  • The environment variables LD_DEBUG and LD_PROFILE must not be set  - to avoid exposing the process of symbol resolution interference and other linker/loader manipulations. 

  • The LANG environment variable must be set - sshd always sets LANG.

  • The exploit detects whether debugging tools such as rr and gdb are being used, and if so it doesn't run - a classic anti-debugging technique.
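
To make the environment-based requirements in the list above concrete, here is a small Python sketch that evaluates them for a given process. It is an illustrative model only: the function name is hypothetical, "set" is assumed to mean "present in the environment", and the debugger detection is deliberately left out because the article does not describe how it works.

```python
def runtime_conditions_met(argv0, env):
    """Model the observed runtime requirements, except debugger detection."""
    if "TERM" in env:
        return False  # TERM must not be set
    if argv0 != "/usr/sbin/sshd":
        return False  # only sshd at this exact path is targeted
    if "LD_DEBUG" in env or "LD_PROFILE" in env:
        return False  # loader debugging or profiling would expose the hooks
    if "LANG" not in env:
        return False  # LANG must be set
    return True
```

Calling `runtime_conditions_met("/usr/sbin/sshd", {"LANG": "C"})` returns True, while adding a TERM entry or running from any other path returns False.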

Latest Wiz research findings (as of April 3, 2024)

Wiz Research is currently working on reverse engineering the XZ Utils backdoor in order to support our related detection engineering efforts. In the course of our research, we have reached the following findings which we believe to be partially useful for this purpose, and have decided to share them publicly so that others can make use of them as well. We haven't seen these specific findings discussed elsewhere as of the time of publication.

Multiple hooks

We have identified logic to potentially hook more than one function:

  1. RSA_public_decrypt - The primary hook, which is already well known at this point.

  2. EVP_PKEY_set1_RSA - Another hook which will be used if the primary one doesn’t exist.

  3. RSA_get0_key - Another hook which will be used if the primary one doesn’t exist.

Here we can see checks comparing symbol names to multiple possible functions to hook. All strings in the code are obfuscated and referred to with a unique id as seen here:

We estimate that this functionality might have been added because RSA_public_decrypt is currently deprecated and might not be used in the future. We assume this might have been a way for the threat actor to future-proof the backdoor, as updating this area of the source code on a regular basis might have drawn suspicion; the origin of the blob containing the backdoor is located in a very specific test file which doesn’t warrant frequent updates, so updating it periodically might have raised eyebrows.

Hiding the public key from verbose SSH logs

As documented by Anthony Weems, the commands sent by the attacker to the backdoor are disguised as public key information as part of the SSH handshake. This means that when the sshd server logs that information when logging is set to verbose or higher, the user (or someone reviewing the logs in retrospect) would have potentially observed a series of apparent authentication failures, each with a different RSA-CERT hash. For example:

Failed publickey for root from 172.17.0.1 port 55794 ssh2: RSA-CERT ...

However, in order to mitigate this and avoid suspicion (and make any future investigations more difficult), the attacker implemented at least 3 methods to hide the public key from verbose logs:

Method 1: Hardcoded fake public key

After performing the command sent by the attacker (using system), the backdoor replaces the public RSA key object with a prebuilt constant key. Here we can see the logic of creating the mock RSA key and overriding the original one:

Interestingly, if this step fails for whatever reason, the backdoor will abruptly exit the process instead of resuming normal program flow. We estimate that this might indicate the importance of this defense evasion functionality to the attacker.

The fake public key will manifest in verbose logs in the following format. Note that the RSA hashes in the log are consistent (RSA-CERT G1Q+4a0TgAHnlq2B8BKLZUP6wDHsjX6F5nVtUTU3dBQ and CA RSA s3cBHP6c4j0tLaNyEE5f/1n7gjKuOJn8AWxp+nRPD7c), meaning that this method generates artifacts that can be used for detection purposes on SSH servers with verbose logging enabled, unless the attacker uses one of the other two methods described below.

Connection from 172.17.0.1 port 51300 on 172.17.0.2 port 22 rdomain ""
Failed publickey for root from 172.17.0.1 port 51300 ssh2: RSA-CERT SHA256:G1Q+4a0TgAHnlq2B8BKLZUP6wDHsjX6F5nVtUTU3dBQ ID  (serial 0) CA RSA SHA256:s3cBHP6c4j0tLaNyEE5f/1n7gjKuOJn8AWxp+nRPD7c
Connection from 172.17.0.1 port 51316 on 172.17.0.2 port 22 rdomain ""
Failed publickey for root from 172.17.0.1 port 51316 ssh2: RSA-CERT SHA256:G1Q+4a0TgAHnlq2B8BKLZUP6wDHsjX6F5nVtUTU3dBQ ID  (serial 0) CA RSA SHA256:s3cBHP6c4j0tLaNyEE5f/1n7gjKuOJn8AWxp+nRPD7c
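
Because the two hashes are constant, a simple scan of verbose sshd logs can surface this artifact. The Python sketch below searches log lines for either fingerprint quoted above; the function name is hypothetical, and how the log lines are obtained (file, journal, SIEM export) is left to the reader.

```python
# Constant fingerprints produced by the hardcoded fake key (from the logs above).
FAKE_KEY_FINGERPRINTS = (
    "SHA256:G1Q+4a0TgAHnlq2B8BKLZUP6wDHsjX6F5nVtUTU3dBQ",
    "SHA256:s3cBHP6c4j0tLaNyEE5f/1n7gjKuOJn8AWxp+nRPD7c",
)


def find_fake_key_lines(log_lines):
    """Return the log lines that contain either constant fingerprint."""
    return [
        line
        for line in log_lines
        if any(fingerprint in line for fingerprint in FAKE_KEY_FINGERPRINTS)
    ]
```

A hit is a strong lead worth investigating, but an empty result does not rule out activity, since the other two hiding methods leave no such fingerprint.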

Method 2: Log entry filtering

We have identified a flag in the command message that when enabled will cause the backdoor to hook the sshlogv function in order to filter out specific log lines, causing them to not be logged even if verbose logging is enabled. However, this creates an anomaly, in which a series of connections appear to have been initiated, but without a corresponding reject or accept message:

Connection from 172.17.0.1 port 49038 on 172.17.0.2 port 22 rdomain ""
Connection from 172.17.0.1 port 59032 on 172.17.0.2 port 22 rdomain ""

The above anomaly is somewhat identifiable and might be useful for hunting for activity related to this backdoor, but we estimate that this behavior could also occur legitimately if SSH connections are unexpectedly terminated by the client, so it is less useful for detection purposes.
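
A hunting query for this anomaly can be sketched as follows. The Python function below (a hypothetical helper, assuming log lines in the verbose sshd format shown above) reports connections whose "Connection from" line is never followed by another line mentioning the same source address and port.

```python
import re

CONNECTION_RE = re.compile(r"Connection from (\S+) port (\d+) on ")


def connections_without_outcome(log_lines):
    """Return (ip, port) pairs that have no follow-up log line at all."""
    lines = list(log_lines)
    orphans = []
    for index, line in enumerate(lines):
        match = CONNECTION_RE.search(line)
        if not match:
            continue
        ip, port = match.groups()
        # Match the exact address and port, not a longer one sharing a prefix.
        follow_up = re.compile(
            r"(?<![\d.])" + re.escape(ip) + r" port " + port + r"(?!\d)"
        )
        has_outcome = any(
            follow_up.search(later) and not CONNECTION_RE.search(later)
            for later in lines[index + 1:]
        )
        if not has_outcome:
            orphans.append((ip, int(port)))
    return orphans
```

As noted above, clients that drop the connection unexpectedly may produce the same pattern, so results from a check like this are better suited to hunting than to alerting.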

Method 3: Log entry replacement

The above function also contains logic implementing an alternative approach for hiding connections, but we have yet to identify the precise conditions in which it is used. Here we can see a fake log entry being built to replace a successful connection message:

In this alternative method, the backdoor replaces log entries describing successful connections with entries describing failed connection attempts, which would look like this:

Connection from 172.17.0.1 port 46722 on 172.17.0.2 port 22 rdomain ""
Connection closed by authenticating user root 172.17.0.1 port 46722 [preauth]

This method is the stealthiest of the three, since it does not generate any identifiable anomalies in verbose logs.
