TPM on Embedded Systems: Pitfalls and Caveats to Watch Out For

Original link: https://sigma-star.at/blog/2026/01/tpm-on-embedded-systems-pitfalls-and-caveats/

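## TPM on Embedded Systems: Summary

Trusted Platform Modules (TPMs) are increasingly common in embedded Linux devices, well beyond the traditional PC. These chips provide secure key storage, measured boot (verifying system integrity through PCRs), and remote attestation, that is, cryptographic proof of system state. Dedicated chips are the typical choice, but firmware-based TPMs (fTPMs), for example built on Arm TrustZone, are also viable. Deploying a TPM in an embedded system brings its own challenges, however. Unlike PCs, these devices often run unattended in physically hostile environments, which exposes them to bus snooping, reset attacks, and direct physical tampering. The threat model differs as well: embedded systems prioritize protecting the *firmware*, whereas PCs focus on user data. Common pitfalls include failing to enable session encryption (which leaves data exposed on the bus) and overlooking TPM reset attacks. Firmware vulnerabilities (such as ROCA) and fTPM weaknesses also pose risks. Longer device lifetimes call for robust firmware update mechanisms. A successful TPM deployment needs a clear threat model, active mitigation of physical attacks, and a holistic approach to security, including secure boot, kernel hardening, and protection of backend infrastructure, because a TPM on its own is not a complete solution.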


Trusted Platform Module (TPM) chips have been around since the release of the TPM 1.2 specification more than 20 years ago, and the TPM 2.0 specification was released in 2014. The technology is now seeing widespread adoption in various computing sectors. TPMs have been a standard feature in PCs, particularly notebooks, for some time. With integration into tools like systemd’s tooling for LUKS/dm-crypt and legal requirements like the EU’s CRA, TPM functionality is also now making its way into the embedded Linux sector. In this post, we’ll highlight common pitfalls and considerations for using TPM chips on embedded devices.

Brief Overview: Trusted Platform Module (TPM)

Usually, a TPM is a dedicated chip that is connected to the CPU via a bus like LPC, SPI, or I2C. In modern PCs, a firmware-based TPM (fTPM) is common, where the firmware (e.g., UEFI) emulates a TPM in a secure environment. The same is also possible on embedded devices, for example, by using Arm TrustZone to emulate a TPM.

There are three common use cases for TPM:

Secret storage: A core feature of a TPM is its ability to protect data using secure, internal keys. These keys are protected in a way that makes them usable only by and within the TPM itself. A TPM 2.0 chip supports multiple key hierarchies that reflect different trust scenarios, with the Platform Hierarchy and Storage Hierarchy being the most commonly used ones. A TPM client can create a new key in a hierarchy where the parent key’s handle is used to protect (encrypt and authenticate) the child key’s data (then known as a blob) when it is placed in external storage. This process is known as binding. When the secret is needed again, the protected blob is passed back to the TPM, which verifies and decrypts it before returning it to the client.

The TPM 2.0 specification defines a multitude of cryptographic algorithms such as hash functions, symmetric/asymmetric ciphers and message authentication codes (MACs). This enables its use for network protocols like TLS and storage security like LUKS/dm-crypt on Linux or BitLocker on Windows.

Measured boot: Each TPM chip contains a set of 24 Platform Configuration Registers (PCRs). Their values are hashes produced from log events that reflect the boot and configuration state of the platform. This is primarily used to secure the boot chain of a system, where each stage of the boot process measures relevant components (bootloader, drivers, firmware, configuration data, etc.) and uses these measurements to extend the cryptographic hash of the respective PCRs. Concurrently, the details of these measurements are recorded in an event log kept alongside the TPM. The full set of PCR values provides a record of the boot state. Should a component be manipulated, the final PCR value will change, which indicates a potential security issue.
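The extend operation described above can be sketched in a few lines of Python. This is a minimal model that assumes a SHA-256 PCR bank starting at all zeroes; the component names are made up for illustration, and no real TPM is involved.

```python
import hashlib

PCR_SIZE = hashlib.sha256().digest_size  # 32 bytes for a SHA-256 bank


def extend(pcr: bytes, measurement: bytes) -> bytes:
    """Return the new PCR value: SHA-256(old PCR || digest of the measured data)."""
    digest = hashlib.sha256(measurement).digest()
    return hashlib.sha256(pcr + digest).digest()


def measure_chain(components) -> bytes:
    """Extend a zero-initialised PCR with each boot component, in order."""
    pcr = bytes(PCR_SIZE)
    for blob in components:
        pcr = extend(pcr, blob)
    return pcr


# Hypothetical boot components, used only to show the effect of a change.
good = measure_chain([b"bootloader v1", b"kernel v1", b"initrd v1"])
tampered = measure_chain([b"bootloader v1", b"kernel v1 (patched)", b"initrd v1"])
reordered = measure_chain([b"kernel v1", b"bootloader v1", b"initrd v1"])

print(good.hex())
print("tampered differs:", good != tampered)
print("reordered differs:", good != reordered)
```

Because each new value depends on the previous one, a PCR cannot be set directly. It can only be reached by repeating exactly the same sequence of measurements.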

This can be combined with TPM keys, where each key can be tied to a set of PCR values (known as sealing). If the PCRs contain the expected values, the TPM allows the key to be used. Otherwise, its use is prevented. This enables tight control over when a key is usable and provides an integrity measure.

(Remote) Attestation: TPMs provide the ability to produce cryptographic proof of the boot and configuration state of a system. This is done through a dedicated attestation protocol. For the protocol to work, a trust chain must be established by verifying the TPM’s identity and its Attestation Key (AK). The attestation procedure utilizes a TPM key that is bound to a set of PCR values. This enforces that the attestation can only be run when the system is in a known, secure state. The result of this process is a signed Quote from the TPM, which contains the current PCR values and a signature that proves their origin.
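The verifier side of such a protocol can be sketched roughly as follows. This is a simplified model, not the wire format: the quote is a plain dictionary with hypothetical field names, and verification of the AK signature and certificate chain is deliberately left out. It only shows the three checks a verifier typically makes on the quoted data: freshness, consistency with the event log, and known-good measurements.

```python
import hashlib
import hmac
import os


def replay(event_digests):
    """Recompute a PCR value by replaying the logged SHA-256 event digests."""
    pcr = bytes(32)
    for digest in event_digests:
        pcr = hashlib.sha256(pcr + digest).digest()
    return pcr


def pcr_digest(pcr_values):
    """Hash over the concatenated selected PCR values, as carried in a quote."""
    return hashlib.sha256(b"".join(pcr_values)).digest()


def verify_quote(quote, expected_nonce, event_log, allowed):
    """Check a (signature-verified) quote against an event log and an allow-list."""
    # 1. Freshness: the quote must contain the nonce this verifier issued.
    if not hmac.compare_digest(quote["nonce"], expected_nonce):
        return False
    # 2. Consistency: replaying the log must reproduce the quoted PCR digest.
    replayed = [replay(event_log[index]) for index in sorted(event_log)]
    if not hmac.compare_digest(pcr_digest(replayed), quote["pcr_digest"]):
        return False
    # 3. Policy: every logged measurement must be on the known-good list.
    return all(d in allowed for events in event_log.values() for d in events)


def measure(blob):
    return hashlib.sha256(blob).digest()


# Hypothetical event log: PCR index -> list of event digests.
event_log = {0: [measure(b"bootloader v1")],
             4: [measure(b"kernel v1"), measure(b"initrd v1")]}
allowed = {d for events in event_log.values() for d in events}
nonce = os.urandom(16)
quote = {"nonce": nonce,
         "pcr_digest": pcr_digest([replay(event_log[i]) for i in sorted(event_log)])}

print(verify_quote(quote, nonce, event_log, allowed))           # fresh quote
print(verify_quote(quote, os.urandom(16), event_log, allowed))  # stale nonce
```

The nonce check is what prevents an attacker from replaying an old quote recorded while the system was still in a good state.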

Using TPM on Embedded Devices

Embedded devices are primarily based on Arm SoCs, in contrast to the x86/amd64-based CPUs common in PCs. This distinction is particularly relevant for the secure boot process, as much of this is handled by a PC’s UEFI, which is not standard on most embedded systems. Instead, these systems rely on a vendor-specific boot-ROM followed by bootloaders like Barebox or U-Boot. While U-Boot can support running in UEFI mode, this is still a rarity in the embedded space. It is, however, possible to achieve a similar security posture without UEFI, as demonstrated by Manuel Traut’s All Systems Go! talk from 2024. A physical TPM chip is not strictly necessary either, as an fTPM can be implemented using technologies like Arm TrustZone with OP-TEE.

From a threat model perspective, a critical difference between a PC and an embedded device is that an embedded system often operates in a physically hostile environment without constant human supervision. While a laptop can usually be protected by its owner quite easily (e.g., by locking it away), an embedded device, such as an IoT sensor or a point-of-sale terminal, is a prime target for physical attacks including tampering, side-channel attacks, fault injection, and bus snooping. This is particularly relevant when a TPM is used for disk encryption, as research has shown these attacks can be effective against solutions like BitLocker and others.

The key difference in threat models is that the device manufacturer often needs to protect their intellectual property (firmware, algorithms, and data) from the end-user or third parties, whereas on a PC, the end-user is the one protecting their assets.

An advantage of embedded systems, however, is that their software stack is tailored to a specific use case, unlike a general-purpose PC. This allows the manufacturer to tightly control the software environment and implement robust security measures.

Other noteworthy differences include the significantly longer lifetimes of embedded systems, which can exceed 10 years, compared to the 3-5 year lifespan of a typical PC. This extended lifecycle necessitates longer support cycles for hardware components, including TPM chips, whose firmware may contain vulnerabilities that require updates. Furthermore, embedded systems are increasingly subject to legal requirements like CRA (Cyber Resilience Act) and NIS2 (Network and Information Systems Directive 2), which aim to raise their security baseline and hold manufacturers accountable.

TPM Pitfalls on Embedded Devices

With these differences in mind, let’s have a look at some of the common pitfalls when using a TPM in general and specifically on an embedded system:

Bus Snooping Attacks

By attaching a probe to the physical communication lines between a TPM and the CPU/SoC, a passive attacker can read the messages exchanged between them. This is a common and straightforward physical attack, particularly in unattended embedded devices.

To mitigate this, TPM 2.0 supports session-based command and response parameter encryption. This requires a secure session to be established between the client and the TPM. The TPM specification defines multiple session types, including HMAC sessions for command authentication and integrity, and policy sessions for more complex authorization based on conditions like PCR values.

Crucially, the TPM’s session-based encryption and authentication must be explicitly enabled by the client software, as not all current implementations do this by default. This is a significant security risk, as a lack of session encryption can expose sensitive data like unsealed keys or authorization values in plaintext on the bus, making it vulnerable to passive bus snooping.
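To make the risk concrete, here is a rough sketch of what a passive probe can do with a captured unseal response when parameter encryption is off. It assumes the usual TPM 2.0 response layout for a command with sessions (10-byte header, 4-byte parameter size, then a size-prefixed data buffer); the frame is synthetic, built by a hypothetical helper rather than captured from hardware.

```python
import struct

TPM_ST_SESSIONS = 0x8002  # response tag when an authorization area is present
TPM_RC_SUCCESS = 0x00000000
HEADER = ">HII"           # tag, total size, response code (big-endian, 10 bytes)


def build_unseal_response(secret: bytes) -> bytes:
    """Build a synthetic, unencrypted unseal response (illustration only)."""
    params = struct.pack(">H", len(secret)) + secret  # size-prefixed secret
    auth_area = b"\x00\x00" + b"\x01" + b"\x00\x00"   # empty nonce, attributes, empty HMAC
    body = struct.pack(">I", len(params)) + params + auth_area
    header = struct.pack(HEADER, TPM_ST_SESSIONS, 10 + len(body), TPM_RC_SUCCESS)
    return header + body


def sniff_unseal_secret(frame: bytes) -> bytes:
    """Recover the unsealed data from a captured plaintext response frame."""
    tag, size, rc = struct.unpack_from(HEADER, frame, 0)
    if tag != TPM_ST_SESSIONS or rc != TPM_RC_SUCCESS or size != len(frame):
        raise ValueError("not a successful response with sessions")
    (data_size,) = struct.unpack_from(">H", frame, 14)  # follows the 4-byte parameter size
    return frame[16:16 + data_size]


captured = build_unseal_response(b"disk-encryption-key")
print(sniff_unseal_secret(captured))
```

With response parameter encryption enabled on the session, the bytes at that offset would be ciphertext instead, so the same parser would recover nothing usable.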

TPM Reset Attacks by Active Interposer

A more advanced threat is an active attacker on the bus between the TPM and the CPU. They can launch a Man-in-the-Middle (MitM) attack, impersonating a legitimate TPM. Even with session encryption, this attack can be effective if the client does not first establish a trusted relationship with the TPM. A Time-of-Check to Time-of-Use (TOCTOU) attack is also possible, where the attacker only launches the MitM once the system’s integrity check has passed.

A major challenge is attacks that work across TPM resets. An attacker can physically reset the TPM to force it back into an uncompromised state and then reconstruct the PCR values to reflect an untampered boot. This allows them to access keys sealed to those specific PCR values.

While there is no way to fully prevent a determined physical attacker, the Linux Kernel has gained support to detect such interposer attacks thanks to James Bottomley’s work in kernel 6.10. The mechanism chosen for this relies on the TPM’s NULL seed, which changes its value on every TPM reset. This allows the kernel to derive a primary key that is volatile to the current boot session. The key’s unique name can be passed securely from the bootloader to the kernel. If a MitM attacker resets the TPM during this handover, the name will no longer match.

The attestation process is rather complex to perform in kernel code, so the kernel exposes the name of the derived key via sysfs, enabling userspace to perform the full attestation logic including verification of the EK certificate. Successful attestation of this NULL key name will prove that there was no interposer present during the whole boot process. For more details, see James’ excellent talk at FOSDEM 2025.
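The last step of that userspace check, comparing names, might look like the sketch below. It relies on the fact that a TPM 2.0 object name is the name algorithm identifier followed by a hash of the object's public area. The sysfs path and its hex encoding are assumptions that should be checked against the kernel documentation for your version, and `certified_name` stands in for a value obtained from a certification that has already been verified against the attestation key.

```python
import hashlib
import hmac
from pathlib import Path

TPM_ALG_SHA256 = 0x000B
NULL_NAME_PATH = "/sys/class/tpm/tpm0/null_name"  # assumed location, verify locally


def tpm_name(public_area: bytes) -> bytes:
    """TPM 2.0 object name: nameAlg (2 bytes, big-endian) || SHA-256(public area)."""
    return TPM_ALG_SHA256.to_bytes(2, "big") + hashlib.sha256(public_area).digest()


def read_null_name(path: str = NULL_NAME_PATH):
    """Return the kernel's NULL key name as bytes, or None if it is unavailable."""
    try:
        return bytes.fromhex(Path(path).read_text().strip())  # assumes hex text
    except (OSError, ValueError):
        return None


def interposer_check(kernel_name, certified_name: bytes) -> bool:
    """True only if the kernel's NULL key name equals the certified name."""
    if kernel_name is None:
        return False
    return hmac.compare_digest(kernel_name, certified_name)


kernel_name = read_null_name()
if kernel_name is None:
    print("NULL key name not available on this system")
else:
    print("kernel NULL key name:", kernel_name.hex())
```

A mismatch here would suggest that the TPM the kernel talked to during boot is not the one that produced the certification, or that it was reset in between.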

Unfortunately, while the foundations in the Linux Kernel have been put in place and documented, other parts are still lagging behind here. Especially systemd’s dm-crypt integration does not support this as of now.

A more robust alternative that helps mitigate these types of attacks is using an fTPM, as it runs on the same SoC, eliminating the physical bus. However, fTPMs have their own set of caveats, which we will discuss later.

Direct Physical Attacks on TPM Chip

With full physical access to the device, a malicious actor can perform a range of attacks, from fault injection and side-channel analysis to more direct tampering. While TPM chips are designed with security in mind, the level of protection varies significantly by vendor and model. The ISO/IEC 19790 security level provides an indicator of a chip’s physical security capabilities. Unfortunately, most commercial TPMs only achieve Level 1 or Level 2. Level 3 and 4 certifications, which require tamper-resistant or tamper-evident features, are extremely rare for TPMs and are typically reserved for high-security applications using dedicated hardware security modules (HSMs).

A highly motivated attacker can aim to circumvent these protections by desoldering the TPM and operating it in another device. When its secrets are cryptographically sealed to PCRs, the attacker must also reproduce the relevant PCR measurements in order to unseal a key outside the original device. While this is extremely difficult and requires a lot of effort, it demonstrates that a TPM is not a silver bullet.

Integrity Protection and Code Execution Flaws

While the host has means to verify the TPM chip (Endorsement Key), this does not work the other way around. A TPM has no way to do this as it is a passive device. As long as the caller provides the correct authorization value to perform a command, it will do so. This is a huge problem if malicious code is executed on a running system. If it gains enough privileges it can interact with the TPM to have it decrypt secrets.

If a TPM key is sealed to specific PCR values, a malicious process can gain access by resetting the TPM and replaying a legitimate boot process.

This highlights the need to secure the entire boot chain using measured boot and verified boot. However, these do not cover any runtime issues after the system successfully booted. For this, additional Kernel features like IMA/EVM are needed.

On embedded Arm-based systems, measured boot often begins after the SoC’s boot-ROM and the initial bootloader, which is why SoC-specific features like NXP’s HAB are essential to secure the lowest levels of the boot chain.

For secret storage, a robust hardening measure is to lock the TPM key to PCRs that change once the key is no longer needed. Once the disk is mounted in the initrd, an additional measurement can be made to change a PCR value, making the sealed key unusable until the next boot.
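This hardening step can be illustrated with a toy model. The `ToyTPM` class below is purely hypothetical: it keeps a single PCR and releases a secret only while that PCR holds the value chosen at sealing time. It is not a real TPM interface, and the stage names are invented for the example.

```python
import hashlib


class ToyTPM:
    """Toy model of one PCR plus PCR-bound sealing; not a real TPM interface."""

    def __init__(self):
        self.pcr = bytes(32)
        self._objects = []

    def extend(self, data: bytes) -> None:
        digest = hashlib.sha256(data).digest()
        self.pcr = hashlib.sha256(self.pcr + digest).digest()

    def seal(self, secret: bytes, required_pcr: bytes) -> int:
        """Store a secret that is released only while the PCR equals required_pcr."""
        self._objects.append((required_pcr, secret))
        return len(self._objects) - 1

    def unseal(self, handle: int) -> bytes:
        required_pcr, secret = self._objects[handle]
        if self.pcr != required_pcr:
            raise PermissionError("PCR policy not satisfied")
        return secret


def boot(tpm: ToyTPM) -> None:
    """Measure a hypothetical three-stage boot chain."""
    for stage in (b"bootloader", b"kernel", b"initrd"):
        tpm.extend(stage)


# Provisioning: compute the PCR value a good boot produces, then seal to it.
reference = ToyTPM()
boot(reference)
tpm = ToyTPM()
handle = tpm.seal(b"disk-key", reference.pcr)

# Normal boot: the initrd unseals the key, then caps the PCR with one more extend.
boot(tpm)
key = tpm.unseal(handle)
tpm.extend(b"disk-unlocked")

try:
    tpm.unseal(handle)
    relocked = False
except PermissionError:
    relocked = True

print(key, "relocked after cap:", relocked)
```

After the extra extend, code running later in the same boot can no longer ask the TPM for the key, even with full access to the TPM device node. The key that already sits in host memory is a separate concern.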

In general, a TPM cannot fully mitigate any code execution flaw on your platform. Once an attacker controls the OS kernel, they can often extract secrets already in memory or copy data from mounted, unencrypted disks. Therefore, additional in-depth hardening (e.g., AppArmor, SELinux), a clear concept of least privilege per process and hardening of backend infrastructure are crucial.

Hardware Flaws

TPM chips from various vendors have had hardware flaws in the past. One prime example is the ROCA vulnerability (CVE-2017-15361) affecting millions of Infineon TPM chips which produced weak RSA keys. Vulnerabilities like these become even more problematic with longer product lifetimes. If the security of a product is solely based on the TPM chip, a flaw in it can be catastrophic.

Therefore, additional hardening measures are needed to reduce the impact of malicious code execution (e.g. reduced privileges, SELinux) or key extraction (e.g. unique keys per device). Furthermore, any backend infrastructure should be hardened such that a single compromised device cannot impact the operation of other devices.

Software-based TPM Issues

On Intel and AMD hardware, fTPMs have become better known, at least since Windows 11 began requiring a TPM to operate. These solutions implement a TPM in software, running within a Trusted Execution Environment (TEE) on the CPU. This has the advantage of removing any physical bus that can be snooped or tampered with. On the other hand, fTPMs are still vulnerable to various kinds of side-channel attacks, such as timing leaks, as the vulnerability dubbed TPM-FAIL has shown.

These trusted execution environments are vulnerable as well. Any flaw in the TEE can result in a full compromise of the TPM keys, breaking all security guarantees. This includes insecure key storage. While a TEE is designed to keep the execution state safe, an fTPM also requires a secure way to persist secrets (e.g., TPM key hierarchy seeds) across reboots. A prime example is the faulTPM flaw on some AMD CPUs.

On Arm-based embedded devices, the main way to implement an fTPM is by utilizing Arm TrustZone technology and something like OP-TEE’s fTPM. Any flaw in the TrustZone implementation can be a problem, including memory protection issues that leak keys the fTPM holds in memory.

When using OP-TEE, it is also essential to properly configure the storage layer and ensure that it has a secure way to derive a storage encryption key that cannot be accessed outside of the TEE. SoCs with upstream OP-TEE support often already have this, but vendors sometimes ship forks that do not. When using storage backed by eMMC RPMB (Replay Protected Memory Block), the key must be properly set up and configured. Hard-coding the RPMB key in the OP-TEE binary is a particularly bad idea.

Cold Boot Attacks

While not a direct TPM problem, cold boot attacks are a critical consideration when designing a system that uses a TPM for secret storage. Although a physical TPM provides protection for its internal memory, the problem lies with the main system memory once the TPM has unsealed and returned a secret to the host. A prime example is disk encryption, where the encryption key resides in the volatile main memory for the entire duration of the system’s uptime. It is essential to ensure proper mitigations against cold boot attacks to avoid leaking secrets this way.

Performance

Not specific to embedded systems, but often overlooked: a TPM is slow. Therefore, offloading cryptography-heavy protocols like TLS to the TPM is unfeasible. It can, however, still be used to keep long-term keys secure and perform less timing-critical operations during a protocol handshake.

One more thing to keep in mind is that with TPM 2.0, root keys are not stored directly but derived from seeds. This key derivation can be computationally expensive. Consequently, using an RSA key can cause more significant delays than an ECC key due to the mathematical operations involved.

TPM Firmware Updates

It is common knowledge now that delivering security updates is crucial to keeping devices secure over their whole lifetime. It is important to remember that TPM chips also contain firmware that needs to be updated. On a regular PC, this is handled through UEFI or Windows. However, on an embedded device, it is the responsibility of the producer to keep the TPM firmware up to date. This is critical, especially when a vulnerability like TPM-FAIL (CVE-2019-16863) occurs, as it requires the ability to update your TPM to patch the flaw.

Summary

A TPM chip can help to implement state-of-the-art security mechanisms, but it’s not a silver bullet that will make things inherently secure. To utilize a TPM correctly, it is crucial to have a well-defined threat model. This model must clearly specify the role of the TPM and the kind of attacks it is intended to protect against. Especially when used on an embedded system, the lack of human interaction must be considered when designing protection against certain threats.