Wikipedia’s definition of a digital signature is:
A digital signature is a mathematical scheme for verifying the authenticity of digital messages or documents. A valid digital signature on a message gives a recipient confidence that the message came from a sender known to the recipient.
—Wikipedia
They also have a handy diagram of the process by which digital signatures are created and verified:
Alice signs a message using her private key and Bob can then verify that the message came from Alice, and hasn’t been tampered with, using her public key. This all seems straightforward and uncomplicated and is probably most developers’ view of what signatures are for and how they should be used. This has led to the widespread use of signatures for all kinds of things: validating software updates, authenticating SSL connections, and so on.
But cryptographers have a different way of looking at digital signatures that has some surprising aspects. This more advanced way of thinking about digital signatures can tell us a lot about what are appropriate, and inappropriate, use-cases.
Identification protocols
There are several ways to build secure signature schemes. Although you might immediately think of RSA, the scheme perhaps most beloved by cryptographers is Schnorr signatures. These form the basis of modern EdDSA signatures, and also (in heavily altered form) DSA/ECDSA.
The story of Schnorr signatures starts not with a signature scheme, but instead with an interactive identification protocol. An identification protocol is a way to prove who you are (the “prover”) to some verification service (the “verifier”). Think logging into a website. But note that the protocol is only concerned with proving who you are, not with establishing a secure session or anything like that.
There are a whole load of different ways to do this, like sending a username and password or something like WebAuthn/passkeys (an ironic mention that we’ll come back to later). One particularly elegant protocol is known as Schnorr’s protocol. It’s elegant because it is simple and only relies on basic security conjectures that are widely accepted, and it also has some nice properties that we’ll mention shortly.
The basic structure of the protocol involves three phases: Commit-Challenge-Response. If you are familiar with challenge-response authentication protocols this just adds an additional commitment message at the start.
Alice (for it is she!) wants to prove to Bob who she is. Alice already has a long-term private key, a, and Bob already has the corresponding public key, A. These keys are in a Diffie-Hellman-like finite field or elliptic curve group, so we can say A = g^a mod p where g is a generator and p is the prime modulus of the group. The protocol then works like this:
- Alice generates a random ephemeral key, r, and the corresponding public key R = g^r mod p. She sends R to Bob as the commitment.
- Bob stores R and generates a random challenge, c, and sends that to Alice.
- Alice computes s = ac + r and sends that back to Bob as the response.
- Finally, Bob checks whether g^s = A^c * R (mod p). If it does, then Alice has successfully authenticated; otherwise it’s an imposter. The reason this works is that g^s = g^(ac + r) and A^c * R = (g^a)^c * g^r = g^(ac + r) too. Why it’s secure is another topic for another day. (There’s a toy code sketch of the whole exchange just below.)
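To make that concrete, here’s a toy sketch in Python. This is my own illustration, not code from any library, and the group parameters are comically small; real implementations use elliptic curve groups of roughly 256 bits.

```python
import secrets

# Toy parameters, far too small for real use: p is prime, q = 11 divides
# p - 1, and g = 4 generates the order-q subgroup of the integers mod p.
p, q, g = 23, 11, 4

# Alice's long-term keys: private a, public A = g^a mod p.
a = secrets.randbelow(q - 1) + 1
A = pow(g, a, p)

# Commit: Alice picks an ephemeral key r and sends R = g^r mod p to Bob.
r = secrets.randbelow(q - 1) + 1
R = pow(g, r, p)

# Challenge: Bob stores R and sends back a random challenge c.
c = secrets.randbelow(q)

# Response: Alice computes s = a*c + r (mod q) and sends it to Bob.
s = (a * c + r) % q

# Verify: Bob checks that g^s == A^c * R (mod p).
assert pow(g, s, p) == (pow(A, c, p) * R) % p
print("Alice successfully identified herself")
```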
Don’t worry if you don’t understand all this. I’ll probably do a blog post about Schnorr identification at some point, but there are plenty of explainers online if you want to understand it. For now, just accept that this is indeed a secure identification scheme. It has some nice properties too.
One is that it is a (honest-verifier) zero knowledge proof of knowledge (of the private key). That means that an observer watching Alice authenticate, and the verifier themselves, learn nothing at all about Alice’s private key from watching those runs, but the verifier is nonetheless convinced that Alice knows it.
This is because it is easy to create valid-looking runs of the protocol for any public key by simply working backwards rather than forwards: pick a response and a challenge first, then calculate the commitment that fits them. Anyone can do this without knowing anything about the private key. (What they cannot do is correctly answer a random challenge after they’ve already sent a commitment.) So an observer learns nothing from watching a genuine interaction that they couldn’t have computed for themselves.
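Here is what that backwards construction looks like, as a sketch reusing the toy group parameters from above:

```python
import secrets

p, q, g = 23, 11, 4   # same toy group as before
A = pow(g, 5, p)      # any public key will do; we never touch the private key

# Work backwards: pick the response and challenge first...
s = secrets.randbelow(q)
c = secrets.randbelow(q)

# ...then solve for the commitment: R = g^s * A^(-c) mod p.
R = (pow(g, s, p) * pow(pow(A, c, p), -1, p)) % p

# The simulated transcript (R, c, s) passes Bob's verification check.
assert pow(g, s, p) == (pow(A, c, p) * R) % p
```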
Fiat-Shamir
So what does this identification protocol have to do with digital signatures? The answer is that there is a process known as the Fiat-Shamir heuristic by which you can automatically transform certain interactive identification protocols into a non-interactive signature scheme. You can’t do this for every protocol, only ones that have a certain structure, but Schnorr identification meets the criteria. The resulting signature scheme is known, amazingly, as the Schnorr signature scheme.
You may be relieved to hear that the Fiat-Shamir transformation is incredibly simple. We basically just replace the challenge part of the protocol with a cryptographic hash function, computed over the message we want to sign and the commitment public key: c = H(R, m).
That’s it. The signature is then just the pair (R, s).
Note that Bob is no longer needed in the process at all: Alice can compute all of this herself. To validate the signature, Bob (or anyone else) recomputes c by hashing the message and R, and then performs the verification step just as in the identification protocol.
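As a sketch, here is the toy identification code from earlier turned into a signature scheme via Fiat-Shamir. The choice of SHA-256 for `H` is mine, standing in for whatever hash a real scheme specifies:

```python
import hashlib
import secrets

p, q, g = 23, 11, 4  # same toy group; real schemes use ~256-bit curve groups

def H(R: int, m: bytes) -> int:
    """The Fiat-Shamir challenge: hash the commitment and the message."""
    digest = hashlib.sha256(R.to_bytes(32, "big") + m).digest()
    return int.from_bytes(digest, "big") % q

def sign(a: int, m: bytes) -> tuple[int, int]:
    r = secrets.randbelow(q - 1) + 1  # ephemeral key: must never repeat!
    R = pow(g, r, p)
    c = H(R, m)                       # the hash replaces Bob's challenge
    s = (a * c + r) % q
    return (R, s)                     # the signature is just the pair (R, s)

def verify(A: int, m: bytes, R: int, s: int) -> bool:
    c = H(R, m)                       # anyone can recompute the challenge
    return pow(g, s, p) == (pow(A, c, p) * R) % p

a = secrets.randbelow(q - 1) + 1
A = pow(g, a, p)
R, s = sign(a, b"hello, world")
assert verify(A, b"hello, world", R, s)
```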
Schnorr signatures built this way are secure (so long as you add some critical security checks!) and efficient. The EdDSA signature scheme is essentially just a modern incarnation of Schnorr with a few tweaks.
What does this tell us about appropriate uses of signatures?
The way I’ve just presented Schnorr signatures and Fiat-Shamir is the way they are usually presented in cryptography textbooks. We start with an identification protocol, perform a simple transformation, and end up with a secure signature scheme. Happy days! These textbooks then usually move on to all the ways you can use signatures and never mention identification protocols again. But the transformation isn’t an entirely positive process: a lot was lost in translation!
There are many useful aspects of interactive identification protocols that are lost by signature schemes:
- A protocol run is only meaningful for the two parties involved in the interaction (Alice and Bob). By contrast, a signature is equally valid for everyone.
- A protocol run is specific to a given point in time. Alice’s response is to a specific challenge issued by Bob just prior. A signature can be verified at any time.
These points may sound like bonuses for signature schemes, but they are actually drawbacks in many cases. Signatures are often used for authentication, where we actually want things to be tied to a specific interaction. This lack of context is why standards like JWT have to add lots of explicit claims: audience and issuer checks to ensure the JWT came from the expected source and arrived at the intended destination, and expiry times or unique identifiers (which have to be remembered) to prevent replay attacks. A significant proportion of JWT vulnerabilities in the wild are caused by developers forgetting to perform these checks.
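For illustration, here is roughly the kind of checking a JWT verifier has to remember to do after the signature itself checks out. This is a hypothetical sketch: `check_claims` and its details are mine, not from any JWT library, and real `aud` claims can also be arrays.

```python
import time

# A replay cache of seen token IDs (this would need to be persistent and
# shared across servers in a real deployment).
seen_jtis: set[str] = set()

def check_claims(claims: dict, expected_issuer: str, expected_audience: str) -> None:
    """Raise if the (already signature-verified) claims lack context."""
    if claims.get("iss") != expected_issuer:
        raise ValueError("token not from the expected issuer")
    if claims.get("aud") != expected_audience:
        raise ValueError("token not intended for this service")
    if claims.get("exp", 0) <= time.time():
        raise ValueError("token has expired")
    jti = claims.get("jti")
    if jti is None or jti in seen_jtis:
        raise ValueError("missing or replayed token ID")
    seen_jtis.add(jti)
```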
WebAuthn is another example of this phenomenon. On paper it is a textbook case of an identification protocol. But because it is built on top of digital signatures it requires adding a whole load of “contextual bindings” for similar reasons to JWTs. Ironically, the most widely used WebAuthn signature algorithm, ECDSA, is itself a Schnorr-ish scheme.
TLS also uses signatures for what is essentially an identification protocol, and similarly has had a range of bugs due to insufficient context-binding information being included in the signed data. (SSL also uses signatures for verifying certificates, which is IMO a perfectly good use of the technology. Certificates are exactly a case where you want to convert an interactive protocol into a non-interactive one. But then again, we also do an interactive protocol (DNS) in that case anyway :shrug:).
In short, an awful lot of uses of digital signatures are actually identification schemes of one form or another and would be better off using an actual identification scheme. But that doesn’t mean using something like Schnorr’s protocol! There are actually better alternatives that I’ll come back to at the end.
Special Soundness: fragility by design
Before I look at alternatives, I want to point out that pretty much all in-use signature schemes are extremely fragile in practice. The security of Schnorr identification as a proof of knowledge rests on a property called special soundness. Special soundness essentially says that if Alice accidentally reuses the same commitment (R) for two runs of the protocol, then any observer can recover her private key.
This sounds like an incredibly fragile notion to build into your security protocol! If I accidentally reuse this random value then I leak my entire private key??! And in fact it is: such nonce-reuse bugs are extremely common in deployed signature systems, and have led to the compromise of lots of private keys (e.g., the Sony PlayStation 3, various Bitcoin wallets, and so on).
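Here is the attack with the toy Schnorr parameters from earlier: given two responses to different challenges under the same commitment, the private key falls out of a single modular subtraction.

```python
import secrets

p, q, g = 23, 11, 4                  # toy group from earlier

a = secrets.randbelow(q - 1) + 1     # victim's private key
r = secrets.randbelow(q - 1) + 1     # ephemeral key, accidentally reused

# Two runs with the same commitment R = g^r but different challenges.
c1, c2 = 3, 7
s1 = (a * c1 + r) % q
s2 = (a * c2 + r) % q

# Observer subtracts: s1 - s2 = a * (c1 - c2) (mod q), then solves for a.
recovered = ((s1 - s2) * pow(c1 - c2, -1, q)) % q
assert recovered == a
```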
But despite its fragility, this notion of special soundness is crucial to the security of many signature systems. They are truly a cursed technology!
To solve this problem, some implementations and newer standards like EdDSA use deterministic commitments, based on a hash of the private key and the message. The commitment then only ever repeats if the message is identical, in which case the two runs produce the exact same signature and nothing new leaks, so the private key cannot be recovered. Unfortunately, such schemes turned out to be more susceptible to fault-injection attacks (a much less scalable or general attack vector), and so now there are “hedged” schemes that inject a bit of randomness back into the hash. It’s cursed turtles all the way down.
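Sketching both flavours of commitment derivation (my own illustration of the shape of the idea; real schemes specify the exact hashing and encoding, which this does not):

```python
import hashlib
import secrets

q = 11  # toy subgroup order from the earlier sketches

def deterministic_nonce(private_key: bytes, message: bytes) -> int:
    # EdDSA-style: r is a hash of the private key and the message, so the
    # same message always yields the same (harmless) commitment.
    h = hashlib.sha512(private_key + message).digest()
    return int.from_bytes(h, "big") % q

def hedged_nonce(private_key: bytes, message: bytes) -> int:
    # "Hedged": mix fresh randomness back in, so two signing operations
    # never share a nonce even if a fault corrupts one of them.
    z = secrets.token_bytes(32)
    h = hashlib.sha512(z + private_key + message).digest()
    return int.from_bytes(h, "big") % q
```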
If your answer to this is to go back to good old RSA signatures, don’t be fooled. There are plenty of ways to blow your foot off using old faithful, but that’s for another post.
Did you want non-repudiation with that?
Another way that signatures cause issues is that they are too powerful for the job they are used for. You just wanted to authenticate that an email came from a legitimate server, but now you are providing irrefutable proof of the provenance of leaked private communications. Oops!
Signatures are very much the hammer of cryptographic primitives. As well as authenticating a message, they also provide third-party verifiability and (part of) non-repudiation.
You don’t need to explicitly want anonymity or deniability to understand that these strong security properties can have damaging and unforeseen side-effects. Non-repudiation should never be the default in open systems.
I could go on. From the fact that there are basically zero acceptable post-quantum signature schemes (all way too large or too risky), to issues with non-canonical signatures and cofactors and on and on. The problems of signature schemes never seem to end.
What to use instead?
Ok, so if signatures are so bad, what can I use instead?
Firstly, if you can get away with using a simple shared secret scheme like HMAC, then do so. In contrast to public key crypto, HMAC is possibly the most robust crypto primitive ever invented. You’d have to go really far out of your way to screw up HMAC. (I mean, there are timing attacks and that time that Bouncy Castle confused bits and bytes and used 16-bit HMAC keys, so still do pay attention a little bit…)
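For instance, with Python’s standard library (note the constant-time comparison):

```python
import hashlib
import hmac
import secrets

key = secrets.token_bytes(32)        # shared secret between the two parties
message = b"an important message"

# Sender computes the tag and sends (message, tag).
tag = hmac.new(key, message, hashlib.sha256).digest()

# Recipient recomputes the tag and compares in constant time, avoiding
# the classic timing attack on a naive == comparison.
expected = hmac.new(key, message, hashlib.sha256).digest()
assert hmac.compare_digest(tag, expected)
```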
If you need public key crypto, then… still use HMAC. Use an authenticated KEM with X25519 to generate a shared secret and use that with HMAC to authenticate your message. This is essentially public key authenticated encryption without the actual encryption. (Some people mistakenly refer to such schemes as designated verifier signatures, but they are not).
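Here is a rough in-memory sketch of that idea using the pyca/cryptography X25519 primitives, assuming both parties already know each other’s long-term public keys. This shows the shape of the construction, not a vetted protocol: there is no transcript binding, replay protection, or key-compromise analysis here, and the HKDF labels are made up for the example.

```python
import hashlib
import hmac

from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric.x25519 import X25519PrivateKey
from cryptography.hazmat.primitives.kdf.hkdf import HKDF

def derive_key(dh1: bytes, dh2: bytes) -> bytes:
    return HKDF(algorithm=hashes.SHA256(), length=32, salt=None,
                info=b"authkem-hmac-demo").derive(dh1 + dh2)

# Long-term key pairs.
sender = X25519PrivateKey.generate()
recipient = X25519PrivateKey.generate()

# Sender: fresh ephemeral key, then two Diffie-Hellman computations --
# ephemeral-static for freshness, static-static for sender authentication.
eph = X25519PrivateKey.generate()
key = derive_key(eph.exchange(recipient.public_key()),
                 sender.exchange(recipient.public_key()))
message = b"the message to authenticate"
tag = hmac.new(key, message, hashlib.sha256).digest()
# Sender transmits: eph.public_key(), message, tag.

# Recipient recomputes the same key and checks the tag.
key2 = derive_key(recipient.exchange(eph.public_key()),
                  recipient.exchange(sender.public_key()))
assert hmac.compare_digest(tag, hmac.new(key2, message, hashlib.sha256).digest())
```

Note that only someone holding the sender’s or recipient’s private key could have computed that tag, which is exactly the “authentication without third-party verifiability” property the signature-based designs above lack.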
Signatures are good for software/firmware updates and pretty terrible for everything else.