Wikipedia’s definition of a digital signature is:
A digital signature is a mathematical scheme for verifying the authenticity of digital messages or documents. A valid digital signature on a message gives a recipient confidence that the message came from a sender known to the recipient.
—Wikipedia
They also have a handy diagram of the process by which digital signatures are created and verified:
Alice signs a message using her private key and Bob can then verify that the message came from Alice, and hasn’t been tampered with, using her public key. This all seems straightforward and uncomplicated and is probably most developers’ view of what signatures are for and how they should be used. This has led to the widespread use of signatures for all kinds of things: validating software updates, authenticating SSL connections, and so on.
But cryptographers have a different way of looking at digital signatures that has some surprising aspects. This more advanced way of thinking about digital signatures can tell us a lot about what are appropriate, and inappropriate, use-cases.
Identification protocols
There are several ways to build secure signature schemes. Although you might immediately think of RSA, the scheme perhaps most beloved by cryptographers is Schnorr signatures. These form the basis of modern EdDSA signatures, and also (in heavily altered form) DSA/ECDSA.
The story of Schnorr signatures starts not with a signature scheme, but instead with an interactive identification protocol. An identification protocol is a way to prove who you are (the “prover”) to some verification service (the “verifier”). Think logging into a website. But note that the protocol is only concerned with proving who you are, not with establishing a secure session or anything like that.
There are a whole load of different ways to do this, like sending a username and password or something like WebAuthn/passkeys (an ironic mention that we’ll come back to later). One particularly elegant protocol is known as Schnorr’s protocol. It’s elegant because it is simple and only relies on basic security conjectures that are widely accepted, and it also has some nice properties that we’ll mention shortly.
The basic structure of the protocol involves three phases: Commit-Challenge-Response. If you are familiar with challenge-response authentication protocols this just adds an additional commitment message at the start.
Alice (for it is she!) wants to prove to Bob who she is. Alice already has a long-term private key, a, and Bob already has the corresponding public key, A. These keys are in a Diffie-Hellman-like finite field or elliptic curve group, so we can say A = g^a mod p where g is a generator and p is the prime modulus of the group. The protocol then works like this:
- Alice generates a random ephemeral key, r, and the corresponding public key R = g^r mod p. She sends R to Bob as the commitment.
- Bob stores R and generates a random challenge, c, and sends that to Alice.
- Alice computes s = ac + r and sends that back to Bob as the response.
- Finally, Bob checks whether g^s = A^c * R (mod p). If it does, then Alice has successfully authenticated; otherwise it’s an imposter. The reason this works is that g^s = g^(ac + r) and A^c * R = (g^a)^c * g^r = g^(ac + r) too. Why it’s secure is another topic for another day. (There’s a toy code sketch of the whole exchange just below.)
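To make that concrete, here’s a toy sketch in Python. This is my own illustration, not code from any library, and the group parameters are comically small; real implementations use elliptic curve groups of roughly 256 bits.

```python
import secrets

# Toy parameters, far too small for real use: p is prime, q = 11 divides
# p - 1, and g = 4 generates the order-q subgroup of the integers mod p.
p, q, g = 23, 11, 4

# Alice's long-term keys: private a, public A = g^a mod p.
a = secrets.randbelow(q - 1) + 1
A = pow(g, a, p)

# Commit: Alice picks an ephemeral key r and sends R = g^r mod p to Bob.
r = secrets.randbelow(q - 1) + 1
R = pow(g, r, p)

# Challenge: Bob stores R and sends back a random challenge c.
c = secrets.randbelow(q)

# Response: Alice computes s = a*c + r (mod q) and sends it to Bob.
s = (a * c + r) % q

# Verify: Bob checks that g^s == A^c * R (mod p).
assert pow(g, s, p) == (pow(A, c, p) * R) % p
print("Alice successfully identified herself")
```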
Don’t worry if you don’t understand all this. I’ll probably do a blog post about Schnorr identification at some point, but there are plenty of explainers online if you want to understand it. For now, just accept that this is indeed a secure identification scheme. It has some nice properties too.
One is that it is a (honest-verifier) zero knowledge proof of knowledge (of the private key). That means that an observer watching Alice authenticate, and the verifier themselves, learn nothing at all about Alice’s private key from watching those runs, but the verifier is nonetheless convinced that Alice knows it.
This is because it is easy to create valid-looking runs of the protocol for any public key by simply working backwards rather than forwards: pick a response and a challenge first, then calculate the commitment that fits them. Anyone can do this without knowing anything about the private key. (What they cannot do is correctly answer a random challenge after they’ve already sent a commitment.) So an observer learns nothing from watching a genuine interaction that they couldn’t have computed for themselves.
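Here is what that backwards construction looks like, as a sketch reusing the toy group parameters from above:

```python
import secrets

p, q, g = 23, 11, 4   # same toy group as before
A = pow(g, 5, p)      # any public key will do; we never touch the private key

# Work backwards: pick the response and challenge first...
s = secrets.randbelow(q)
c = secrets.randbelow(q)

# ...then solve for the commitment: R = g^s * A^(-c) mod p.
R = (pow(g, s, p) * pow(pow(A, c, p), -1, p)) % p

# The simulated transcript (R, c, s) passes Bob's verification check.
assert pow(g, s, p) == (pow(A, c, p) * R) % p
```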
Fiat-Shamir
So what does this identification protocol have to do with digital signatures? The answer is that there is a process known as the Fiat-Shamir heuristic by which you can automatically transform certain interactive identification protocols into a non-interactive signature scheme. You can’t do this for every protocol, only ones that have a certain structure, but Schnorr identification meets the criteria. The resulting signature scheme is known, amazingly, as the Schnorr signature scheme.
You may be relieved to hear that the Fiat-Shamir transformation is incredibly simple. We basically just replace the challenge part of the protocol with a cryptographic hash function, computed over the message we want to sign and the commitment public key: c = H(R, m).
That’s it. The signature is then just the pair (R, s).
Note that Bob is no longer needed in the process at all: Alice can compute all of this herself. To validate the signature, Bob (or anyone else) recomputes c by hashing the message and R, and then performs the verification step just as in the identification protocol.
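As a sketch, here is the toy identification code from earlier turned into a signature scheme via Fiat-Shamir. The choice of SHA-256 for `H` is mine, standing in for whatever hash a real scheme specifies:

```python
import hashlib
import secrets

p, q, g = 23, 11, 4  # same toy group; real schemes use ~256-bit curve groups

def H(R: int, m: bytes) -> int:
    """The Fiat-Shamir challenge: hash the commitment and the message."""
    digest = hashlib.sha256(R.to_bytes(32, "big") + m).digest()
    return int.from_bytes(digest, "big") % q

def sign(a: int, m: bytes) -> tuple[int, int]:
    r = secrets.randbelow(q - 1) + 1  # ephemeral key: must never repeat!
    R = pow(g, r, p)
    c = H(R, m)                       # the hash replaces Bob's challenge
    s = (a * c + r) % q
    return (R, s)                     # the signature is just the pair (R, s)

def verify(A: int, m: bytes, R: int, s: int) -> bool:
    c = H(R, m)                       # anyone can recompute the challenge
    return pow(g, s, p) == (pow(A, c, p) * R) % p

a = secrets.randbelow(q - 1) + 1
A = pow(g, a, p)
R, s = sign(a, b"hello, world")
assert verify(A, b"hello, world", R, s)
```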
Schnorr signatures built this way are secure (so long as you add some critical security checks!) and efficient. The EdDSA signature scheme is essentially just a modern incarnation of Schnorr with a few tweaks.
What does this tell us about appropriate uses of signatures?
The way I’ve just presented Schnorr signatures and Fiat-Shamir is the way they are usually presented in cryptography textbooks. We start with an identification protocol, perform a simple transformation, and end up with a secure signature scheme. Happy days! These textbooks then usually move on to all the ways you can use signatures and never mention identification protocols again. But the transformation isn’t an entirely positive process: a lot was lost in translation!
There are many useful aspects of interactive identification protocols that are lost by signature schemes:
- A protocol run is only meaningful for the two parties involved in the interaction (Alice and Bob). By contrast, a signature is equally valid for everyone.
- A protocol run is specific to a given point in time. Alice’s response is to a specific challenge issued by Bob just prior. A signature can be verified at any time.
These points may sound like bonuses for signature schemes, but they are actually drawbacks in many cases. Signatures are often used for authentication, where we actually want things to be tied to a specific interaction. This lack of context is why standards like JWT have to add lots of explicit claims: audience and issuer checks to ensure the JWT came from the expected source and arrived at the intended destination, and expiry times or unique identifiers (which have to be remembered) to prevent replay attacks. A significant proportion of JWT vulnerabilities in the wild are caused by developers forgetting to perform these checks.
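For illustration, here is roughly the kind of checking a JWT verifier has to remember to do after the signature itself checks out. This is a hypothetical sketch: `check_claims` and its details are mine, not from any JWT library, and real `aud` claims can also be arrays.

```python
import time

# A replay cache of seen token IDs (this would need to be persistent and
# shared across servers in a real deployment).
seen_jtis: set[str] = set()

def check_claims(claims: dict, expected_issuer: str, expected_audience: str) -> None:
    """Raise if the (already signature-verified) claims lack context."""
    if claims.get("iss") != expected_issuer:
        raise ValueError("token not from the expected issuer")
    if claims.get("aud") != expected_audience:
        raise ValueError("token not intended for this service")
    if claims.get("exp", 0) <= time.time():
        raise ValueError("token has expired")
    jti = claims.get("jti")
    if jti is None or jti in seen_jtis:
        raise ValueError("missing or replayed token ID")
    seen_jtis.add(jti)
```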
WebAuthn is another example of this phenomenon. On paper it is a textbook case of an identification protocol. But because it is built on top of digital signatures it requires adding a whole load of “contextual bindings” for similar reasons to JWTs. Ironically, the most widely used WebAuthn signature algorithm, ECDSA, is itself a Schnorr-ish scheme.
TLS also uses signatures for what is essentially an identification protocol, and similarly has had a range of bugs due to insufficient context-binding information being included in the signed data. (SSL also uses signatures for verifying certificates, which is IMO a perfectly good use of the technology. Certificates are exactly a case where you want to convert an interactive protocol into a non-interactive one. But then again, we also do an interactive protocol (DNS) in that case anyway :shrug:).
In short, an awful lot of uses of digital signatures are actually identification schemes of one form or another and would be better off using an actual identification scheme. But that doesn’t mean using something like Schnorr’s protocol! There are actually better alternatives that I’ll come back to at the end.
Special Soundness: fragility by design
Before I look at alternatives, I want to point out that pretty much all in-use signature schemes are extremely fragile in practice. The security of Schnorr identification as a proof of knowledge rests on a property called special soundness. Special soundness essentially says that if Alice accidentally reuses the same commitment (R) for two runs of the protocol, then any observer can recover her private key.
This sounds like an incredibly fragile notion to build into your security protocol! If I accidentally reuse this random value then I leak my entire private key??! And in fact it is: such nonce-reuse bugs are extremely common in deployed signature systems, and have led to the compromise of lots of private keys (e.g., the Sony PlayStation 3, various Bitcoin wallets, and so on).
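Here is the attack with the toy Schnorr parameters from earlier: given two responses to different challenges under the same commitment, the private key falls out of a single modular subtraction.

```python
import secrets

p, q, g = 23, 11, 4                  # toy group from earlier

a = secrets.randbelow(q - 1) + 1     # victim's private key
r = secrets.randbelow(q - 1) + 1     # ephemeral key, accidentally reused

# Two runs with the same commitment R = g^r but different challenges.
c1, c2 = 3, 7
s1 = (a * c1 + r) % q
s2 = (a * c2 + r) % q

# Observer subtracts: s1 - s2 = a * (c1 - c2) (mod q), then solves for a.
recovered = ((s1 - s2) * pow(c1 - c2, -1, q)) % q
assert recovered == a
```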
But despite its fragility, this notion of special soundness is crucial to the security of many signature systems. They are truly a cursed technology!
To solve this problem, some implementations and newer standards like EdDSA use deterministic commitments, based on a hash of the private key and the message. The commitment then only ever repeats if the message is identical, in which case the two runs produce the exact same signature and nothing new leaks, so the private key cannot be recovered. Unfortunately, such schemes turned out to be more susceptible to fault-injection attacks (a much less scalable or general attack vector), and so now there are “hedged” schemes that inject a bit of randomness back into the hash. It’s cursed turtles all the way down.
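Sketching both flavours of commitment derivation (my own illustration of the shape of the idea; real schemes specify the exact hashing and encoding, which this does not):

```python
import hashlib
import secrets

q = 11  # toy subgroup order from the earlier sketches

def deterministic_nonce(private_key: bytes, message: bytes) -> int:
    # EdDSA-style: r is a hash of the private key and the message, so the
    # same message always yields the same (harmless) commitment.
    h = hashlib.sha512(private_key + message).digest()
    return int.from_bytes(h, "big") % q

def hedged_nonce(private_key: bytes, message: bytes) -> int:
    # "Hedged": mix fresh randomness back in, so two signing operations
    # never share a nonce even if a fault corrupts one of them.
    z = secrets.token_bytes(32)
    h = hashlib.sha512(z + private_key + message).digest()
    return int.from_bytes(h, "big") % q
```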
If your answer to this is to go back to good old RSA signatures, don’t be fooled. There are plenty of ways to blow your foot off using old faithful, but that’s for another post.
Did you want non-repudiation with that?
Another way that signatures cause issues is that they are too powerful for the job they are used for. You just wanted to authenticate that an email came from a legitimate server, but now you are providing irrefutable proof of the provenance of leaked private communications. Oops!
Signatures are very much the hammer of cryptographic primitives. As well as authenticating a message, they also provide third-party verifiability and (part of) non-repudiation.
You don’t need to explicitly want anonymity or deniability to understand that these strong security properties can have damaging and unforeseen side-effects. Non-repudiation should never be the default in open systems.
I could go on. From the fact that there are basically zero acceptable post-quantum signature schemes (all way too large or too risky), to issues with non-canonical signatures and cofactors and on and on. The problems of signature schemes never seem to end.
What to use instead?
Ok, so if signatures are so bad, what can I use instead?
Firstly, if you can get away with using a simple shared secret scheme like HMAC, then do so. In contrast to public key crypto, HMAC is possibly the most robust crypto primitive ever invented. You’d have to go really far out of your way to screw up HMAC. (I mean, there are timing attacks and that time that Bouncy Castle confused bits and bytes and used 16-bit HMAC keys, so still do pay attention a little bit…)
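For instance, with Python’s standard library (note the constant-time comparison):

```python
import hashlib
import hmac
import secrets

key = secrets.token_bytes(32)        # shared secret between the two parties
message = b"an important message"

# Sender computes the tag and sends (message, tag).
tag = hmac.new(key, message, hashlib.sha256).digest()

# Recipient recomputes the tag and compares in constant time, avoiding
# the classic timing attack on a naive == comparison.
expected = hmac.new(key, message, hashlib.sha256).digest()
assert hmac.compare_digest(tag, expected)
```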
If you need public key crypto, then… still use HMAC. Use an authenticated KEM with X25519 to generate a shared secret and use that with HMAC to authenticate your message. This is essentially public key authenticated encryption without the actual encryption. (Some people mistakenly refer to such schemes as designated verifier signatures, but they are not).
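Here is a rough in-memory sketch of that idea using the pyca/cryptography X25519 primitives, assuming both parties already know each other’s long-term public keys. This shows the shape of the construction, not a vetted protocol: there is no transcript binding, replay protection, or key-compromise analysis here, and the HKDF labels are made up for the example.

```python
import hashlib
import hmac

from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric.x25519 import X25519PrivateKey
from cryptography.hazmat.primitives.kdf.hkdf import HKDF

def derive_key(dh1: bytes, dh2: bytes) -> bytes:
    return HKDF(algorithm=hashes.SHA256(), length=32, salt=None,
                info=b"authkem-hmac-demo").derive(dh1 + dh2)

# Long-term key pairs.
sender = X25519PrivateKey.generate()
recipient = X25519PrivateKey.generate()

# Sender: fresh ephemeral key, then two Diffie-Hellman computations --
# ephemeral-static for freshness, static-static for sender authentication.
eph = X25519PrivateKey.generate()
key = derive_key(eph.exchange(recipient.public_key()),
                 sender.exchange(recipient.public_key()))
message = b"the message to authenticate"
tag = hmac.new(key, message, hashlib.sha256).digest()
# Sender transmits: eph.public_key(), message, tag.

# Recipient recomputes the same key and checks the tag.
key2 = derive_key(recipient.exchange(eph.public_key()),
                  recipient.exchange(sender.public_key()))
assert hmac.compare_digest(tag, hmac.new(key2, message, hashlib.sha256).digest())
```

Note that only someone holding the sender’s or recipient’s private key could have computed that tag, which is exactly the “authentication without third-party verifiability” property the signature-based designs above lack.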
Signatures are good for software/firmware updates and pretty terrible for everything else.