I had the pleasure of giving a talk on "XMPP and Metadata" during the last Chaos Communication Congress, in the Critical Decentralization Cluster area. It was my first public presentation in a very long while (and in English, too), so the talk went okay-ish at best. The end of the year was also hectic, and I did not manage to prepare or rehearse as much as I would have liked.
This blog post will be a longer, more complete version of the talk. You can nonetheless find the talk slides on the CDC pretalx. Thanks a lot to the people who proofread the blog post to fix stuff or suggest additional content.
The talk was about metadata, but also more broadly about data retention and what the server gets to see.
Obvious message workflow
This might be too obvious for most people, but for clarity’s sake, I want to assert that to send a message to another entity, you need:
- a sender
- a message
- a receiver
This is not a technical detail; it is baked into the very concept of sending a message. Those elements will always be present somewhere in the workflow. Assuming a working encryption system, the message content itself is out of scope here.
There are, however, some technical tricks that can hide a lot of things from the infrastructure layer.
XMPP
I cannot really make this an introduction to XMPP, but to summarize: XMPP is an extensible, federated protocol for messaging and presence. It uses XML for the most part, but nobody should care (except trolls, I guess). It started in 1999 as Jabber and became an IETF standard under the name XMPP in 2004; the Jabber company was later bought by Cisco (we can still use the Jabber name in many ways).
The protocol started out server-heavy with light clients (you will read as much in the book "XMPP: The Definitive Guide"), but the trend has reversed over the last decade, driven by the rise of mobile clients, which can be updated very often, and by other circumstances.
There are clients and servers, and it is therefore a protocol made of client-to-server interactions and server-to-server interactions, each with their own privacy implications.
The key elements to remember in order to assess threats in the XMPP network fabric would be:
- Your server is the only entity sending data to other servers. Every single bit of XML your XMPP client sends goes through your server.
- Other servers will only see your interactions with their own users.
Those two points are true in most non-P2P models. Centralized models can be thought of as a special case of this model, with a single server.
That is why it is essential to rely on a server you can trust, either operated by people you trust, or at least who have some accountability in place, for example the services listed on providers.xmpp.net.
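The two visibility rules above can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the addresses and domains are made up, and `route` is a hypothetical helper that records which (sender, receiver) pairs each server gets to observe, nothing more.

```python
from collections import defaultdict


def domain(jid: str) -> str:
    """Return the domain part of a bare address such as 'alice@a.example'."""
    return jid.split("@", 1)[1]


def route(messages):
    """Record, per server, the (sender, receiver) pairs it can observe."""
    seen = defaultdict(set)
    for sender, receiver in messages:
        # The sender's server relays every stanza its user emits.
        seen[domain(sender)].add((sender, receiver))
        # The receiver's server sees it too (the same server for local users).
        seen[domain(receiver)].add((sender, receiver))
    return seen


# Hypothetical traffic: alice talks to users on three different servers.
messages = [
    ("alice@a.example", "bob@b.example"),
    ("alice@a.example", "carol@c.example"),
    ("alice@a.example", "dave@a.example"),
]

seen = route(messages)
for server in sorted(seen):
    print(server, sorted(seen[server]))
```

In this model `a.example` observes all three conversations, while `b.example` and `c.example` each observe only the one involving their own user.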
Threats
I can roughly point out four types of "passive" threats on metadata for XMPP:
- A server compromise (present)
  - Correlation of data streams in real time
- A server compromise (future)
  - Exploitation of the static data available on the server
- An attacker present on the server network
  - Can see what the server does (both with clients and servers)
- An attacker on the client (your) network
  - Can see what your client does
Server compromise
Live stream interception
As stated earlier, choosing a server you trust is the very first step, and if you do not trust your server (and operators) at all, why are you there?
If the server itself is compromised, the point-to-point TLS encryption between clients and servers, and between servers, becomes essentially useless. Everything a "server network attacker" could only infer can now be observed with 100% precision, and more:
- Correlation between sender and receiver is exact
- The user’s XMPP address can be mapped to an IP and port
- The stanza type is exposed: whether <message/>, <iq/>, or <presence/> stanzas are sent, everything is known
- Activity patterns can get a lot more detailed
- E2EE still works to protect data (but can get disrupted)
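To make the list above concrete, here is a minimal sketch of what a compromised server could read from a single stanza even when the body is end-to-end encrypted. The stanza is hypothetical and heavily simplified (the encrypted element only stands in for a real OMEMO payload), and `visible_metadata` is an illustrative helper, not part of any XMPP library.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stanza: the payload is an opaque placeholder.
STANZA = """
<message xmlns='jabber:client' from='alice@a.example/phone'
         to='bob@b.example' type='chat' id='m1'>
  <encrypted xmlns='eu.siacs.conversations.axolotl'>
    <payload>b3BhcXVlIGNpcGhlcnRleHQ=</payload>
  </encrypted>
</message>
"""


def visible_metadata(raw: str) -> dict:
    """Collect what is readable without decrypting anything."""
    raw = raw.strip()
    root = ET.fromstring(raw)
    # ElementTree renders namespaced tags as '{namespace}name'.
    kind = root.tag.rsplit("}", 1)[-1]
    namespaces = sorted({child.tag.split("}")[0].lstrip("{") for child in root})
    return {
        "stanza": kind,                      # message, iq or presence
        "from": root.get("from"),            # full sender address
        "to": root.get("to"),                # receiver address
        "type": root.get("type"),
        "size": len(raw.encode("utf-8")),    # size hints at the content type
        "child_namespaces": namespaces,      # reveals which extensions are in use
    }


meta = visible_metadata(STANZA)
for key, value in meta.items():
    print(f"{key}: {value}")
```

The ciphertext stays unreadable, but the sender, receiver, stanza kind, size, and the fact that an encryption extension is in use are all in the clear for the server.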
Some solutions for XMPP
Other interesting solutions
Serverless messaging
XEP-0174 describes how to operate XMPP inside a trusted network, which bypasses the need for a server by using mDNS on this network and advertising your address with your local IP.
I liked it a lot and had fun with it at the time, but it comes from a blissful era when encryption was not seen as the foundation upon which everything should be built. Everything transits unencrypted on the network, so both metadata and data are unprotected. Data could be encrypted using OTR, but some building blocks required for the modern XMPP experience, like the Personal Eventing Protocol, are unavailable, making OMEMO in its current form a no-go.
Implementations are scarce nowadays: an incomplete XMPP layer inside a normal client is usually pretty convoluted to maintain, and after a while it was removed from most clients where it existed.
The modern replacement is RELOAD, but implementation and deployment are just starting, so it is a bit early to know whether it is a good solution or not.
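As a rough illustration of how much serverless messaging announces to the local network, the sketch below builds the kind of record a client would advertise over mDNS. It performs no network activity. The `_presence._tcp` service type and the TXT keys are written from memory of XEP-0174 and should be checked against the spec; the helper name and all values are hypothetical.

```python
def link_local_advertisement(user, machine, ip, port, nick, status="avail"):
    """Describe a link-local presence record (illustrative, not sent anywhere)."""
    return {
        # Service type and instance naming as recalled from XEP-0174.
        "service_type": "_presence._tcp.local.",
        "instance": f"{user}@{machine}",
        "address": ip,
        "port": port,
        # TXT record: a small subset of keys, assumed from the spec.
        "txt": {
            "txtvers": "1",
            "nick": nick,
            "status": status,
            "port.p2pj": str(port),
        },
    }


def exposed_to_lan(ad):
    """List what any host listening to mDNS on the same network learns."""
    facts = [
        f"who: {ad['instance']}",
        f"where: {ad['address']}:{ad['port']}",
    ]
    facts += [f"{key}: {value}" for key, value in sorted(ad["txt"].items())]
    return facts


ad = link_local_advertisement("juliet", "pronto", "192.168.2.2", 5562, "JuliC")
for fact in exposed_to_lan(ad):
    print(fact)
```

None of this is protected: identity, machine name, address, and availability are broadcast in the clear to everyone on the network segment.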
XTLS
XTLS describes a way to create a direct, encrypted TLS stream between two entities using Jingle, which could then be used to exchange stanzas securely without going through the server.
That means the metadata threats shift from server-based to client-based, which may or may not be an improvement. The layer used for the channel also matters. With In-Band Bytestreams, for example, the data still travels the usual client-server-server-client route: this adds E2EE cloaking of everything exchanged between the clients, but every entity along the path still routes the data.
Cryptographic identities
Strong cryptographic identities as outlined in XEP-0416 could serve as a framework for removing some of the requirements for trust in the server. I do find X.509 a bit too complex for what we want to do with it, but I did not dive very deep into this spec.
Other services/protocols
Signal
Signal is a pioneer in encryption systems at scale, and keeps pushing the boundaries of what is possible to do securely. Nonetheless, their messenger is centralized, with systems running on AWS and Azure (as far as I can tell), which makes them very dependent on the US political wasteland as well as the tech landlords’ whims.
They do a lot to ensure things are as secure as they can be within the constraints they imposed on themselves. I trust them for now, but their servers certainly have plenty of opportunities to collect and store a ton of metadata, simply because this is a centralized system. Their cryptography work is class-leading, which keeps my data secure (as long as nobody busts the secure enclave that protects my "recovery code", I guess), but believing that my metadata stays volatile and secure there is a leap of faith on my part, as I can have no guarantees.

Matrix
Matrix is a federated protocol which has many of the same flaws as XMPP with regard to metadata.
One notable difference is that Matrix is more like a distributed database with built-in conflict resolution, which leads to every participant’s server replicating the state (data and metadata) of the rooms they are in. This makes the metadata situation harder than in XMPP: XMPP servers can see what goes through them, whereas Matrix servers are required to store this information.
Matrix also has two different sets of APIs for client-to-server and server-to-server communication, which should allow it to batch messages when appropriate.
SimpleX
SimpleX is a protocol with a lot of cryptography baked in, and has interesting properties such as the absence of user accounts and therefore identifiers (which means very little data on the servers can be compromised).
One of its more interesting properties is that it has 2-stage onion routing baked in, which allows it to sidestep many issues around metadata due to connecting to servers.
(Figure: onion routing diagram. Credit: Wikipedia)
The whitepaper stresses that choosing your servers well is still important, but it is less critical than in XMPP, since you can switch easily and at very little cost.
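The two-hop idea mentioned above can be sketched with a toy model of who sees what. This is not SimpleX's actual protocol and contains no real cryptography: `Sealed` is a stand-in for a layer of encryption, and the relay names, queue identifier, and IP address are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Sealed:
    """Stand-in for ciphertext: only `recipient` may open it (no real crypto)."""
    recipient: str
    content: object

    def open(self, who: str):
        if who != self.recipient:
            raise PermissionError(f"{who} cannot open a layer sealed for {self.recipient}")
        return self.content


def wrap(message, queue, destination, forwarder):
    """Build two layers: the inner for the destination relay, the outer for the forwarder."""
    inner = Sealed(destination, {"queue": queue, "message": message})
    return Sealed(forwarder, {"next_hop": destination, "payload": inner})


def deliver(packet, sender_ip, forwarder, destination):
    """Walk the packet through both relays and record what each one learns."""
    views = {}
    outer = packet.open(forwarder)
    # First hop: knows the sender's network address, but not the target queue.
    views[forwarder] = {"peer": sender_ip, "next_hop": outer["next_hop"]}
    inner = outer["payload"].open(destination)
    # Second hop: knows the target queue, but only sees the forwarder as its peer.
    views[destination] = {"peer": forwarder, "queue": inner["queue"]}
    return views


# The message itself is sealed end to end for the recipient.
message = Sealed("recipient", "hello")
packet = wrap(message, "queue-42", "relay-d.example", "relay-f.example")
views = deliver(packet, "203.0.113.7", "relay-f.example", "relay-d.example")
for relay, view in views.items():
    print(relay, view)
```

In this model no single relay can link the sender's network address to the receiving queue, which is the property that reduces how much you have to trust any one server.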
(P.S.: calling your protocol "SMP" is not nice if it is not based on the Socialist Millionaire Protocol, I haven’t checked but skimming did not reveal any mention of it)
Conclusion
As Daniel noted on Mastodon, there is some low-hanging fruit for improving the metadata issues around XMPP (and some higher-hanging fruit as well). I agree that this is not going to be even a blip on XMPP adoption, but we should nonetheless do what we can to improve the situation, for the sake of both the standard and the ecosystem. That said, I can perfectly understand that, since a lot of the work is purely volunteer-driven and our time and energy are limited, dedicating them to removing bits of metadata here and there can appear to be a waste.