CRDT 互动介绍

CRDT 互动介绍
An Interactive Intro to CRDTs (2023)

原始链接: https://jakelazaroff.com/words/an-interactive-intro-to-crdts/

本文探讨了无冲突复制数据类型 (CRDT)，这是一种为构建无中心服务器的协作应用程序而设计的數據結構。CRDT 允许多个用户独立更新数据，保证所有副本最终一致性。重点是*基于状态*的 CRDT，它们传输完整状态并进行合并。作者详细介绍了从头开始构建 CRDT 的过程，从最简单的开始：最后写入胜出 (LWW) 寄存器，它通过接受基于时间戳的最新更新来解决冲突。然后扩展到 LWW 映射，允许使用多个键值对，每个键值对由一个 LWW 寄存器管理。交换律、结合律和幂等性等关键概念被解释为 CRDT 合并函数的基本属性。 LWW 映射利用“墓碑”——保留已删除键的元数据——以防止同步过程中发生意外数据丢失。虽然功能强大，但 CRDT 被描述为单调递增的，这意味着数据只能添加，而不能真正删除。本文为实际应用奠定了基础：使用这些基础 CRDT 构建协作像素艺术编辑器。

一场 Hacker News 的讨论集中在协作软件中使用无冲突复制数据类型 (CRDT) 的实用性上。最初的一条评论告诫不要广泛采用 CRDT，而是倾向于使用中心化服务器来实现权限和锁定等功能，但其他人对此表示异议。具体而言，有人质疑 Google Docs 和 Figma *不* 使用 CRDT 的说法。引用了一篇 2019 年的 Figma 博客文章，揭示他们的系统*受到* CRDT 的启发，但并非纯粹的实现。讨论强调 CRDT 在去中心化环境中表现出色，能够抵御服务器中断——这种优势被比作在数据中心遭受攻击时也能生存。最终，该帖子指出存在一种权衡：CRDT 实现起来很复杂，但能提供强大的鲁棒性，而中心化系统更简单，但容易受到单点故障的影响。Google Docs 早于 CRDT 的广泛应用，并使用了操作转换 (OT)。

Have you heard about CRDTs and wondered what they are? Maybe you’ve looked into them a bit, but ran into a wall of academic papers and math jargon? That was me before I started my Recurse Center The Recurse Center The Recurse Center is a self-directed, community-driven educational retreat for programmers in New York City. www.recurse.com/ batch. But I’ve spent the past month or so doing research and writing code, and it turns out that you can build a lot with just a few simple things!

In this series, we’ll learn what a CRDT is. Then we’ll write a primitive CRDT, compose it into more complex data structures, and finally use what we’ve learned to build a collaborative pixel art editor. All of this assumes no prior knowledge about CRDTs, and only a rudimentary knowledge of TypeScript.

To pique your curiosity, this is what we’ll end up with:

Draw by clicking and dragging with your mouse. Change the paint color by using the color input on the bottom left. You can draw on either canvas and your changes will show up on the other, as if they were collaborating on the same picture.

Clicking the network button prevents changes from reaching the other canvas (although they’ll sync up again when they come back “online”). The latency slider adds a delay before changes on one canvas show up on the other.

We’ll build that in the next post. First, we need to learn about CRDTs!

What is a CRDT?

Okay, let’s start from the top. CRDT stands for “Conflict-free Replicated Data Type”. That’s a long acronym, but the concept isn’t too complicated. It’s a kind of data structure that can be stored on different computers (peers). Each peer can update its own state instantly, without a network request to check with other peers. Peers may have different states at different points in time, but are guaranteed to eventually converge on a single agreed-upon state. That makes CRDTs great for building rich collaborative apps, like Google Docs and Figma — without requiring a central server to sync changes.

Broadly, there are two kinds of CRDTs: state-based and operation-based.¹ State-based CRDTs transmit their full state between peers, and a new state is obtained by merging all the states together. Operation-based CRDTs transmit only the actions that users take, which can be used to calculate a new state.

That might make operation-based CRDTs sound way better. For example, if a user updates one item in a list, an operation-based CRDT can send a description of only that update, while a state-based CRDT has to send the whole list! The drawback is that operation-based CRDTs impose constraints on the communication channel: messages must be delivered exactly once, in causal order, to each peer.²

This post will exclusively focus on on state-based CRDTs. For brevity, I’ll just say “CRDTs” from here on out, but know that I’m referring specifically to state-based CRDTs.

I’ve been talking about what CRDTs do, but what is a CRDT? Let’s make it concrete: a CRDT is any data structure that implements this interface:³

interface CRDT<T, S> {
  value: T;
  state: S;
  merge(state: S): void;
}

That is to say, a CRDT contains at least three things:

A value, T. This is the part the rest of our program cares about. The entire point of the CRDT is to reliably sync the value between peers.
A state, S. This is the metadata needed for peers to agree on the same value. To update other peers, the whole state is serialized and sent to them.
A merge function. This is a function that takes some state (probably received from another peer) and merges it with the local state.

The merge function must satisfy three properties to ensure that all peers arrive at the same result (I’ll use the notation A ∨ B to indicate merging state A into state B):

Commutativity: states can be merged in any order; A ∨ B = B ∨ A. If Alice and Bob exchange states, they can each merge the other’s state into their own and arrive at the same result.
Associativity: when merging three (or more) states, it doesn’t matter which are merged first; (A ∨ B) ∨ C = A ∨ (B ∨ C). If Alice receives states from both Bob and Carol, she can merge them into her own state in any order and the result will be the same.⁴
Idempotence: merging a state with itself doesn’t change the state; A ∨ A = A. If Alice merges her own state with itself, the result will be the same state she started with.

Mathematically proving that a merge function has all these properties might sound hard. But luckily, we don’t have to do that! Instead, we can just combine CRDTs that already exist, leaning on the fact that someone has proven these things for us.

Speaking of CRDTs that already exist: let’s learn about one!

Last Write Wins Register

A register is a CRDT that holds a single value. There are a couple kinds of registers, but the simplest is the Last Write Wins Register (or LWW Register).

LWW Registers, as the name suggests, simply overwrite their current value with the last value written. They determine which write occurred last using timestamps, represented here by integers that increment whenever the value is updated.⁵ Here’s the algorithm:

If the received timestamp is less than the local timestamp, the register doesn’t change its state.
If the received timestamp is greater than the local timestamp, the register overwrites its local value with the received value. It also stores the received timestamp and some sort of identifier unique to the peer that last wrote the value (the peer ID).
Ties are broken by comparing the local peer ID to the peer ID in the received state.

Try it out with the playground below.

Did you get a sense for how LWW Registers work? Here are a couple specific scenarios to try:

Turn the network off, make a bunch of updates to bob, and then turn it back on. When you send updates from alice, they’ll be rejected until the timestamp exceeds bob’s timestamp.
Perform the same setup, but once you turn the network back on, send an update from bob to alice. Notice how the timestamps are now synced up and alice can write to bob again!
Increase the latency and send an update from both peers simultaneously. alice will accept bob’s update, but bob will reject alice’s. Since bob’s peer ID is greater, it breaks the timestamp tie.

Here’s the code for the LWW Register:

class LWWRegister<T> {
  readonly id: string;
  state: [peer: string, timestamp: number, value: T];

  get value() {
    return this.state[2];
  }

  constructor(id: string, state: [string, number, T]) {
    this.id = id;
    this.state = state;
  }

  set(value: T) {
    
    this.state = [this.id, this.state[1] + 1, value];
  }

  merge(state: [peer: string, timestamp: number, value: T]) {
    const [remotePeer, remoteTimestamp] = state;
    const [localPeer, localTimestamp] = this.state;

    
    if (localTimestamp > remoteTimestamp) return;

    
    if (localTimestamp === remoteTimestamp && localPeer > remotePeer) return;

    
    this.state = state;
  }
}

Let’s see how this stacks up to the CRDT interface:

state is a tuple of the peer ID that last wrote to the register, the timestamp of the last write and the value stored in the register.
value is simply the last element of the state tuple.
merge is a method that implements the algorithm described above.

LWW Registers have one more method named set, which is called locally to set the register’s value. It also updates the local metadata, recording the local peer ID as the last writer and incrementing the local timestamp by one.

That’s it! Although it appears simple, the humble LWW Register is a powerful building block with which we can create actual applications.

Last Write Wins Map

Most programs involve more than one value,⁶ which means we’ll need a more complex CRDT than the LWW Register. The one we’ll learn about today is called the Last Write Wins Map (or LWW Map).

Let’s start by defining a couple types. First, our value type:

type Value<T> = {
  [key: string]: T;
};

If each individual map value holds type T, then the value of the entire LWW Map is a mapping of string keys to T values.

Here’s our state type:

type State<T> = {
  [key: string]: LWWRegister<T | null>["state"];
};

Do you see the trick? From our application’s perspective, the LWW Map just holds normal values — but it actually holds LWW Registers. When we look at the full state, each key’s state is the state of the LWW Register at that key.⁷

I want to pause on that for a moment, because it’s important. Composition lets us combine primitive CRDTs into more complex ones. When it’s time to merge, all the parent does is pass slices of incoming state to the appropriate child’s merge function. We can nest this process as many times as we want; each complex CRDT passing ever-smaller slices of state down to the next level, until we finally hit a primitive CRDT that performs the actual merge.

From this perspective, the LWW Map merge function is simple: iterate through each key and hand off the incoming state at that key to the corresponding LWW Register to merge. Try it out in the playground below:

It’s kind of difficult to trace what’s happening here, so let’s split up the state for each key. Note, though, that this is just a visualization aid; the full state is still being transmitted as a single unit.

Try increasing the latency and then updating different keys on each peer. You’ll see that each peer accepts the updated value with a higher timestamp, while rejecting the value with a lower timestamp.

The full LWW Map class is kinda beefy, so let’s go through each property one by one. Here’s the start of it:

class LWWMap<T> {
  readonly id = "";
  #data = new Map<string, LWWRegister<T | null>>();

  constructor(id: string, state: State<T>) {
    this.id = id;

    
    for (const [key, register] of Object.entries(state)) {
      this.#data.set(key, new LWWRegister(this.id, register));
    }
  }
}

#data is a private property holding a map of the keys to LWW Register instances. To instantiate a LWW Map with preexisting state, we need to iterate through the state and instantiate each LWW Register.

Remember, CRDTs need three properties: value, state and merge. We’ll look at value first:

  get value() {
    const value: Value<T> = {};

    
    for (const [key, register] of this.#data.entries()) {
      if (register.value !== null) value[key] = register.value;
    }

    return value;
  }

It’s a getter that iterates through the keys and gets each register’s value. As far as the rest of the app is concerned, it’s just normal map!

Now let’s look at state:

  get state() {
    const state: State<T> = {};

    
    for (const [key, register] of this.#data.entries()) {
      if (register) state[key] = register.state;
    }

    return state;
  }

Similar to value, it’s a getter that builds up a map from each register’s state.

There’s a clear trend here: iterating through the keys in #data and handing things off to the register stored at that key. You’d think merge would work the same way, but it’s a little more involved:

  merge(state: State<T>) {
    
    for (const [key, remote] of Object.entries(state)) {
      const local = this.#data.get(key);

      
      if (local) local.merge(remote);
      
      else this.#data.set(key, new LWWRegister(this.id, remote));
    }
  }

First, we iterate through the incoming state parameter rather than the local #data. That’s because if the incoming state is missing a key that #data has, we know that we don’t need to touch that key.⁸

For each key in the incoming state, we get the local register at that key. If we find one, the peer is updating an existing key that we already know about, so we call that register’s merge method with the incoming state at that key. Otherwise, the peer has added a new key to the map, so we instantiate a new LWW Register using the incoming state at that key.

In addition to the CRDT methods, we need to implement methods more commonly found on maps: set, get, delete and has.

Let’s start with set:

  set(key: string, value: T) {
    
    const register = this.#data.get(key);

    
    if (register) register.set(value);
    
    else this.#data.set(key, new LWWRegister(this.id, [this.id, 1, value]));
  }

Just like in the merge method, we’re either calling the register’s set to update an existing key, or instantiating a new LWW Register to add a new key. The initial state uses the local peer ID, a timestamp of 1 and the value passed to set.

get is even simpler:

  get(key: string) {
    return this.#data.get(key)?.value ?? undefined;
  }

Get the register from the local map, and return its value if it has one.

Why coalesce to undefined? Because each register holds T | null. And with the delete method, we’re ready to explain why:

  delete(key: string) {
    
    this.#data.get(key)?.set(null);
  }

Rather than fully removing the key from the map, we set the register value to null. The metadata is kept around so we can disambiguate deletions from states that simply don’t have a key yet. These are called tombstones — the ghosts of CRDTs past.

Consider what would happen if we really did delete the keys from the map, rather than leaving a tombstone. Here’s a playground where peers can add keys, but not delete them. Can you figure out how to get a peer to delete a key?

Turn off the network, add a key to alice’s map, then turn the network back on. Finally, make a change to bob’s map. Since alice sees that the incoming state from bob is missing that key, she removes it from her own state — even though bob never knew about that key in the first place. Whoops!

Here’s a playground with the correct behavior. You can also see what happens when a key is deleted.

Notice how we never remove deleted keys from the map. This is one drawback to CRDTs — we can only ever add information, not remove it. Although from the application’s perspective the key has been fully deleted, the underlying state still records that the key was once there. In technical terms, we say that CRDTs are monotonically increasing data structures.⁹

The final LWW Map method is has, which returns a boolean indicating whether the map contains a given key.

  has(key: string) {
    
    return this.#data.get(key)?.value !== null;
  }

There’s a special case here: if the map contains a register at the given key, but the register contains null, the map is considered to not contain the key.

For posterity, here’s the full LWW Map code:

class LWWMap<T> {
  readonly id: string;
  #data = new Map<string, LWWRegister<T | null>>();

  constructor(id: string, state: State<T>) {
    this.id = id;

    
    for (const [key, register] of Object.entries(state)) {
      this.#data.set(key, new LWWRegister(this.id, register));
    }
  }

  get value() {
    const value: Value<T> = {};

    
    for (const [key, register] of this.#data.entries()) {
      if (register.value !== null) value[key] = register.value;
    }

    return value;
  }

  get state() {
    const state: State<T> = {};

    
    for (const [key, register] of this.#data.entries()) {
      if (register) state[key] = register.state;
    }

    return state;
  }

  has(key: string) {
    return this.#data.get(key)?.value !== null;
  }

  get(key: string) {
    return this.#data.get(key)?.value;
  }

  set(key: string, value: T) {
    
    const register = this.#data.get(key);

    
    if (register) register.set(value);
    
    else this.#data.set(key, new LWWRegister(this.id, [this.id, 1, value]));
  }

  delete(key: string) {
    
    this.#data.get(key)?.set(null);
  }

  merge(state: State<T>) {
    
    for (const [key, remote] of Object.entries(state)) {
      const local = this.#data.get(key);

      
      if (local) local.merge(remote);
      
      else this.#data.set(key, new LWWRegister(this.id, remote));
    }
  }
}

Next Steps

We now have two CRDTs in our toolbox: LWW Register and LWW Map. How do we actually use them? We’ll cover that in the next post: Building a Collaborative Pixel Art Editor with CRDTs Building a Collaborative Pixel Art Editor with CRDTs | jakelazaroff.com CRDTs sound cool, but how are they actually used? Let's learn by building a collaborative pixel art editor. jakelazaroff.com/words/building-a-collaborative-pixel-art-editor-with-crdts/ .