How DSQL makes sure sequences scale

Original link: https://blog.benjscho.dev/technical/2026/02/13/dsql-sequences.html

## DSQL Sequence Support: Summary

DSQL now supports sequences for generating unique identifiers, offering an alternative to UUIDs, especially when numeric IDs are needed. While UUIDs remain the ideal choice at truly massive scale, sequences are a viable option for many use cases. Sequences work much like they do in Postgres: they increment and cache values to minimize disk I/O. In DSQL’s distributed architecture, however, every `nextval()` call traverses the data’s full “circle of life” to ensure consistency. Scaling sequences is challenging because they are a classic “hot key”, but DSQL leverages horizontal scalability to mitigate this. The `CACHE` value is critical: DSQL recommends `CACHE=1` for densely packed, low-throughput sequences, or `CACHE>=65536` for scalability. Experiments show that a large cache (65536) clearly outperforms `CACHE=1` in bulk-insert and concurrent scenarios, achieving throughput comparable to UUIDs. Choose `CACHE=1` for low-volume inserts; high cache values ensure a sequence never becomes a bottleneck. The cache can be adjusted later with `ALTER SEQUENCE`. For applications that prioritize scalability above all, UUIDs remain the recommended approach.


## Original Article

Sequences are one of those Postgres features that you don’t think much about. You can ask for the next number in the sequence, and you get it. That works pretty well when you have one machine asking for the next number, but what about 10,000?

We’ve just launched Sequence support in DSQL and we’re excited about it. Up until now our recommendation has been to use UUIDs, and for truly massive scale it still is. But we recognize that there are plenty of places where you’d like to use a unique number as an identity on a table.

If you just want to get started with sequences, here’s how you make one in DSQL:

CREATE SEQUENCE my_sequence CACHE 65536;
SELECT nextval('my_sequence'); 

Or if you want to use it as an identity in a column:

CREATE TABLE orders (
  id BIGINT GENERATED BY DEFAULT AS IDENTITY (CACHE 65536) PRIMARY KEY,
  customer_name TEXT
);

This blog post is going to go into more detail about why sequences look the way they do in DSQL, but to get started, that’s all you need to know!

Sequences in DSQL

In single box (or single writer) SQL systems, sequences are very lightweight. They provide unique values without a unique index and without taking heavy locks. In Postgres, sequences are stored like any other data: in a table. That table is then stored for durability on disk, with writes and updates going through a log for crash recovery. When a backend process calls nextval(), it reads the value, increments it, and writes the new value back. To avoid going to disk too much, backends can also cache a number of values, set by the CACHE value of the sequence; we’ll come back to that later.
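
As a quick Postgres-level sketch (the sequence name here is illustrative), the cache size is just part of the DDL:

-- Each backend pre-allocates 3 sequence values at a time
CREATE SEQUENCE demo_seq CACHE 3;

SELECT nextval('demo_seq'); -- reserves 1-3 for this backend, returns 1
SELECT nextval('demo_seq'); -- served from the backend's local cache, returns 2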

In a distributed architecture things are a little less simple, but not by much! Marc’s blog on the circle of life is a good primer on DSQL’s architecture. When you call nextval(), the read goes to storage, checks the latest value, and increments the value before writing the update to the journal. So far so simple? The important thing to remember is that getting the next set of values in a sequence goes through a full circle of life.
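
Spelled out against that architecture, a single call conceptually looks like this (my annotation of the flow described above, not DSQL internals verbatim):

SELECT nextval('my_sequence');
-- 1. the QP reads the sequence's latest value from Storage
-- 2. it increments the value
-- 3. it writes the update to the Journal: one full circle of life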

In distributed systems, creating scalable applications is a mutual responsibility. The big problem with scaling sequences is that they’re a classic hot key. DSQL’s advantage is that we can support endless horizontal scaling at every component, but to scale, DSQL needs to spread your changes out across workers. For typical inserts we do that by partitioning the data. However, you can’t partition a single-row table.

CACHE to the rescue

Remember the CACHE value from sequences in Postgres? This sets how many values a given backend reserves each time it has to fetch from the sequence. So with CACHE=3, a backend fetches 3 values the first time it calls nextval(), and can then serve the following calls from its local cache without performing extra, expensive IO.

                         ┌─────────────┐
                         │    Disk     │
                         │  seq = 10   │
                         └──────┬──────┘
                                │
           ┌────────────────────┼────────────────────┐
           │                    │                    │
           ▼                    ▼                    ▼
    ┌─────────────┐      ┌─────────────┐      ┌─────────────┐
    │  Backend A  │      │  Backend B  │      │  Backend C  │
    │ cache: 1-3  │      │ cache: 4-6  │      │ cache: 7-9  │
    └─────────────┘      └─────────────┘      └─────────────┘
           │                    │                    │
           ▼                    ▼                    ▼
      nextval()=1          nextval()=4          nextval()=7
      nextval()=2          nextval()=5          nextval()=8
      nextval()=3          nextval()=6          nextval()=9

Each backend reserves a chunk of sequence values on its first call. Subsequent nextval() calls return from the local cache without going to disk. If a backend crashes, the unused values in its cache are discarded, so higher cache values can result in gaps in sequences.
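
To make the gap behaviour concrete, here’s a sketch of what two sessions might observe (sequence name and values are illustrative):

-- Session A (CACHE 3) reserves values 1-3:
SELECT nextval('gapped_seq'); -- returns 1
-- Session B (CACHE 3) reserves values 4-6:
SELECT nextval('gapped_seq'); -- returns 4
-- If session A crashes now, 2 and 3 are discarded forever:
-- values continue at 5, 6, 7, ..., leaving a permanent gap.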

DSQL parallelises our compute in the form of the Query Processor. Instead of individual backends, statements are executed by QPs. Here, CACHE functions the same as in Postgres. When a QP calls nextval(), it gets a cached set of values and hands them out. So now to the elephant in the room for DSQL sequence support: we only support CACHE=1 or CACHE>=65536.
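
Concretely, the DDL either asks for dense values or for scale, with nothing in between (the middle statement is my illustration of the constraint; the exact failure mode is an assumption):

CREATE SEQUENCE dense_seq CACHE 1;        -- accepted: dense, low throughput
CREATE SEQUENCE scalable_seq CACHE 65536; -- accepted: built to scale
CREATE SEQUENCE middle_seq CACHE 100;     -- rejected: only CACHE=1 or CACHE>=65536 is supported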

The point of these values is to highlight the decision for the developer. Either you want your sequence to be densely packed and low throughput, or you want it to be able to scale. With large cache values (>=65k), sequences are rarely a bottleneck in DSQL transactions.

What if I don’t need scale though?

That is totally fine too! Not every project is trying to hit 100k TPS. We also know there are plenty of applications where you have a slow rate of inserts and would prefer a dense, increasing sequence. That’s why DSQL supports CACHE=1.
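
If that’s your situation, a dense identity column is just the CACHE 1 spelling of the earlier example (table and column names are illustrative):

CREATE TABLE invoices (
  id BIGINT GENERATED BY DEFAULT AS IDENTITY (CACHE 1) PRIMARY KEY,
  customer_name TEXT
);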

To put some numbers on it, I ran some experiments.

I tested each of these with CACHE 1 and CACHE 65536, and included an example with UUID for a value that doesn’t require coordination. Since a UUID is always locally generated, there’s no way for it to conflict, so it serves as a good baseline.
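
The UUID baseline needs no DDL at all; I assume it’s simply the built-in generator:

SELECT gen_random_uuid(); -- generated locally, no round trip to Storage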

This is what the DDL for my first test looks like:

CREATE SEQUENCE seq_cache_1 CACHE 1;
CREATE SEQUENCE seq_cache_65536 CACHE 65536;

Here in the first test I’m creating two sequences, one with CACHE=1, and one with CACHE=65536. We’re then fetching new values serially, so we’re making one request to get a new value, waiting until we get it back, and then making another. The majority of the time is spent in network time waiting for the request to go from my laptop to DSQL’s QP and back. You’ll notice that the high cache value is faster, because the QP I’m connected to isn’t having to fetch an update from Storage every time, but it’s not faster by much. Comparing with UUID, you can see that’s pretty much the same as our high cache option.

------------------------------------------------------------
Experiment 1: Individual nextval() calls (100 iterations)
------------------------------------------------------------

CACHE=1:
  Mean:  20.26 ms
  Min:   17.94 ms
  Max:   97.10 ms
  Total: 2025.65 ms

CACHE=65536:
  Mean:  13.60 ms
  Min:   11.53 ms
  Max:   26.95 ms
  Total: 1360.24 ms

UUID:
  Mean:  13.31 ms
  Min:   11.81 ms
  Max:   27.11 ms
  Total: 1330.73 ms

Speedup with CACHE=65536 vs CACHE=1: 1.5x faster
Speedup with UUID vs CACHE=1: 1.5x faster

Okay great! But that doesn’t really tell us that much about how it scales. We’re just using a single connection and fetching one value at a time. Let’s look now at the case of a bulk insert. So here, we’re inserting 1000 rows into a table with a sequence:

-- Creating the table
CREATE TABLE bench_table (id BIGINT, data TEXT);

-- My insert statements look like this:
INSERT INTO bench_table (id, data)
SELECT nextval('seq_cache_1'), 'row ' || g
FROM generate_series(1, 1000) g;
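
For the UUID comparison I assume an analogous statement against a UUID-keyed copy of the table (the table name is my invention):

-- Assumed UUID variant of the benchmark table:
CREATE TABLE bench_table_uuid (id UUID, data TEXT);

INSERT INTO bench_table_uuid (id, data)
SELECT gen_random_uuid(), 'row ' || g
FROM generate_series(1, 1000) g;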

We’re still running on one connection, but now we’re running a bulk insert of 1000 rows instead of fetching the nextval a bunch of times. So what does that look like?

------------------------------------------------------------
Experiment 2: Bulk INSERT with 1000 rows
------------------------------------------------------------

CACHE=1:     6229.11 ms total
CACHE=65536: 83.16 ms total
UUID:        77.24 ms total

Sequence CACHE=65536 vs CACHE=1: 74.9x faster
UUID vs CACHE=1: 80.6x faster

Well that’s a big difference! The reason for this is that incrementing a sequence doesn’t follow the transaction semantics that we have for other values. It would be strange if something like:

BEGIN;
select nextval('my_seq'); -- returns 4
ROLLBACK;
select nextval('my_seq'); -- returns 4 again ??

were to happen. To preserve the expectation that sequences always return a unique, incrementing value, under the hood new values are fetched by special internal transactions. This means that every call to fetch a nextval goes through the DSQL circle of life. With that in mind, the results for the bulk insert on a single connection make sense! For CACHE=1, even though we’re only on one connection, the QP has to go through the full loop for each row: fetching a value from storage, writing back to the journal, and waiting for the transaction to finish before the next value can be read. With a large CACHE value, our QP only needs to do that once. This is on a single-region cluster; on a multi-region cluster the difference would be even more marked, because we’d need to wait for the write to be committed to our second region.

Now that was still on a single connection; what about when we actually want to scale? How do sequences behave when we throw more connections at them? This experiment is the same as experiment one, except now it’s contested: instead of just one connection, let’s create 100, and have each of those fetch 100 next values.
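
Each worker simply issues the same statement in a loop over its own connection; a sketch of the per-worker shape (the actual harness isn’t shown in the post):

-- Run on each of the 100 connections, 100 times per connection:
SELECT nextval('seq_cache_1');     -- CACHE=1 run
SELECT nextval('seq_cache_65536'); -- CACHE=65536 run
SELECT gen_random_uuid();          -- UUID baseline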

------------------------------------------------------------
Experiment 3: Conflict test (100 workers x 100 nextvals each)
------------------------------------------------------------

CACHE=1:
  Total time:  51589.87 ms
  Throughput:  193.8 calls/sec
  Mean:        353.03 ms
  Min:         17.16 ms
  Max:         21182.18 ms
  Errors:      0

CACHE=65536:
  Total time:  3020.29 ms
  Throughput:  3310.9 calls/sec
  Mean:        17.46 ms
  Min:         10.57 ms
  Max:         1538.64 ms
  Errors:      0

UUID:
  Total time:  2902.25 ms
  Throughput:  3445.6 calls/sec
  Mean:        15.00 ms
  Min:         10.86 ms
  Max:         1439.94 ms
  Errors:      0

Throughput speedup with CACHE=65536 vs CACHE=1: 17.1x
Throughput speedup with UUID vs CACHE=1: 17.8x

The throughput difference is again just as marked. In the CACHE=1 case, the majority of internal transactions to fetch a cache value are conflicting. DSQL hides the internal details where these would show up as OCC errors; instead, the contention shows up as latency, as conflicts would in regular Postgres. With high cache values we have almost no contention. Comparing to our baseline of UUIDs, we can see the difference is minimal.

So what do I use?

If you want to use a sequence, our recommendation is to use a high cache value. It’s going to keep up with your scale and avoid becoming a bottleneck in your system. If you really want densely packed sequences and you don’t expect your table to ever run higher than a few transactions per second, then CACHE=1 will work just fine. If you change your mind or see it becoming a blocker down the line, you can always go back and fix it with:

ALTER SEQUENCE my_seq CACHE 65536;

But if you truly don’t want to worry about scale, just use UUIDs:

CREATE TABLE orders (
  id UUID DEFAULT gen_random_uuid() PRIMARY KEY,
  --...
);
