SQLite 通过预排序提升性能
SQLite improving performance with pre-sort

原始链接: https://andersmurphy.com/2026/06/07/sqlite-improving-performance-with-pre-sort.html

本文探讨了在使用 B+ 树索引时,使用无序随机标识符(如安全的 160 位令牌)会导致数据库插入性能下降的问题。由于 B+ 树是有序的,插入完全随机的数据会引发频繁的页面分裂和树结构重平衡,从而显著拖慢写入操作。 作者演示了通过在插入前对批量随机数据进行排序,可以缓解这些问题。通过使用 8 字节无符号比较对随机标识符进行内存中排序,作者实现了比未排序随机插入快 2 到 3 倍的插入速度。 核心结论是,在无法使用 UUID7 等顺序替代方案(例如由于熵要求)处理无序主键时,预先对数据批次进行排序是一种极有效的优化策略,能够降低数据库开销。

Hacker News最新 | 过往 | 评论 | 提问 | 展示 | 招聘 | 提交登录SQLite通过预排序提升性能 (andersmurphy.com)11分 由 tosh 2小时前发布 | 隐藏 | 过往 | 收藏 | 讨论 帮助 指导方针 | 常见问题 | 列表 | API | 安全 | 法律 | 申请YC | 联系 搜索:
相关文章

原文


In the last post we showed how the randomness of UUID4 can have a large impact on insert speed and how UUID7 can help. But, what about other data that might have random qualities? Where we can't reach for UUID7 to solve our problems?

Random data

We will use a 160-bit (20 byte) random value generated by SecureRandom similar to the one described in this article. Why? Because for things like session tokens you probably do not want to use UUID7 (leaks information, might not have enough entropy for your use case etc). This could also represent any data that has a primary key that is random (or more specifically unordered).

Here's the code to generate them:

(defn random-unguessable-uid []
  (let [buffer (byte-array 20)]
    (.nextBytes secure-random buffer)))

Let's see how this performs.

(d/q writer
  ["CREATE TABLE IF NOT EXISTS event(id BLOB PRIMARY KEY, data BLOB) WITHOUT ROWID"])
  
(dotimes [_ 10]
  (time
    (d/with-write-tx [db writer]
      (dotimes [_ 1000000]
        (d/q db ["INSERT INTO event (id, data) values (?, ?)"
                 (random-unguessable-id) data])))))

Results:

total rowstime in ms
10000002478
20000004927
30000006262
40000007195
50000008257
60000008704
70000009244
80000009771
900000010387
1000000011103

Around a hundred thousand inserts per second. Just like with UUID4 things are slow.

Pre sort

The defining characteristic of a B+ Tree is that it is ordered. Sequential writes are fast. Random data works against this thrashing pages, causing page splits and tree re-balancing. It's not a good time.

But, we are batching the data already. So what if we sort it before we insert?

First we need a fast way to compare our random id. It's 20 bytes, we don't want to iterate over them, so instead we just take the first 8 bytes and convert them into a long. We probably don't need to compare the entire byte array to get a good enough sort. The important part is this comparison is unsigned to match SQLite.

When two BLOB values are compared, the result is determined using memcmp().

Note: This probably isn't the fastest way to sort random data. I don't write blog post with an internet connection (it's too distracting). Search is also terrible these days. So mostly work with good old offline docs with fuzzy text search using dash (zeal is the linux equivalent). If you know a faster/better way to sort byte arrays in Java/Clojure let me know!

(defn bytes->long [^bytes bytes]
  (-> (ByteBuffer/wrap bytes 0 8)
    (ByteBuffer/.getLong 0)))
    
(defn byte-compare
  "Compares the first 8 most significant bytes of a byte array.
  Big Endian (matches SQLites blob sort)."
  [a b]
    (Long/compareUnsigned
      (bytes->long a)
      (bytes->long b)))

Let's see how this performs:

(d/q writer
  ["CREATE TABLE IF NOT EXISTS event(id BLOB PRIMARY KEY, data BLOB) WITHOUT ROWID"])

(dotimes [_ 10]
  (time
    (d/with-write-tx [db writer]
      (->> (repeatedly 1000000 random-unguessable-id)
        (sort byte-compare)
        (run!
          (fn [id]
            (d/q db ["INSERT INTO event (it, data) values (?, ?)"
                     id data])))))))

Results:

total rowstime in ms
10000001987
20000002251
30000002296
40000002614
50000002687
60000003244
70000003118
80000003311
90000003485
100000003835

Fun! Despite the overhead, sorting the batch improves performance by around 2-3x.

Conclusion

Hopefully, this post helps show you how batching data opens up useful optimisations like pre-sort when dealing with unordered data.

The full benchmark code can be found here.

Thanks to Everyone on the Datastar discord who read drafts of this and gave me feedback.

Discussion

联系我们 contact @ memedata.com