展示 HN: SNKV – SQLite 的 B 树作为键值存储(C/C++ 和 Python 绑定)
Show HN: SNKV – SQLite's B-tree as a key-value store (C/C++ and Python bindings)

原始链接: https://github.com/hash-anu/snkv

## SNKV:基于SQLite的轻量级键值存储 SNKV是一个快速、ACID兼容的嵌入式键值(KV)存储,直接构建在SQLite的B树引擎之上,绕过SQL层以提高性能。它提供SQLite级别的可靠性,并采用KV优先的设计,从而降低了读取密集型和混合工作负载的开销。 主要特性包括:单头文件集成,方便C/C++使用;通过`KVStoreConfig`配置选项;以及全面的测试——包括10GB的崩溃安全性测试。SNKV还提供Python绑定,用于类字典风格的API。 基准测试表明,在大多数KV操作中,SNKV的性能优于SQLite(使用`WITHOUT ROWID`),顺序扫描速度提升高达2倍,随机读取和存在性检查速度提升60-70%。它适用于读取密集型应用、嵌入式系统以及需要简洁KV API而无需SQL开销的场景。 虽然在纯写入吞吐量方面并非最快(推荐使用RocksDB),但SNKV利用现有的SQLite工具进行备份,并使用WAL模式实现并发和SSD优化。它采用Apache 2.0许可。

## SNKV:一个 SQLite B 树键值存储 SNKV 是一个新的 C/C++ 和 Python 库,它利用 SQLite 底层的 B 树引擎作为高性能的键值存储。它绕过了 SQLite 的 SQL 解析器、查询计划器和虚拟机,直接与 B 树交互以实现更快的操作。 基准测试表明,SNKV 在键值工作负载中显著优于标准 SQLite(没有 ROWID)——随机删除速度快高达 104%,混合工作负载速度快 84%。这种加速主要归功于维护一个持久的、缓存的读取游标,避免了 SQLite 反复打开和关闭游标的开销。 虽然 SNKV 提供 ACID 合规性、WAL 并发和列族,但它也承认存在权衡:LMDB 在原始读取方面表现更出色,而 RocksDB 更适合写入密集型任务。它也无法打开使用标准 SQLite CLI 创建的数据库,因为绕过了模式层。该项目旨在为读取密集型键值应用程序提供更低开销的解决方案,作者欢迎独立的基准测试验证。 [https://github.com/hash-anu](https://github.com/hash-anu)
相关文章

原文

Build Memory Leaks Tests Peak Memory GitHub Issues GitHub Closed Issues License


SNKV is a lightweight, ACID-compliant embedded key-value store built directly on SQLite's B-Tree storage engine — without SQL.

The idea: bypass the SQL layer entirely and talk directly to SQLite's storage engine. No SQL parser. No query planner. No virtual machine. Just a clean KV API on top of a proven, battle-tested storage core.

SQLite-grade reliability. KV-first design. Lower overhead for read-heavy and mixed key-value workloads.


Single-header integration — drop it in and go:

#define SNKV_IMPLEMENTATION
#include "snkv.h"

int main(void) {
    KVStore *db;
    kvstore_open("mydb.db", &db, KVSTORE_JOURNAL_WAL);

    kvstore_put(db, "key", 3, "value", 5);

    void *val; int len;
    kvstore_get(db, "key", 3, &val, &len);
    printf("%.*s\n", len, (char*)val);
    snkv_free(val);

    kvstore_close(db);
}

Use kvstore_open_v2 to control how the store is opened. Zero-initialise the config and set only what you need — unset fields resolve to safe defaults.

KVStoreConfig cfg = {0};
cfg.journalMode = KVSTORE_JOURNAL_WAL;   /* WAL mode (default) */
cfg.syncLevel   = KVSTORE_SYNC_NORMAL;   /* survives process crash (default) */
cfg.cacheSize   = 4000;                  /* ~16 MB page cache (default 2000 ≈ 8 MB) */
cfg.pageSize    = 4096;                  /* DB page size, new DBs only (default 4096) */
cfg.busyTimeout = 5000;                  /* retry 5 s on SQLITE_BUSY (default 0) */
cfg.readOnly    = 0;                     /* read-write (default) */

KVStore *db;
kvstore_open_v2("mydb.db", &db, &cfg);
Field Default Options
journalMode KVSTORE_JOURNAL_WAL KVSTORE_JOURNAL_DELETE
syncLevel KVSTORE_SYNC_NORMAL KVSTORE_SYNC_OFF, KVSTORE_SYNC_FULL
cacheSize 2000 pages (~8 MB) Any positive integer
pageSize 4096 bytes Power of 2, 512–65536; new DBs only
readOnly 0 1 to open read-only
busyTimeout 0 (fail immediately) Milliseconds; useful for multi-process use

kvstore_open remains fully supported and uses all defaults except journalMode.


make              # builds libsnkv.a
make snkv.h       # generates single-header version
make examples     # builds examples
make run-examples # run all examples
make test         # run all tests (CI suite)
make clean

Windows (MSYS2 / MinGW64)

1. Install MSYS2.

2. Launch "MSYS2 MinGW 64-bit" from the Start menu (not the plain MSYS2 terminal).

3. Install the toolchain:

pacman -S --needed mingw-w64-x86_64-gcc make

4. Clone and build:

git clone https://github.com/hash-anu/snkv.git
cd snkv
make              # builds libsnkv.a
make snkv.h       # generates single-header
make examples     # builds .exe examples
make run-examples
make test

All commands must be run from the MSYS2 MinGW64 shell. Running mingw32-make from a native cmd.exe or PowerShell window will not work — the Makefile relies on sh and standard Unix tools that are only available inside the MSYS2 environment.


Available on PyPI — no compiler needed:

from snkv import KVStore

with KVStore("mydb.db") as db:
    db["hello"] = "world"
    print(db["hello"].decode())   # world

Full documentation — installation, API reference, examples, and thread-safety notes — is in python/README.md.

SNKV Python API Demo


10 GB Crash-Safety Stress Test

A production-scale kill-9 test is included but kept separate from the CI suite. It writes unique deterministic key-value pairs into a 10 GB WAL-mode database, forcibly kills the writer with SIGKILL during active writes, and verifies on restart that every committed transaction is present with byte-exact values, no partial transactions are visible, and the database has zero corruption.

make test-crash-10gb          # run full 5-cycle kill-9 + verify (Linux / macOS)

# individual modes
./tests/test_crash_10gb write  tests/crash_10gb.db   # continuous writer
./tests/test_crash_10gb verify tests/crash_10gb.db   # post-crash verifier
./tests/test_crash_10gb clean  tests/crash_10gb.db   # remove DB files

Requires ~11 GB free disk. run mode is POSIX-only; write and verify work on all platforms.


Standard database path:

Application → SQL Parser → Query Planner → VDBE (VM) → B-Tree → Disk

SNKV path:

Application → KV API → B-Tree → Disk

By removing the layers you don't need for key-value workloads, SNKV keeps the proven storage core and cuts the overhead.

Layer SQLite SNKV
SQL Parser
Query Planner
VDBE (VM)
B-Tree Engine
Pager / WAL

1M records, Linux, averaged across 3 runs. Both SNKV and SQLite use identical settings: WAL mode, synchronous=NORMAL, 2000-page (8 MB) page cache, 4096-byte pages.

Benchmark source: SNKV · SQLite

SNKV vs SQLite (KV workloads)

SQLite benchmark uses WITHOUT ROWID with a BLOB primary key — the fairest possible comparison, both using a single B-tree keyed on the same field. Both run with identical settings: WAL mode, synchronous=NORMAL, 2000-page (8 MB) cache, 4096-byte pages. This isolates the pure cost of the SQL layer for KV operations.

Note: Both SNKV and SQLite (WITHOUT ROWID) use identical peak RSS (~10.8 MB) since they share the same underlying pager and page cache infrastructure.

Benchmark SQLite SNKV Notes
Sequential writes 140K ops/s 146K ops/s SNKV 1.05x faster
Random reads 87K ops/s 139K ops/s SNKV 1.6x faster
Sequential scan 1.61M ops/s 3.16M ops/s SNKV 2x faster
Random updates 17K ops/s 24K ops/s SNKV 1.4x faster
Random deletes 17K ops/s 20K ops/s SNKV 1.2x faster
Exists checks 87K ops/s 149K ops/s SNKV 1.7x faster
Mixed workload 35K ops/s 50K ops/s SNKV 1.4x faster
Bulk insert 211K ops/s 240K ops/s SNKV 1.1x faster

With identical storage configuration, SNKV wins across every benchmark. The gains come from two sources: bypassing the SQL layer (no parsing, no query planner, no VDBE) and a per-column-family cached read cursor that eliminates repeated cursor open/close overhead on the hot read path. The biggest wins are on read-heavy operations — random reads (+60%), exists checks (+70%), and sequential scan (+100%) — exactly where the cursor caching pays off most.


Running your own LMDB / RocksDB comparison

If you want to benchmark SNKV against LMDB or RocksDB, the benchmark harnesses are here:


SNKV is a good fit if:

  • Your workload is read-heavy or mixed (reads + writes)
  • You're running in a memory-constrained or embedded environment
  • You want a clean KV API without writing SQL strings, preparing statements, and binding parameters
  • You need single-header C integration with no external dependencies
  • You want predictable latency — no compaction stalls, no mmap tuning

Consider alternatives if:

  • You need maximum write/update/delete throughput → RocksDB (LSM-tree)
  • You need maximum read/scan speed and memory isn't a constraint → LMDB (memory-mapped)
  • You already use SQL elsewhere and want to consolidate → SQLite directly

  • ACID Transactions — commit / rollback safety

  • WAL Mode — concurrent readers + single writer

  • Column Families — logical namespaces within a single database

  • Iterators — ordered key traversal

  • Thread Safe — built-in synchronization

  • Single-header — drop snkv.h into any C/C++ project

  • Zero memory leaks — verified with Valgrind

  • SSD-friendly — WAL appends sequentially, reducing random writes

  • Python Bindings — idiomatic Python 3.8+ API with dict-style access, context managers, typed exceptions, and prefix iterators — see python/README.md


Backup & Tooling Compatibility

Because SNKV uses SQLite's file format and pager layer, backup tools that operate at the WAL or page level work out of the box:

  • LiteFS — distributed SQLite replication works with SNKV databases
  • SQLite Online Backup API — operates at the page level, fully compatible
  • WAL-based backup tools — any tool consuming WAL files works correctly
  • Rollback journal tools — journal mode is fully supported

Note: Tools that rely on SQLite's schema layer — like the sqlite3 CLI or DB Browser for SQLite — won't work. SNKV bypasses the schema layer entirely by design.


Internals & Documentation

I documented the SQLite internals explored while building this:


  • Minimalism wins — fewer layers, less overhead
  • Proven foundations — reuse battle-tested storage, don't reinvent it
  • Predictable performance — no hidden query costs, no compaction stalls
  • Honest tradeoffs — SNKV is not the fastest at everything; it's optimized for its target use case

Apache License 2.0 © 2025 Hash Anu

联系我们 contact @ memedata.com