Show HN: Tonbo – an embedded database for serverless and edge runtimes

Original link: https://github.com/tonbo-io/tonbo

## Tonbo: An Embedded Database for Serverless and Edge Computing

Tonbo is a lightweight embedded database designed for serverless and edge environments. It addresses the need to persist state from stateless compute in a distinctive way: data is stored as **Parquet files on object storage (such as S3)** and coordinated through a manifest, with no dedicated database server required.

Key features include:

* **Async-first architecture:** Built for high performance in serverless environments.
* **Stateless compute:** All processing is stateless, relying on object storage and a manifest for coordination.
* **Arrow-native:** Uses the Arrow data format for efficient zero-copy queries.
* **Open formats:** Uses standard Parquet, ensuring compatibility with existing tools.
* **MVCC and snapshot isolation:** Provides transactional consistency, with a merge-tree optimized for object storage.

Tonbo supports a range of runtimes (Tokio, WASM, Cloudflare Workers) and provides examples for basic operations, transactions, filtering, and S3 integration. It is currently in alpha and is well suited to append-heavy data, event sourcing, and applications that need structured persistence without managing a server. Examples and documentation are available at [https://github.com/tonbo-io/tonbo](https://github.com/tonbo-io/tonbo).

## Hacker News Discussion

Tonbo is a new embedded database designed for modern, ephemeral compute environments such as serverless functions, edge runtimes, and AI agents. The project, shared on Hacker News, aims to offer a storage solution that treats data as a *format* rather than a continuously running *service*, in contrast to the traditional database model.

The discussion centered on comparisons with SlateDB (which focuses on key/value OLTP, whereas Tonbo targets OLAP) and on concerns about latency, durability, and cost. The creator clarified that Tonbo is optimized for workloads running in short-lived sandboxes (WASM, Firecracker), where conventional database connections are hard to operate at scale.

Notable points include a current WASM size of 3 MB, with room to shrink further, and the Apache 2.0 license. While some questioned its efficiency compared with mature solutions such as Postgres, Tonbo's developers argue that it addresses a growing need for isolated, scalable storage in increasingly distributed compute environments.

Original text


Website | Rust Doc | Blog | Community

Tonbo is an embedded database for serverless and edge runtimes. Your data is stored as Parquet on S3, coordination happens through a manifest, and compute stays fully stateless.

Serverless compute is stateless, but your data isn't. Tonbo bridges this gap:

  • Async-first: The entire storage and query engine is fully async, built for serverless and edge environments.
  • No server to manage: Data lives on S3, coordination happens through a manifest, and compute is stateless.
  • Arrow-native: Define rich data types with declarative schemas and query with zero-copy RecordBatch.
  • Runs anywhere: Tokio, WASM, edge runtimes, or as a storage engine for building your own data infrastructure.
  • Open formats: Standard Parquet files readable by any tool.

Use Tonbo to:

  • Build serverless or edge applications that need a durable state layer without running a database.
  • Store append-heavy or event-like data directly in S3 and query it with low overhead.
  • Embed a lightweight MVCC + Parquet storage engine inside your own data infrastructure.
  • Run workloads in WASM or Cloudflare Workers that require structured persistence.
```rust
use tonbo::{db::{AwsCreds, ObjectSpec, S3Spec}, prelude::*};

#[derive(Record)]
struct User {
    #[metadata(k = "tonbo.key", v = "true")]
    id: String,
    name: String,
    score: Option<i64>,
}

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Open on S3
    let s3 = S3Spec::new("my-bucket", "data/users", AwsCreds::from_env()?);
    let db = DbBuilder::from_schema(User::schema())?
        .object_store(ObjectSpec::s3(s3))?
        .open()
        .await?;

    // Insert
    let users = vec![User { id: "u1".into(), name: "Alice".into(), score: Some(100) }];
    let mut builders = User::new_builders(users.len());
    builders.append_rows(users);
    db.ingest(builders.finish().into_record_batch()).await?;

    // Query
    let filter = Predicate::gt(ColumnRef::new("score"), ScalarValue::from(80_i64));
    let results = db.scan().filter(filter).collect().await?;

    Ok(())
}
```
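Continuing from the quickstart, the query results can be consumed with standard Arrow APIs. This is a sketch under the assumption that collect() yields Arrow RecordBatches (the README describes zero-copy RecordBatch queries) and that the arrow crate is available; the exact result type may differ:

```rust
use arrow::array::Int64Array;

// Assumption: `results` from the quickstart is an iterable of
// arrow::record_batch::RecordBatch.
for batch in &results {
    let scores = batch
        .column_by_name("score")
        .expect("score column present")
        .as_any()
        .downcast_ref::<Int64Array>()
        .expect("score stored as Int64");
    for i in 0..batch.num_rows() {
        if !scores.is_null(i) {
            println!("score = {}", scores.value(i));
        }
    }
}
```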

For local development, use .on_disk("/tmp/users")? instead. See examples/ for more.
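For example, a local variant of the quickstart might look like the following minimal sketch, which assumes .on_disk() slots into the same builder chain in place of .object_store():

```rust
// Same schema as above, persisted to a local directory instead of S3.
let db = DbBuilder::from_schema(User::schema())?
    .on_disk("/tmp/users")?
    .open()
    .await?;
```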

cargo add tonbo@0.4.0-a0 tokio

Or add to Cargo.toml:

```toml
[dependencies]
tonbo = "0.4.0-a0"
tokio = { version = "1", features = ["rt-multi-thread", "macros"] }
```

Run with cargo run --example <name>:

  • 01_basic: Define schema, insert, and query in 30 lines
  • 02_transaction: MVCC transactions with upsert, delete, and read-your-writes
  • 02b_snapshot: Consistent point-in-time reads while writes continue
  • 03_filter: Predicates: eq, gt, in, is_null, and, or, not
  • 04_s3: Store Parquet files on S3/R2/MinIO with zero server config
  • 05_scan_options: Projection pushdown reads only the columns you need
  • 06_composite_key: Multi-column keys for time-series and partitioned data (a rough sketch follows this list)
  • 07_streaming: Process millions of rows without loading into memory
  • 08_nested_types: Deep struct nesting + Lists stored as Arrow StructArray
  • 09_time_travel: Query historical snapshots via MVCC timestamps
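As a rough illustration of 06_composite_key: the quickstart marks its key column with a tonbo.key metadata attribute, so a time-series table with a multi-column key might be declared by marking several columns. The attribute form and how key ordering is expressed are assumptions here, not Tonbo's confirmed syntax; see examples/06_composite_key for the real version:

```rust
use tonbo::prelude::*;

// Hypothetical composite key (device_id, timestamp); the real derive macro
// may express key columns and their ordering differently.
#[derive(Record)]
struct Reading {
    #[metadata(k = "tonbo.key", v = "true")]
    device_id: String,
    #[metadata(k = "tonbo.key", v = "true")]
    timestamp: i64,
    value: f64,
}
```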

Tonbo implements a merge-tree optimized for object storage: writes go to WAL → MemTable → Parquet SSTables, with MVCC for snapshot isolation and a manifest for coordination via compare-and-swap:

  • Stateless compute: A worker only needs to read and update the manifest; no long-lived coordinator is required.
  • Object storage CAS: The manifest is committed using compare-and-swap on S3, so any function can safely participate in commits.
  • Immutable data: Data files are write-once Parquet SSTables, which matches the strengths of S3 and other object stores.
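To make the coordination model concrete, here is an illustrative compare-and-swap commit loop over a manifest. The trait and type names are hypothetical, not Tonbo's API; the version check stands in for an object-store precondition such as an ETag-conditional put:

```rust
/// Hypothetical manifest: a version plus the set of immutable SSTable paths.
struct Manifest {
    version: u64,
    sstables: Vec<String>,
}

/// Hypothetical store interface; in Tonbo the manifest lives on S3 and the
/// swap is an object-store compare-and-swap.
trait ManifestStore {
    fn load(&self) -> Manifest;
    /// Replace the manifest only if `expected_version` still matches;
    /// returns false when another writer committed first.
    fn compare_and_swap(&self, expected_version: u64, next: Manifest) -> bool;
}

/// Any stateless worker can commit a newly flushed SSTable: on a CAS
/// conflict it re-reads the manifest and retries, so no long-lived
/// coordinator is required.
fn commit_sstable(store: &dyn ManifestStore, new_sstable: String) {
    loop {
        let current = store.load();
        let expected = current.version;
        let mut next = Manifest {
            version: expected + 1,
            sstables: current.sstables,
        };
        next.sstables.push(new_sstable.clone());
        if store.compare_and_swap(expected, next) {
            break; // commit succeeded
        }
        // Lost the race: reload and retry against the newer manifest.
    }
}
```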

See docs/overview.md for the full design.

Install coverage tooling once:

rustup component add llvm-tools-preview
cargo install cargo-llvm-cov --version 0.6.12 --locked

Run coverage locally:

cargo llvm-cov --workspace --lcov --output-path lcov.info --summary

Generate an HTML report:

cargo llvm-cov --workspace --html

Tonbo is currently in alpha. APIs may change, and we're actively iterating based on feedback. We recommend starting with development and non-critical workloads before moving to production.

Feature areas (see the repository for details): Storage, Schema & Query, Backends, Runtime, Integrations.

Apache License 2.0. See LICENSE for details.
