Show HN: S2-lite, an open source Stream Store

Original link: https://github.com/s2-streamstore/s2

## s2-lite: A Self-Hostable Real-Time Datastore

s2-lite is an open source implementation of the s2.dev API that lets you run a serverless datastore for streaming data locally. It is a single-node binary with no external dependencies, using SlateDB and object storage (such as AWS S3 or Tigris) for durable data storage. s2-lite can also run entirely in memory, which is ideal for testing. Setup via Docker is straightforward, requiring only environment variables for bucket configuration and credentials. Key features include real-time streaming, durability through object storage, and compatibility with the s2 CLI and SDKs. The system uses a pipelined architecture for performance and Tokio tasks for stream management. While deletion is still under development, core functionality such as basin/stream creation and record access is fully supported through a RESTful API and streaming sessions. Detailed specifications are available as OpenAPI and Protobuf definitions.

## S2-lite: Open Source Stream Store

Shikhar introduced S2-lite, an open source (MIT-licensed) stream store built in Rust, addressing the lack of open source stream store options in the same space as their earlier cloud service, S2.dev. S2-lite uses SlateDB, an embedded key-value database built on object storage (such as AWS S3), to provide durability similar to S2.dev. Unlike Kafka or Redis Streams, S2-lite is designed to handle large numbers of durable streams, treating them as keys in SlateDB. It can run on top of object storage, or entirely in memory for development. While it currently lacks resource deletion, it is largely usable and ships with a CLI for quick testing (you can even stream Star Wars!). The developers aim for simple operation, in contrast to S2.dev's multi-tenant Kubernetes architecture. Future improvements will focus on the write pipeline to boost performance, especially against high-latency storage like S3. Users can run performance tests with the `s2 bench` command.

Original text

s2.dev is a serverless datastore for real-time, streaming data.

s2-lite is an open source, self-hostable server implementation of the S2 API.

It uses SlateDB as its storage engine, which relies entirely on object storage for durability.

It is easy to run s2-lite against object stores like AWS S3 and Tigris. It is a single-node binary with no other external dependencies. Just like s2.dev, data is always durable on object storage before being acknowledged or returned to readers.

You can also simply not specify a `--bucket`, which makes it operate entirely in-memory. This is great for integration tests involving S2.

Note

Point the S2 CLI or SDKs at your lite instance like this:

```shell
export S2_ACCOUNT_ENDPOINT="http://localhost:8080"
export S2_BASIN_ENDPOINT="http://localhost:8080"
export S2_ACCESS_TOKEN="redundant"
```

Here's how you can run in-memory without any external dependency:

```shell
docker run -p 8080:80 ghcr.io/s2-streamstore/s2-lite
```

AWS S3 bucket example:

```shell
docker run -p 8080:80 \
  -e AWS_PROFILE=${AWS_PROFILE} \
  -v ~/.aws:/root/.aws:ro \
  ghcr.io/s2-streamstore/s2-lite \
  --bucket ${S3_BUCKET} \
  --path s2lite
```

Static credentials example (Tigris, R2, etc.):

```shell
docker run -p 8080:80 \
  -e AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID} \
  -e AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY} \
  -e AWS_ENDPOINT_URL_S3=${AWS_ENDPOINT_URL_S3} \
  ghcr.io/s2-streamstore/s2-lite \
  --bucket ${S3_BUCKET} \
  --path s2lite
```

Let's make sure the server is ready:

```shell
while ! curl -sf ${S2_ACCOUNT_ENDPOINT}/ping -o /dev/null; do echo Waiting...; sleep 2; done && echo Up!
```

Install the CLI, or upgrade it if `s2 --version` is older than 0.25.

Let's create a basin with auto-creation of streams enabled:

```shell
s2 create-basin liteness --create-stream-on-append --create-stream-on-read
```

Test your performance:

```shell
s2 bench liteness --target-mibps 10 --duration 5s --catchup-delay 0s
```

S2 Ping Test

Now let's try streaming sessions. In one or more new terminals (make sure you re-export the env vars noted above):

```shell
s2 read s2://liteness/starwars 2> /dev/null
```

Now, back in your original terminal, let's write to the stream:

```shell
nc starwars.s2.dev 23 | s2 append s2://liteness/starwars
```

S2 Star Wars Streaming

`/ping` will pong

`/metrics` returns Prometheus text format

Settings reference

Use `SL8_`-prefixed environment variables, e.g.:

```shell
# Defaults to 50ms for a remote bucket / 5ms in-memory
SL8_FLUSH_INTERVAL=10ms
```

Concepts

  • HTTP serving is implemented using axum
  • Each stream corresponds to a Tokio task called streamer that owns the current tail position, serializes appends, and broadcasts acknowledged records to followers
  • Appends are pipelined to improve performance against high-latency object storage
  • lite::backend::kv::Key documents the data modeling in SlateDB
  • Deletion is not fully plumbed up yet
  • Pipelining needs to be made safe and default #48
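The streamer pattern above can be sketched in plain Rust. This is an illustrative, simplified model: it uses std threads and channels in place of Tokio tasks, a single follower channel in place of a broadcast, and skips durability entirely; the `Append` struct and `run_streamer` function are inventions of this sketch, not s2-lite's actual types.

```rust
use std::sync::mpsc;
use std::thread;

/// One append request plus a channel for its acknowledgement.
struct Append {
    body: String,
    ack: mpsc::Sender<u64>,
}

/// Run a "streamer" for one stream: it owns the tail position,
/// serializes appends, and forwards acknowledged records to a follower.
fn run_streamer(bodies: Vec<&str>) -> Vec<(u64, String)> {
    let (append_tx, append_rx) = mpsc::channel::<Append>();
    let (follower_tx, follower_rx) = mpsc::channel::<(u64, String)>();

    let streamer = thread::spawn(move || {
        let mut tail: u64 = 0; // current tail position, owned by the streamer
        for append in append_rx {
            let seq = tail;
            tail += 1;
            // s2-lite would persist the record (SlateDB on object storage)
            // before acknowledging; durability is elided in this sketch.
            append.ack.send(seq).unwrap();
            follower_tx.send((seq, append.body)).unwrap();
        }
        // follower_tx drops here, ending the follower's iteration
    });

    for body in bodies {
        let (ack_tx, ack_rx) = mpsc::channel();
        append_tx
            .send(Append { body: body.to_string(), ack: ack_tx })
            .unwrap();
        // Each producer waits for its acknowledged sequence number.
        let _seq = ack_rx.recv().unwrap();
    }
    drop(append_tx);

    let received: Vec<_> = follower_rx.iter().collect();
    streamer.join().unwrap();
    received
}

fn main() {
    for (seq, body) in run_streamer(vec!["hello", "world"]) {
        println!("follower got {seq}: {body}");
    }
}
```

Because all appends for a stream funnel through one owner, sequence numbers are assigned in a single place and followers observe records in acknowledged order; the real implementation additionally pipelines the durability step.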

Tip

Complete specs are available: OpenAPI for the RESTful core, Protobuf definitions, and S2S, the streaming session protocol.

Fully supported

  • /basins
  • /streams
  • /streams/{stream}/records

Important

Unlike the cloud service where the basin is implicit as a subdomain, /streams/* requests must specify the basin using the S2-Basin header. The SDKs take care of this automatically.
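For illustration, a request carrying the basin in the `S2-Basin` header might look like the following; `list_streams_request` is a hypothetical helper that just formats the raw HTTP/1.1 bytes, and real clients would normally use the SDKs (which set this header automatically) or an HTTP library.

```rust
/// Build a raw HTTP/1.1 request listing a basin's streams, carrying
/// the basin name in the S2-Basin header (hypothetical helper for
/// illustration only; request/response bodies follow the OpenAPI spec).
fn list_streams_request(host: &str, basin: &str) -> String {
    format!(
        "GET /streams HTTP/1.1\r\n\
         Host: {host}\r\n\
         S2-Basin: {basin}\r\n\
         Accept: application/json\r\n\r\n"
    )
}

fn main() {
    print!("{}", list_streams_request("localhost:8080", "liteness"));
}
```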

Not supported

  • /access-tokens #28
  • /metrics