Blacksky AppView

Original link: https://github.com/blacksky-algorithms/atproto

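## Blacksky AppView Implementation Summary

This repository contains Blacksky's modified version of the AT Protocol reference implementation, which powers their AppView at api.blacksky.community and is published for transparency and community benefit. It does not accept contributions, but it highlights key improvements over the upstream `bluesky-social/atproto` implementation, focused mainly on scalability and operational efficiency.

The core change replaces the upstream TypeScript firehose consumer with **rsky-wintermute**, a Rust-based indexer. Wintermute delivers a significant performance gain through parallel processing (10,000+ records/sec vs. ~90/sec) and separates live indexing from backfill.

Key optimizations include PostgreSQL `LATERAL JOIN` queries for faster timeline generation and Redis caching for actor profiles and record metadata (currently disabled because of a timestamp bug). Blacksky also implemented server-side enforcement of notification preferences and fixed an authentication problem that occurs after account migration.

The fork also includes infrastructure for **community posts**, a Blacksky-specific feature. Standing up a full-network AppView requires substantial resources (see the original document for details) and involves working around issues such as JSON corruption and queue poisoning.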

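## Blacksky: A New Social Network Built for the Black Community

Blacksky, a new social network built on the Atproto protocol, is gaining attention; it is designed for the Black community in the United States. Unlike Bluesky, Blacksky restricts posting to Black users, aiming to offer a platform with tailored moderation decisions and a sense of community self-governance.

Non-Black users can view and interact with public posts but cannot create accounts to post directly. The platform lets non-Black users host accounts through a paid subscription, which contributes to the platform's infrastructure. Blacksky provides curated feeds, independent infrastructure, and a dedicated PDS (Personal Data Server) for its users.

Discussion has centered on the need for, and the implications of, such an exclusive space. Some see it as an important step toward self-determination, while others question its long-term goals. Some compare it with ActivityPub and argue that the latter is more decentralized, while others defend Atproto's approach. The conversation also touches on identity verification and potential abuse, but ultimately underscores Blacksky's distinctive position in the decentralized social media landscape.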

Original text

This is Blacksky's fork of the AT Protocol reference implementation by Bluesky Social PBC. It powers the AppView at api.blacksky.community.

We're publishing this for transparency and so other communities can benefit from the work. This repository is not accepting contributions, issues, or PRs. If you want the canonical atproto implementation, use bluesky-social/atproto.

All changes are in packages/bsky (appview logic), services/bsky (runtime config), and one custom migration. Everything else is upstream.

Why Not the Built-in Firehose Consumer?

The upstream dataplane includes a TypeScript firehose consumer (subscription.ts) that indexes events directly. We replaced it with rsky-wintermute, a Rust indexer, for several reasons:

  • Performance at scale: The TypeScript consumer processes events sequentially. At network scale (~1,000 events/second, 18.5 billion total records), a full backfill at ~90 records/sec would take 6.5 years. Wintermute targets 10,000+ records/sec with parallel queue processing.
  • Backfill architecture: Wintermute separates live indexing from backfill into independent queues (firehose_live, firehose_backfill, repo_backfill, labels). Live events are never blocked by backfill work.
  • Operational tooling: Wintermute includes utilities for direct indexing of specific accounts, PLC directory bulk import, label stream replay, blob reference repair, and queue management -- all needed when bootstrapping an AppView from scratch.

The dataplane and appview from this repo still run as-is. They read from the PostgreSQL database that wintermute writes to. We just don't start the built-in firehose subscription.

Performance & Operational Fixes

These are broadly useful to anyone self-hosting an AppView at scale.

LATERAL JOIN query optimization (packages/bsky/src/data-plane/server/routes/feeds.ts)

  • getTimeline and getListFeed rewritten with PostgreSQL LATERAL JOINs to force per-user index usage instead of full table scans. Major improvement for users following thousands of accounts.
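
A minimal sketch of the LATERAL JOIN shape described above, not the fork's actual query. The table and column names (follow, feed_item, creator, subjectDid, originatorDid, sortAt) and the helper buildTimelineQuery are assumptions for illustration, as is the existence of an index on (originatorDid, sortAt).

```typescript
// Hypothetical query builder: illustrates the LATERAL JOIN pattern only.
interface SqlQuery {
  text: string;
  values: Array<string | number>;
}

function buildTimelineQuery(viewerDid: string, limit: number): SqlQuery {
  // For each followed account, the LATERAL subquery pulls that account's newest
  // items, so the planner can walk a per-author index instead of scanning the
  // whole feed_item table. Taking `limit` rows per author is enough, because the
  // final page can never need more than `limit` rows from a single author.
  const text = `
    SELECT item.uri, item.cid, item."sortAt"
    FROM follow AS f
    CROSS JOIN LATERAL (
      SELECT fi.uri, fi.cid, fi."sortAt"
      FROM feed_item AS fi
      WHERE fi."originatorDid" = f."subjectDid"
      ORDER BY fi."sortAt" DESC
      LIMIT $2
    ) AS item
    WHERE f.creator = $1
    ORDER BY item."sortAt" DESC
    LIMIT $2`;
  return { text, values: [viewerDid, limit] };
}
```

The returned object has the text/values shape that common PostgreSQL clients accept for parameterized queries, so the viewer DID and page size are never interpolated into the SQL string.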

Redis caching layer (packages/bsky/src/data-plane/server/cache/)

  • Actor profiles (60s TTL), records (5m), interaction counts (30s), post metadata (5m)
  • Reduces database load under production traffic
  • Known issue: The actor cache has a protobuf timestamp serialization bug where Timestamp objects lose their .toDate() method after JSON round-tripping through Redis, causing incomplete profile hydration on cache hits. We currently run with Redis caching disabled. The fix is to serialize timestamps as ISO strings on cache write and reconstruct on read.
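
The fix described in the known issue could look roughly like the sketch below. The Timestamp class here is a small stand-in for the protobuf type (only seconds, nanos and toDate() are modeled), and ActorProfile, encodeProfile and decodeProfile are hypothetical names, not the fork's code.

```typescript
// Stand-in for a protobuf Timestamp; the real class is assumed to expose toDate().
class Timestamp {
  seconds: number;
  nanos: number;
  constructor(seconds: number, nanos: number) {
    this.seconds = seconds;
    this.nanos = nanos;
  }
  static fromDate(d: Date): Timestamp {
    const ms = d.getTime();
    const seconds = Math.floor(ms / 1000);
    return new Timestamp(seconds, (ms - seconds * 1000) * 1e6);
  }
  toDate(): Date {
    return new Date(this.seconds * 1000 + Math.round(this.nanos / 1e6));
  }
}

// Hypothetical cached shape with a single timestamp field.
interface ActorProfile {
  did: string;
  createdAt: Timestamp;
}

// Cache write: store the timestamp as an ISO string rather than a class instance.
function encodeProfile(p: ActorProfile): string {
  return JSON.stringify({ did: p.did, createdAt: p.createdAt.toDate().toISOString() });
}

// Cache read: rebuild a real Timestamp so toDate() exists again.
function decodeProfile(cached: string): ActorProfile {
  const raw = JSON.parse(cached) as { did: string; createdAt: string };
  return { did: raw.did, createdAt: Timestamp.fromDate(new Date(raw.createdAt)) };
}
```

A plain JSON.parse(JSON.stringify(profile)) keeps the seconds and nanos fields but returns an ordinary object, which is why toDate() disappears on a cache hit. Encoding to a string at the cache boundary keeps that loss from leaking into hydration code.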

Notification preferences server-side enforcement (packages/bsky/src/api/app/bsky/notification/listNotifications.ts)

  • When the client doesn't specify reasons, the server applies the user's saved notification preferences. Without this, preferences are only enforced client-side and have no effect.
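
As a rough illustration of that fallback, assuming a simplified preference model: the Reason union, SavedPreferences shape and effectiveReasons helper below are hypothetical, and the real lexicon defines its own reason names and preference structure.

```typescript
// Hypothetical shapes: the real preference model and reason names may differ.
type Reason = "like" | "repost" | "follow" | "mention" | "reply" | "quote";

interface ListNotificationsParams {
  reasons?: Reason[];
}

type SavedPreferences = Partial<Record<Reason, boolean>>;

const ALL_REASONS: Reason[] = ["like", "repost", "follow", "mention", "reply", "quote"];

// An explicit client filter wins; otherwise fall back to the saved preferences.
function effectiveReasons(params: ListNotificationsParams, prefs: SavedPreferences): Reason[] {
  if (params.reasons && params.reasons.length > 0) {
    return params.reasons;
  }
  // A reason counts as enabled unless the user explicitly turned it off.
  return ALL_REASONS.filter((reason) => prefs[reason] !== false);
}
```

Treating a missing preference as enabled is a design choice in this sketch; a stricter server could require an explicit opt-in instead.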

Auth verifier stale signing key fix (packages/bsky/src/auth-verifier.ts)

  • On JWT verification retry (forceRefresh), bypasses the dataplane's in-memory identity cache and resolves the DID document directly from PLC directory. Fixes authentication failures after account migration where the signing key rotates but the cache holds the old key.
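
The retry path can be sketched as follows. This is a synchronous simplification with injected stand-ins: KeyResolver, VerifyFn and verifyWithRefresh are hypothetical names, and real key resolution and signature checks would be asynchronous.

```typescript
// Hypothetical, synchronous stand-ins for key lookup and signature verification.
interface KeyResolver {
  // In-memory identity cache; may still hold the pre-migration key.
  fromCache(did: string): string | undefined;
  // Authoritative lookup against the PLC directory, bypassing the cache.
  fromPlc(did: string): string;
}

type VerifyFn = (jwt: string, signingKey: string) => boolean;

function verifyWithRefresh(jwt: string, did: string, keys: KeyResolver, verify: VerifyFn): boolean {
  const cachedKey = keys.fromCache(did);
  if (cachedKey !== undefined && verify(jwt, cachedKey)) {
    return true;
  }
  // Retry path (the forceRefresh case): skip the cache and use the key
  // currently published in the DID document.
  const freshKey = keys.fromPlc(did);
  if (freshKey === cachedKey) {
    return false; // same key, so retrying cannot change the outcome
  }
  return verify(jwt, freshKey);
}
```

The directory is only consulted after a failed verification, so the common case still benefits from the cache.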

JSON sanitization (packages/bsky/src/data-plane/server/routes/records.ts)

  • Strips null bytes (\u0000) and control characters from stored records before JSON parsing. These are valid per RFC 8259 but rejected by Node.js JSON.parse(), causing silent rowToRecord parse failures in the dataplane that surface as missing posts.
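
A small sketch of this kind of sanitization, assuming the offending characters reach the parser unescaped (the form that JSON.parse() rejects). The helper names sanitizeJsonText and parseStoredRecord are hypothetical.

```typescript
// Matches raw C0 control characters (U+0000 through U+001F), including the null byte.
const RAW_CONTROL_CHARS = /[\u0000-\u001f]/g;

// Removes raw control characters from stored JSON text. This also drops
// newlines and tabs between tokens, which does not change the parsed value.
function sanitizeJsonText(text: string): string {
  return text.replace(RAW_CONTROL_CHARS, "");
}

// Returns undefined instead of throwing so callers can log and skip a bad row.
function parseStoredRecord(text: string): unknown {
  try {
    return JSON.parse(sanitizeJsonText(text));
  } catch {
    return undefined;
  }
}
```

Logging the URI whenever parseStoredRecord returns undefined would make such failures visible rather than silent.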

Community Posts (Blacksky-specific)

Infrastructure for private community posts that live on the AppView rather than individual PDSes. Specific to how Blacksky works, but could serve as a reference for other communities.

  • Custom lexicon namespace community.blacksky.feed.* with endpoints for submit, get, delete, timeline, and thread views
  • Separate community_post table (migration: 20260202T120000000Z-add-community-post.ts)
  • Membership gating at the dataplane and API layer
  • Integration with getPostThreadV2 for mixed standard/community post threads
  • Requires a separate membership database (BLACKSKY_MEMBERSHIP_DB_URL)

Bluesky Relay (bsky.network)
     |
     v
rsky-wintermute -----> PostgreSQL 17 <----- Palomar
  (Rust indexer)            |                (Go search)
  - firehose consumer       |                     |
  - backfiller              |                     v
  - label indexer           |               OpenSearch
  - direct indexer          |
                            v
                    bsky-dataplane (gRPC :2585) <--- Redis (optional)
                            |
                            v
                    bsky-appview (HTTP :2584)
                            |
                            v
                    Reverse proxy (Caddy/nginx)

| Component | Source | Purpose |
| --- | --- | --- |
| rsky-wintermute | blacksky-algorithms/rsky | Rust firehose indexer: consumes events, backfills repos, indexes records into PostgreSQL |
| rsky-relay | blacksky-algorithms/rsky | AT Protocol relay for receiving moderation labels from labeler services |
| rsky-video | blacksky-algorithms/rsky | Video upload service: transcodes via Bunny Stream CDN, uploads blob refs to user PDSes |
| bsky-dataplane | This repo (services/bsky) | gRPC data layer over PostgreSQL |
| bsky-appview | This repo (services/bsky) | HTTP API server for app.bsky.* XRPC endpoints |
| Palomar | blacksky-algorithms/indigo | Full-text search: indexes profiles and posts into OpenSearch with follower count boosting |
| palomar-sync | blacksky-algorithms/rsky | Syncs follower counts and PageRank scores from PostgreSQL to OpenSearch |

rsky-wintermute in Detail

Wintermute is a monolithic Rust service with four parallel processing paths:

  • Ingester: Connects to bsky.network firehose via WebSocket, writes events to Fjall (embedded key-value store) queues
  • Indexer: Reads from queues, parses records, writes to PostgreSQL with ON CONFLICT for idempotency
  • Backfiller: Fetches full repo CAR files from PDSes, unpacks records into the backfill queue
  • Label indexer: Subscribes to labeler WebSocket streams, processes label create/negate events

Additional CLI tools included in the rsky repo:

  • queue_backfill -- queue DIDs for backfill from CSV, PDS discovery, or direct DID lists
  • direct_index -- fetch and index specific repos bypassing queues (useful for fixing individual accounts)
  • label_sync -- replay label streams from cursor 0 to catch up on missed negations
  • plc_import -- bulk import handle/DID mappings from PLC directory
  • palomar-sync -- sync follower counts and PageRank to OpenSearch

rsky-video is a video upload service for users whose PDS doesn't support Bluesky's video.bsky.app. It uses its own DID (did:web:video.blacksky.community) to authenticate to user PDSes via service auth JWTs. Flow:

  1. Client gets service auth token from PDS (audience: video service DID)
  2. Client uploads video bytes to rsky-video
  3. rsky-video generates a CID, uploads the blob to the user's PDS
  4. Video forwarded to Bunny Stream CDN for transcoding
  5. On completion, client creates the post referencing the blob -- PDS validates the blob exists

Moderation labels come from labeler services (e.g., Bluesky's Ozone) via WebSocket subscription. Wintermute's ingester processes labels in a dedicated label_live queue (low volume, separate from the main firehose). The label_sync tool can replay a labeler's full stream to catch up on missed negations (label removals) without reinserting labels.

  • Node.js 18+ and pnpm (for building the dataplane and appview)
  • PostgreSQL 17 with the bsky schema
  • Redis (optional, for caching -- see known issue above)
  • rsky-wintermute consuming the firehose and populating the database
  • OpenSearch (if running Palomar search)

The bsky schema is created by the dataplane's migrations. On first run, the dataplane will apply all migrations automatically. The only Blacksky-specific migration is 20260202T120000000Z-add-community-post.ts (community posts table). If you don't need community posts, you can remove it.

rsky-wintermute writes to this same schema. All its INSERT statements use ON CONFLICT so it's safe to run wintermute and the dataplane migrations in any order.

node services/bsky/dataplane.js

| Variable | Required | Description |
| --- | --- | --- |
| DB_PRIMARY_URL | Yes | PostgreSQL connection string with ?options=-csearch_path%3Dbsky |
| DB_REPLICA_URL | No | Read replica connection string |
| BSKY_DATAPLANE_PORT | No | gRPC port (default 2585) |
| BSKY_REDIS_HOST | No | Redis host:port for caching (currently recommended to leave disabled) |
| BLACKSKY_MEMBERSHIP_DB_URL | No | Separate DB for community membership (Blacksky-specific) |

node services/bsky/api.js

| Variable | Required | Description |
| --- | --- | --- |
| BSKY_APPVIEW_PORT | No | HTTP port (default 2584) |
| BSKY_DATAPLANE_URLS | Yes | Comma-separated dataplane gRPC URLs |
| BSKY_DID | Yes | The AppView's DID (e.g. did:web:api.example.com) |
| BSKY_MOD_SERVICE_DID | Yes | Ozone moderation service DID |
| BSKY_ADMIN_PASSWORDS | Yes | Comma-separated admin passwords for basic auth |
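
To fail fast on a missing setting, the appview variables listed above could be validated at startup. This is a hypothetical loader, not part of the repo: only the variable names and the default port come from the list above, while AppviewConfig, requireVar, splitList and loadAppviewConfig are illustrative names.

```typescript
// Hypothetical loader: only the variable names and the default port come from the document.
interface AppviewConfig {
  port: number;
  dataplaneUrls: string[];
  did: string;
  modServiceDid: string;
  adminPasswords: string[];
}

type Env = Record<string, string | undefined>;

// Throws with the variable name so a misconfigured deployment is easy to diagnose.
function requireVar(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Splits a comma-separated value, trimming whitespace and dropping empty entries.
function splitList(value: string): string[] {
  return value
    .split(",")
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}

function loadAppviewConfig(env: Env): AppviewConfig {
  return {
    port: Number(env["BSKY_APPVIEW_PORT"] ?? "2584"),
    dataplaneUrls: splitList(requireVar(env, "BSKY_DATAPLANE_URLS")),
    did: requireVar(env, "BSKY_DID"),
    modServiceDid: requireVar(env, "BSKY_MOD_SERVICE_DID"),
    adminPasswords: splitList(requireVar(env, "BSKY_ADMIN_PASSWORDS")),
  };
}
```

In a Node.js process this would be called as loadAppviewConfig(process.env). Taking the environment as a parameter keeps the function easy to test.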

A full-network backfill (all ~42M users, ~18.5B records) takes weeks even with wintermute's parallel processing. Expect:

  • Live indexing: Keeps up in real-time from day one (~1,000 events/sec)
  • Full backfill: 2-4 weeks at 10,000 records/sec depending on PDS responsiveness and network conditions
  • Partial backfill: Hours to days for a subset of users (e.g., community members only)

During backfill, the AppView is functional but will show incomplete data for users that haven't been backfilled yet. Live events are indexed immediately regardless of backfill progress.
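
The durations quoted in this document follow directly from the record count and the sustained rate. The worked arithmetic below uses only the figures from the text (~18.5B records, ~90 records/sec for the sequential consumer, 10,000 records/sec for wintermute); backfillDays is a hypothetical helper.

```typescript
const SECONDS_PER_DAY = 86400;

// Days needed to index `totalRecords` at a sustained `recordsPerSecond`.
function backfillDays(totalRecords: number, recordsPerSecond: number): number {
  return totalRecords / recordsPerSecond / SECONDS_PER_DAY;
}

const TOTAL_RECORDS = 18.5e9; // ~18.5B records, from the text

// Sequential consumer at ~90 records/sec: about 2,379 days, or roughly 6.5 years.
const sequentialDays = backfillDays(TOTAL_RECORDS, 90);
const sequentialYears = sequentialDays / 365;

// Wintermute at 10,000 records/sec: about 21.4 days, inside the 2-4 week range.
const parallelDays = backfillDays(TOTAL_RECORDS, 10000);
```

The 2-4 week range leaves room on both sides of the ~21 day figure, since the sustained rate depends on PDS responsiveness and network conditions.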

Problems We Solved Getting Here

These are issues we encountered bootstrapping a full-network AppView. If you're doing the same, you'll likely hit some of these:

COPY text format JSON corruption: PostgreSQL's COPY text protocol treats backslash as an escape character. If your bulk loader doesn't escape backslashes in JSON strings, \" becomes " and you get silently corrupted records. The record.json column is type text (not jsonb), so PostgreSQL won't catch this. We found ~66,000 corrupted records and had to repair them by re-fetching from the public API.
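
A bulk loader can avoid this by escaping every field before writing it to the COPY stream. The sketch below covers the default text format with a tab delimiter; escapeCopyText and copyRow are hypothetical helper names.

```typescript
// Escapes one field value for PostgreSQL COPY text format (default tab delimiter).
// The backslash is replaced first so the escapes added afterwards are not doubled.
function escapeCopyText(value: string): string {
  return value
    .replace(/\\/g, "\\\\")
    .replace(/\t/g, "\\t")
    .replace(/\n/g, "\\n")
    .replace(/\r/g, "\\r");
}

// Builds one COPY row; null becomes \N, the text-format NULL marker.
function copyRow(fields: Array<string | null>): string {
  return fields.map((f) => (f === null ? "\\N" : escapeCopyText(f))).join("\t") + "\n";
}
```

With this in place, a JSON string containing \" is sent as \\" and PostgreSQL stores the original two characters instead of collapsing them to a bare quote.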

Null bytes in JSON: Some AT Protocol records contain \u0000 (null byte), which is valid JSON per RFC 8259 but rejected by Node.js JSON.parse(). The dataplane silently returns null for these records. Strip null bytes before writing to the database.

Timestamp format sensitivity: The dataplane expects timestamps with millisecond precision and Z suffix (2026-01-12T19:45:23.307Z). Nanosecond precision or timezone offset format (+00:00) causes subtle sorting and comparison issues.
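
One way to guard against this is to normalize every timestamp before it is written. The sketch below truncates fractional seconds to milliseconds and converts any offset to the Z form; normalizeTimestamp is a hypothetical helper, and truncating rather than rounding is an assumption.

```typescript
// Date-time, optional fractional seconds of any length, then Z or a +hh:mm / -hh:mm offset.
const ISO_PATTERN = /^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d+))?(Z|[+-]\d{2}:\d{2})$/;

// Returns the same instant with millisecond precision and a Z suffix.
function normalizeTimestamp(input: string): string {
  const match = ISO_PATTERN.exec(input);
  if (match === null) {
    throw new Error(`Unrecognized timestamp: ${input}`);
  }
  const base = match[1];
  const fraction = match[2] ?? "";
  const zone = match[3];
  // Pad or truncate the fraction to exactly three digits.
  const millis = (fraction + "000").slice(0, 3);
  const date = new Date(`${base}.${millis}${zone}`);
  if (isNaN(date.getTime())) {
    throw new Error(`Invalid timestamp: ${input}`);
  }
  return date.toISOString();
}
```

For example, 2026-01-12T19:45:23.307123456+00:00 becomes 2026-01-12T19:45:23.307Z, the format the dataplane expects.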

Notification table bloat: Without a unique constraint on (did, recordUri, reason), the notification table grows unbounded with duplicates. Ours reached 1.3 billion rows (663 GB) before we caught it. Adding ON CONFLICT DO NOTHING to INSERTs only helps if the unique index exists first, and creating the index requires deduplication of the existing data.

Post embed tables: The post_embed_image and post_embed_video tables aren't populated by default if your indexer doesn't handle them. Without these, the media filter on getAuthorFeed returns nothing. These need to be backfilled separately.

Label negation ordering: Label negation (removal) events reference the original label by source, URI, and value. If negations arrive before the original label (common during backfill), they're silently dropped. The label_sync tool replays the full stream to catch these.

Fjall queue poisoning: The Fjall embedded database (used for wintermute's queues) can enter a "poisoned" state after crashes, blocking all queue operations. The fix is to delete the queue database directory and restart -- wintermute will catch up from the relay's cursor (relays keep ~72 hours of history).

TLS provider initialization: Rust's rustls requires explicitly installing a crypto provider before any TLS connection. Without rustls::crypto::aws_lc_rs::default_provider().install_default() at startup, the first WebSocket connection to the firehose panics.

Signing key rotation after account migration: When users migrate between PDSes, their signing key changes. The dataplane caches identity data with a staleTTL of 1 hour. During that window, JWT verification fails for migrated users. The fix is to bypass the cache on verification retry and resolve directly from PLC directory.

These requirements are based on running a full-network AppView (all ~42M users, ~18.5B records).

| Resource | Minimum | Recommended |
| --- | --- | --- |
| CPU | 16 cores | 48+ cores |
| RAM | 64 GB | 256 GB |
| Storage | 10 TB NVMe | 28+ TB NVMe (RAID) |
| PostgreSQL | Dedicated, same machine or low-latency | Same machine recommended |
| Network | Sustained 100 Mbps | 1 Gbps+ |

Storage breakdown (approximate, full network):

| Table group | Size |
| --- | --- |
| Posts + records | ~3.5 TB |
| Likes | ~2 TB |
| Follows | ~500 GB |
| Notifications | ~600 GB |
| Indexes | ~4 TB |
| OpenSearch (Palomar) | ~500 GB |

For a smaller community running a partial AppView (indexing only community members), requirements scale roughly linearly with indexed accounts.
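
Taking the linear-scaling statement at face value, a first-order storage estimate can be derived from the full-network breakdown. This is back-of-the-envelope arithmetic under that assumption only; estimateStorageGb is a hypothetical helper, and real usage will vary with how active the indexed accounts are.

```typescript
// Approximate full-network sizes in GB, taken from the storage breakdown:
// posts + records, likes, follows, notifications, indexes, OpenSearch.
const FULL_NETWORK_SIZES_GB = [3500, 2000, 500, 600, 4000, 500];
const FULL_NETWORK_ACCOUNTS = 42e6; // ~42M users, from the text

const FULL_NETWORK_TOTAL_GB = FULL_NETWORK_SIZES_GB.reduce((sum, gb) => sum + gb, 0);

// Linear estimate: storage scales with the share of accounts that are indexed.
function estimateStorageGb(indexedAccounts: number): number {
  return FULL_NETWORK_TOTAL_GB * (indexedAccounts / FULL_NETWORK_ACCOUNTS);
}
```

Under these assumptions the listed groups total about 11.1 TB for the full network, and a 100,000-account community would need on the order of 26 GB before any headroom.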

git remote add upstream https://github.com/bluesky-social/atproto.git
git fetch upstream
git merge upstream/main

Conflicts will typically be in packages/bsky/src/data-plane/server/routes/ and packages/bsky/src/api/. Resolve by keeping our additions alongside upstream changes.

Same as upstream: dual-licensed under MIT and Apache 2.0. See LICENSE-MIT.txt and LICENSE-APACHE.txt.
