Vector database that can index 1B vectors in 48 minutes

Original link: https://www.vectroid.com/blog/why-and-how-we-built-Vectroid

## Vectroid: Serverless Vector Search Without Tradeoffs

Vectroid is a new serverless vector database designed to deliver high accuracy, low latency, and cost efficiency at the same time. Where existing solutions compromise between these factors, Vectroid aims for balanced performance.

Its core innovation is dynamically optimizing resource allocation to support the memory-intensive HNSW algorithm, which has traditionally been considered too expensive for cost-conscious systems. Vectroid exploits the burstiness of real-world workloads and uses tunable vector compression to manage HNSW's memory footprint.

Key features include high-performance HNSW-based search, near real-time indexing, scalability to billions of vectors, and independently scalable ingest, indexing, and query layers.

Early benchmarks demonstrate this: 1B vectors indexed in 48 minutes, with a P99 latency of 34 ms. Uniquely, Vectroid maintains >90% recall while scaling to 10 concurrent query threads.

Built on independently scalable microservices and cloud object storage, Vectroid offers a robust, adaptable solution for demanding vector search applications.


We are excited to announce Vectroid, a serverless vector search solution that delivers exceptional accuracy and low latency in a cost effective package. Vectroid is not just another vector search solution—it’s a search engine that performs and scales in all scenarios.

Why we built Vectroid

Talk to any team working with large, low latency vector workloads and you’ll hear a familiar story: something always has to give. Vector databases often make significant tradeoffs between speed, accuracy, and cost. That’s the nature of the math underpinning vector search: taking algorithmic shortcuts to get near-perfect results in a short amount of time.

There are some common permutations of these tradeoffs:

  • Very high accuracy, but very expensive and slow
  • Fast speed and tolerable accuracy, but very expensive
  • Cheap and fast, but inaccurate to a disqualifying degree
Based on the existing vector database landscape, it would seem that building a cost-effective vector database requires sacrificing either speed or accuracy at scale. We believe that’s a false premise: a cost-efficient database with high accuracy and low latency is possible. We just need to rethink the underlying mechanism.

Our “aha” moment

Query speed and recall are largely a function of the chosen ANN algorithm. Algorithms that are both fast and accurate, like HNSW (Hierarchical Navigable Small World), are memory intensive and expensive to index. The traditional assumption is that these kinds of algorithms are untenable for a cost-conscious system.

We had two major realizations that challenged this assumption.

1. Demand for in-memory HNSW is not static. Real-world usage patterns are bursty and uneven. A cost-efficient database can optimize for this reality by making resource allocation dynamic and by individually scaling system components as needed.
2. HNSW’s memory footprint is tunable. It can easily be flattened (e.g., by compressing vectors using quantization) or expanded (e.g., by increasing layer count), which gives us the flexibility to experiment with different configurations to find a goldilocks setup.
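To see why quantization matters at this scale, here is a back-of-envelope estimate of HNSW memory for 1B vectors. The formula and parameters (M=16 links per node, an approximation that folds upper layers into the link term) are illustrative, not Vectroid's actual settings:

```python
# Rough HNSW memory estimate: vector storage + graph links.
# Parameters are illustrative, not Vectroid's actual configuration.

def hnsw_memory_gb(num_vectors, dims, bytes_per_component, m=16):
    """Estimate index memory in GB.

    Each node stores roughly 2*M bidirectional link ids (8 bytes each);
    upper HNSW layers are folded into that constant for simplicity.
    """
    vector_bytes = num_vectors * dims * bytes_per_component
    link_bytes = num_vectors * m * 2 * 8
    return (vector_bytes + link_bytes) / 1e9

full = hnsw_memory_gb(1_000_000_000, 1024, 4)   # float32 components
quant = hnsw_memory_gb(1_000_000_000, 1024, 1)  # int8 scalar quantization
print(f"float32: {full:.0f} GB, int8: {quant:.0f} GB")
# float32: 4352 GB, int8: 1280 GB
```

Even this crude model shows why raw float32 HNSW at billion scale is considered untenable, and why a roughly 4x reduction from scalar quantization changes the cost calculus.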

What is Vectroid?

Vectroid is a serverless vector database with premium performance. It delivers the same or a stronger balance of speed and recall than high-end offerings promise, but at a lower cost than competitors.

1. Performant vector search: HNSW for ultra-fast, high-recall similarity search.
2. Near real-time search: newly ingested records are searchable almost instantly.
3. Massive scalability: seamlessly handles billions of vectors in a single namespace.
4. Cost-efficient resource utilization: the ingest, index, and query layers each scale independently.

How Vectroid performs

The core philosophy of Vectroid is that optimizing for one metric at any cost to the others doesn’t make for a robust system. Instead, vector search should be designed for balanced performance across recall, latency, and cost, so users don’t have to make painful tradeoffs as workloads grow.

When tested against other state-of-the-art vector search solutions, Vectroid is not only competitive but the most consistent across the board. Across all of our tests, Vectroid was the only database able to maintain over 90% recall while scaling to 10 concurrent query threads, all while maintaining good latency.

Some early benchmarks:

  • Indexed 1B vectors (Deep1B) in ~48 minutes
  • Achieved P99 latency of 34ms on the MS MARCO dataset (138M vectors, 1,024 dimensions)

We’ll be releasing the full benchmark suite (with setup details so anyone can reproduce them) in an upcoming post. For now, these numbers highlight the kind of scale and performance we designed Vectroid to handle.
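As a quick sanity check, the Deep1B figure implies a substantial average ingest rate (simple arithmetic on the numbers above, taking "~48 minutes" at face value):

```python
# Implied average ingest throughput from the Deep1B benchmark above.
vectors = 1_000_000_000
minutes = 48
per_second = vectors / (minutes * 60)
print(f"{per_second:,.0f} vectors/sec")  # roughly 347,222 vectors/sec
```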

How Vectroid works

Vectroid is composed of two independently scalable microservices for writes and reads.

As the diagram shows, index state, vector data, and metadata are persisted to cloud object storage (GCS for now, S3 coming soon). The disk, cache, and in-memory storage layers each employ a usage-aware model for the index lifecycle, in which indexes are lazily loaded from object storage on demand and evicted when idle.
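The lazy-load/idle-evict lifecycle can be sketched roughly as follows. This is a minimal illustration of the pattern, not Vectroid's implementation; the `store.load_index` interface and the TTL policy are assumptions:

```python
# Sketch of a usage-aware index cache: indexes load from object storage
# on first access and are evicted after sitting idle. Hypothetical API.
import time

class IndexCache:
    def __init__(self, store, idle_ttl_seconds=300):
        self.store = store            # object-storage client (assumed interface)
        self.idle_ttl = idle_ttl_seconds
        self.cache = {}               # name -> [index, last_access_time]

    def get(self, name):
        if name not in self.cache:
            # Lazy load: fetch the index from object storage on demand.
            self.cache[name] = [self.store.load_index(name), time.monotonic()]
        entry = self.cache[name]
        entry[1] = time.monotonic()   # record the access
        return entry[0]

    def evict_idle(self):
        # Drop indexes idle longer than the TTL; object storage keeps
        # the durable copy, so eviction only frees local memory.
        now = time.monotonic()
        for name in [n for n, (_, t) in self.cache.items()
                     if now - t > self.idle_ttl]:
            del self.cache[name]
```

The design choice here is that object storage is the source of truth, so local tiers (memory, cache, disk) are free to shed anything that isn't actively queried.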

For fast, high-recall ANN search, we chose the HNSW algorithm. It offers excellent latency and accuracy tradeoffs, supports incremental updates, and performs well across large-scale workloads. To patch its limitations, we added a number of targeted optimizations:

  • Slow indexing speed ⇒ in-memory write buffer to ensure newly inserted vectors are immediately searchable
  • High indexing cost ⇒ batched, highly concurrent and partitioned indexing
  • High memory usage ⇒ vector compression via quantization
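The write-buffer idea in the first bullet can be sketched as follows: freshly ingested vectors live in a small in-memory buffer that is scanned exactly at query time and merged with results from the main index, so they are visible before background indexing catches up. The `MainIndex` interface is hypothetical, not Vectroid's API:

```python
# Sketch of an in-memory write buffer in front of an ANN index.
# Hypothetical interfaces; illustrates the pattern, not Vectroid's code.
import heapq
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

class BufferedIndex:
    def __init__(self, main_index):
        self.main = main_index   # HNSW-backed index (assumed interface:
                                 # search(query, k) -> [(score, id), ...])
        self.buffer = []         # (id, vector) pairs awaiting indexing

    def insert(self, vec_id, vec):
        self.buffer.append((vec_id, vec))   # searchable immediately

    def search(self, query, k):
        # Exact brute-force scan over the small buffer...
        hits = [(cosine(query, v), i) for i, v in self.buffer]
        # ...merged with approximate results from the main index.
        hits += self.main.search(query, k)
        return heapq.nlargest(k, hits)
```

Because the buffer stays small (it drains into the batched indexing pipeline), the brute-force scan adds little latency while closing the freshness gap.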

Final Thoughts

We’re just getting started. If you’re building applications that rely on fast, scalable vector search (or you’re running up against the limits of your current stack), we’d love to hear from you. Start using Vectroid today or sign up for our newsletter to follow along as we continue building.
