Hey HN!
We’ve just open-sourced model2vec-rs, a Rust crate for loading and running Model2Vec static embedding models with zero Python dependency. This allows you to embed text at (very) high throughput, for example in a Rust-based microservice or CLI tool. It can be used for semantic search, retrieval, RAG, or any other text embedding use case.
Main Features:
- Rust-native inference: Load any Model2Vec model from Hugging Face or from a local path with StaticModel::from_pretrained(...), as sketched below.
- Tiny footprint: The crate itself is only ~1.7 MB, with embedding models between 7 and 30 MB.
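For a sense of the API, here is a minimal usage sketch: load a model and embed a few sentences. Only StaticModel::from_pretrained(...) is mentioned above; the extra arguments, the encode method, and the model id are assumptions, so check the crate docs for the exact signatures.

```
use model2vec_rs::model::StaticModel;

fn main() {
    // Load a Model2Vec model from the Hugging Face Hub or a local path.
    // The trailing None arguments (e.g. token / normalization / subfolder) are assumed here.
    let model = StaticModel::from_pretrained("minishlab/potion-base-8M", None, None, None)
        .expect("failed to load model");

    // Embed a batch of sentences; assume one Vec<f32> is returned per input sentence.
    let sentences = vec![
        "Hello world".to_string(),
        "Static embeddings are fast".to_string(),
    ];
    let embeddings: Vec<Vec<f32>> = model.encode(&sentences);
    println!("{} embeddings of dimension {}", embeddings.len(), embeddings[0].len());
}
```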
Performance:
We benchmarked single-threaded on a CPU:
- Python: ~4650 embeddings/sec
- Rust: ~8000 embeddings/sec (~1.7× speedup)
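For reference, here is a rough sketch of how such a single-threaded throughput number could be measured. This is not the actual benchmark harness; the model id and the encode call are assumptions carried over from the sketch above.

```
use std::time::Instant;
use model2vec_rs::model::StaticModel;

fn main() {
    // Hypothetical setup: load a small Model2Vec model and time how many
    // sentences a single thread can embed per second.
    let model = StaticModel::from_pretrained("minishlab/potion-base-8M", None, None, None)
        .expect("failed to load model");

    let sentences: Vec<String> = (0..10_000)
        .map(|i| format!("This is test sentence number {i}."))
        .collect();

    let start = Instant::now();
    let _embeddings = model.encode(&sentences);
    let elapsed = start.elapsed().as_secs_f64();

    println!("{:.0} embeddings/sec", sentences.len() as f64 / elapsed);
}
```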
This is our first open-source project in Rust, so it would be great to get some feedback!
Edit: it seems like it just splits into sentences, which is a weird thing to do given that in English only ~95% agreement is even possible on what a sentence is.

```
// Process in batches
for batch in sentences.chunks(batch_size) {
    // Truncate each sentence to max_length * median_token_length chars
    let truncated: Vec<&str> = batch
        .iter()
        .map(|text| {
            if let Some(max_tok) = max_length {
                Self::truncate_str(text, max_tok, self.median_token_length)
            } else {
                text.as_str()
            }
        })
        .collect();
```