Show HN: Model2vec-Rs – Fast Static Text Embeddings in Rust


Original link: https://github.com/MinishLab/model2vec-rs

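The `model2vec-rs` crate provides a lightweight Rust implementation for loading Model2Vec static embedding models and running inference with them. For distillation and training, the Python `Model2Vec` package remains the tool to use. `StaticModel::from_pretrained` loads a model from the Hugging Face Hub or from a local path. The crate encodes a list of sentences into embeddings, either directly with `encode` or through `encode_with_args`, which exposes configurable parameters such as `max_length` and `batch_size`. A bundled command-line interface generates embeddings from a single sentence or from a whole file. In a single-threaded CPU benchmark, the Rust implementation reached roughly 8000 samples/second versus 4650 samples/second for the Python version, a speedup of about 1.7×. Several pretrained models are available on the Hugging Face Hub for immediate use. The crate is MIT-licensed.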

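Model2vec-rs, a new Rust library from MinishLab, provides fast static text embeddings with no Python dependency. It lets Rust applications produce text embeddings at high throughput for tasks such as semantic search and RAG. Key features include Rust-native inference for Model2Vec models loaded from Hugging Face or a local path via `StaticModel::from_pretrained`, and a small footprint (the library is about 1.7 MB and models range from 7 to 30 MB). The benchmark shows about 8000 embeddings/second on CPU versus about 4650 embeddings/second for Python, a speedup of roughly 1.7×. The discussion covered how the library handles long documents, where users can set a token truncation length, and how to load custom models. The team also shared model and benchmark information to help readers judge fit for their use case. The authors welcome feedback and contributions to help grow the Rust machine learning ecosystem.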


Original text

This crate provides a lightweight Rust implementation for loading and inference of Model2Vec static embedding models. For distillation and training, the Python Model2Vec package can be used.

Add the crate:

cargo add model2vec-rs

Make embeddings:

use anyhow::Result;
use model2vec_rs::model::StaticModel;

fn main() -> Result<()> {
    // Load a model from the Hugging Face Hub or a local path
    // args = (repo_or_path, token, normalize, subfolder)
    let model = StaticModel::from_pretrained("minishlab/potion-base-8M", None, None, None)?;

    // Prepare a list of sentences
    let sentences = vec![
        "Hello world".to_string(),
        "Rust is awesome".to_string(),
    ];

    // Create embeddings
    let embeddings = model.encode(&sentences);
    println!("Embeddings: {:?}", embeddings);

    Ok(())
}
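Once embeddings exist, a common next step for semantic search is to compare them. The following is a minimal sketch of cosine similarity over two embedding vectors. It assumes each embedding can be viewed as a slice of `f32` values; the `cosine_similarity` helper and the sample vectors are illustrative and are not part of the crate.

```rust
/// Illustrative helper (not part of model2vec-rs): cosine similarity
/// between two vectors of equal dimension.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same dimension");

    // Dot product and Euclidean norms.
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();

    // Define the similarity with a zero vector as 0 to avoid dividing by zero.
    if norm_a == 0.0 || norm_b == 0.0 {
        return 0.0;
    }
    dot / (norm_a * norm_b)
}

fn main() {
    // Stand-in vectors; in practice these would be two rows of the
    // output of `model.encode(&sentences)`.
    let first = vec![0.5_f32, 0.5, 0.0];
    let second = vec![0.5_f32, 0.0, 0.5];
    println!("similarity: {}", cosine_similarity(&first, &second));
}
```

If the model was loaded with normalization enabled, the vectors already have unit length and the dot product alone would give the same ranking.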

Make embeddings with the CLI:

# Single sentence
cargo run -- encode "Hello world" minishlab/potion-base-8M

# Multiple lines from a file
echo -e "Hello world\nRust is awesome" > input.txt
cargo run -- encode input.txt minishlab/potion-base-8M --output embeds.json

Make embeddings with custom encode args:

let embeddings = model.encode_with_args(
    &sentences,     // input texts
    Some(512),  // max length
    1024,       // batch size
);
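To make the two arguments concrete, here is a rough sketch of what a maximum length and a batch size typically mean for an encoder: each token sequence is cut to at most `max_length` tokens, and sequences are processed in groups of `batch_size`. The `truncate_and_batch` function below is hypothetical and written only for illustration; it is not the crate's internal code, and the crate's actual truncation rules may differ.

```rust
/// Hypothetical helper (not the crate's implementation): truncate each
/// token sequence to `max_length` tokens when a limit is given, then group
/// the sequences into batches of at most `batch_size`.
fn truncate_and_batch(
    token_ids: &[Vec<u32>],
    max_length: Option<usize>,
    batch_size: usize,
) -> Vec<Vec<Vec<u32>>> {
    assert!(batch_size > 0, "batch_size must be positive");

    token_ids
        .chunks(batch_size)
        .map(|batch| {
            batch
                .iter()
                .map(|ids| match max_length {
                    // Keep only the first `limit` tokens.
                    Some(limit) => ids.iter().copied().take(limit).collect::<Vec<u32>>(),
                    // No limit: keep the sequence unchanged.
                    None => ids.clone(),
                })
                .collect::<Vec<Vec<u32>>>()
        })
        .collect()
}

fn main() {
    // Made-up token ids standing in for tokenizer output.
    let token_ids = vec![vec![1, 2, 3, 4, 5], vec![6, 7], vec![8, 9, 10]];
    let batches = truncate_and_batch(&token_ids, Some(3), 2);
    println!("{:?}", batches);
}
```

A smaller `max_length` bounds the work done per long document, while a larger `batch_size` trades memory for fewer passes over the input.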

We provide a number of models that can be used out of the box. These models are available on the Hugging Face Hub and can be loaded using the `from_pretrained` method. The models are listed below.

We compared the performance of the Rust implementation with the Python version of Model2Vec. The benchmark was run single-threaded on a CPU.

Implementation	Throughput
Rust	8000 samples/second
Python	4650 samples/second

The Rust version is roughly 1.7× faster than the Python version.
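The 1.7× figure follows directly from the two throughput numbers (8000 / 4650 ≈ 1.72). The sketch below shows that arithmetic, along with one simple way to derive samples/second from a timed run. The helper names `throughput` and `speedup` are illustrative assumptions and do not reflect how the authors ran their benchmark.

```rust
use std::time::Duration;

/// Illustrative helper: samples processed per second for a timed run.
fn throughput(samples: usize, elapsed: Duration) -> f64 {
    samples as f64 / elapsed.as_secs_f64()
}

/// Illustrative helper: how many times faster `candidate` is than `baseline`.
fn speedup(candidate: f64, baseline: f64) -> f64 {
    candidate / baseline
}

fn main() {
    // Throughput figures reported in the benchmark table (samples/second).
    let rust = 8000.0;
    let python = 4650.0;
    println!("speedup: {:.2}x", speedup(rust, python));

    // Example: 16000 samples encoded in 2 seconds.
    let rate = throughput(16_000, Duration::from_secs(2));
    println!("throughput: {:.0} samples/second", rate);
}
```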

License: MIT
