Show HN: Sweep, Open-weights 1.5B model for next-edit autocomplete

Original link: https://huggingface.co/sweepai/sweep-next-edit-1.5B

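## Sweep: an open-source next-edit autocomplete model

William Zeng and the Sweep AI team released Sweep, a 1.5B-parameter open-weights model built for "next-edit" autocomplete: predicting code edits from recent changes, similar to Cursor. Sweep is available on Hugging Face and as a JetBrains plugin, and aims to beat larger models on both speed and accuracy while staying small enough to run locally. Key findings from their research: a simple "original/updated" diff format serves the model's understanding better than the unified diff format, and reinforcement learning (RL) can refine the results of supervised fine-tuning (SFT). Training took roughly 4 hours on 8x H100 GPUs. The team encourages community contributions and welcomes integrations with editors such as VSCode and Neovim. One commenter, plutodev, highlighted the training efficiency and suggested decentralized GPU aggregators such as io.net as an alternative infrastructure option.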
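To make the contrast between the two diff styles concrete, here is a minimal sketch that renders the same one-line change both ways. The unified diff comes from Python's standard `difflib`. The "original/updated" rendering, including its `File:`, `original:` and `updated:` labels, is an assumption made for illustration and not Sweep's actual training format.

```python
import difflib


def unified_diff(original: str, updated: str, path: str = "example.py") -> str:
    """Render a change as a standard unified diff (headers, hunk markers, +/- lines)."""
    lines = difflib.unified_diff(
        original.splitlines(keepends=True),
        updated.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    )
    return "".join(lines)


def original_updated(original: str, updated: str, path: str = "example.py") -> str:
    """Render the same change as two plain blocks: the text before, then the text after.

    The labels used here are hypothetical, chosen only for this illustration.
    """
    return f"File: {path}\noriginal:\n{original}updated:\n{updated}"


before = "def area(w, h):\n    return w + h\n"
after = "def area(w, h):\n    return w * h\n"

print(unified_diff(before, after))
print(original_updated(before, after))
```

The unified form makes the reader track hunk headers and per-line `+`/`-` prefixes, while the original/updated form simply shows the code before and after the change.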

A 1.5B parameter model for next-edit autocomplete, quantized to Q8_0 GGUF format.

Model Description

Sweep Next-Edit predicts your next code edit before you make it. It runs locally on your laptop in under 500ms (with speculative decoding) and outperforms models over 4x its size on next-edit benchmarks.

Usage

Download run_model.py and the model file, then:

uv pip install llama-cpp-python huggingface_hub
python run_model.py

Model Details

Format: GGUF (Q8_0 quantization)
Parameters: 1.5B
Context Length: 8192 tokens
Base Model: Qwen2.5-Coder
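
As a rough sanity check on what these numbers mean for disk and memory use, the sketch below estimates the size of the weights under Q8_0. It assumes the usual Q8_0 layout (blocks of 32 int8 weights plus one 16-bit scale) and the nominal 1.5B parameter count. The real file will differ somewhat, since the exact parameter count is not stated here, GGUF files carry metadata, and some tensors may be stored in other types.

```python
# Back-of-the-envelope size estimate for Q8_0 weights.
# Assumption: each Q8_0 block holds 32 int8 weights plus one fp16 scale.
BLOCK_WEIGHTS = 32
BLOCK_BYTES = BLOCK_WEIGHTS * 1 + 2  # 32 one-byte weights + 2-byte scale = 34 bytes


def q8_0_bytes(n_params: float) -> float:
    """Approximate bytes needed to store n_params weights in Q8_0."""
    return n_params / BLOCK_WEIGHTS * BLOCK_BYTES


bits_per_weight = BLOCK_BYTES * 8 / BLOCK_WEIGHTS  # effective bits per weight
size_gb = q8_0_bytes(1.5e9) / 1e9  # nominal 1.5B parameters, decimal gigabytes

print(f"{bits_per_weight} bits per weight, about {size_gb:.2f} GB of weights")
```

Under these assumptions the weights come to roughly 1.6 GB, which is consistent with the claim that the model runs locally on a laptop.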

Example

The model uses a specific prompt format with file context, recent diffs, and current state to predict the next edit. See run_model.py for a complete example.
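
The three ingredients named above can be pictured with a small sketch that assembles them into a single prompt string. Everything about the layout here (the `###` section markers, the `original:`/`updated:` labels, the `RecentEdit` type and the `build_prompt` function) is hypothetical and only illustrates the idea. The actual template is the one in run_model.py.

```python
from dataclasses import dataclass


@dataclass
class RecentEdit:
    """One recent change, kept as plain before/after text (hypothetical type)."""
    path: str
    original: str
    updated: str


def build_prompt(path: str, file_context: str,
                 recent_edits: list[RecentEdit], current_state: str) -> str:
    """Join file context, recent diffs, and current state into one prompt.

    The section markers are illustrative, not the model's real format.
    """
    parts = [f"### File: {path}", file_context.rstrip("\n"), "", "### Recent edits"]
    for edit in recent_edits:
        parts += [
            f"# {edit.path}",
            "original:", edit.original.rstrip("\n"),
            "updated:", edit.updated.rstrip("\n"),
        ]
    parts += ["", "### Current state", current_state.rstrip("\n"), "", "### Next edit", ""]
    return "\n".join(parts)


prompt = build_prompt(
    path="geometry.py",
    file_context="def area(w, h):\n    return w * h\n\ndef perimeter(w, h):\n    return w + h\n",
    recent_edits=[RecentEdit("geometry.py", "    return w + h", "    return w * h")],
    current_state="def perimeter(w, h):\n    return w + h\n",
)
print(prompt)
```

In this toy case the recent edit fixed `area`, so a next-edit model would be expected to propose a related fix to `perimeter`. The sketch only builds the input and does not call the model.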

License

Apache 2.0