Salomi, a research repo on extreme low-bit transformer quantization

原始链接: https://github.com/OrionsLock/SALOMI

SALOMI is a research repository investigating extreme low-bit Transformer quantization, specifically whether binary or near-binary weight representations can rival ternary methods. It provides tooling for quantization, inference, evaluation, and experimentation, but is intended as a research workspace rather than a ready-to-use package. The main finding is that strict 1-bit post-hoc quantization is not viable for GPT-2-class language modeling; slightly higher bitrates (~1.2-1.35 bpp), using techniques such as Hessian-guided vector quantization, can produce more practical results. The repository includes extensive documentation, notably `RESEARCH.md`, which gives a comprehensive overview, and `HONEST_ASSESSMENT.md`, which offers a realistic appraisal of the results. Historical experiment files exist, but users are advised to prioritize the curated documentation and validated tests for the most accurate understanding of the project's current conclusions. The code is licensed under Apache-2.0.


Original

SALOMI is a research repository focused on extreme low-bit transformer quantization and inference, especially the question of whether binary or near-binary weight representations can approach or exceed ternary baselines under realistic evaluation.

This repository contains:

  • the onebit/ package for quantization, runtime inference, evaluation, kernels, and related tooling,
  • a large tests/ tree for validation and experimentation,
  • research writeups under docs/,
  • and historical paper-style materials under onebit/research/paper/.

This repository is best treated as a research workspace rather than a one-command product package.

Typical setup:

python -m venv .venv
.venv\Scripts\activate        # Windows; on Unix: source .venv/bin/activate
pip install -r requirements.txt
pytest

Notes:

  • pyopencl is optional unless you want to explore the OpenCL backend.
  • some research scripts expect Hugging Face model/data downloads and may require extra environment setup or credentials depending on your machine state.
  • for a guided overview, read RESEARCH.md before running older experiment scripts.

This is a research repository, not a polished production package.

The most important repo-level conclusions are:

  • strict 1.00 bpp post-hoc binary quantization does not hold up as a strong GPT-2–class language modeling solution under rigorous evaluation
  • the more credible practical results in this repo cluster around ~1.2-1.35 bpp using Hessian-guided VQ, mixed precision, or magnitude-recovery methods

Some materials under onebit/research/paper/ preserve earlier, more optimistic draft claims. For the most defensible current interpretation of the repository, prefer:

  • RESEARCH.md — comprehensive repo-level research report and maturity assessment
  • docs/HONEST_ASSESSMENT.md — strongest reality-check document
  • docs/PROJECT_ANALYSIS_SUMMARY.md — validation and failure-mode summary
  • docs/REPOSITORY_GUIDE.md — curated technical guide to the repository
  • docs/ARCHIVE.md — explanation of historical experiment files and naming
  • REPRODUCIBILITY.md — environment and rerun guidance
  • CONTRIBUTING.md — contribution and repo hygiene expectations

over historical paper-draft numbers when they conflict.
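To make the "strict 1.00 bpp" constraint concrete, here is a minimal sketch of post-hoc sign-plus-scale binarization (XNOR-Net-style per-row scaling). This is an illustrative reconstruction, not SALOMI's exact code; the function name and shapes are made up for the example.

```python
import numpy as np

def binarize_rows(W):
    """Strict 1-bit quantization: keep only the sign of each weight
    plus one floating-point scale per row (illustrative sketch)."""
    alpha = np.abs(W).mean(axis=1, keepdims=True)  # per-row scale
    B = np.sign(W)
    B[B == 0] = 1.0                                # avoid zero codes
    return alpha, B

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 128)).astype(np.float32)
alpha, B = binarize_rows(W)
W_hat = alpha * B                                  # dequantized weights
mse = float(np.mean((W - W_hat) ** 2))

# Storage accounting: 1 bit per weight plus one fp16 scale per row.
bpp = (W.size * 1 + W.shape[0] * 16) / W.size      # 1.125 bpp here
```

Even at the optimal scale, the residual error of pure sign quantization on Gaussian-like weights is large, which is consistent with the repo's conclusion that strict 1-bit post-hoc quantization degrades GPT-2-class models badly.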

What Makes This Public-Ready

This repo has been curated to improve GitHub readiness:

  • README.md gives the top-level framing
  • RESEARCH.md is the comprehensive research report
  • requirements.txt documents the dependency surface
  • .gitignore excludes common local caches and transient files
  • LICENSE now provides clear reuse terms under Apache-2.0

This repository is licensed under Apache-2.0. See LICENSE.

SALOMI/
├── README.md
├── RESEARCH.md
├── onebit/
├── docs/
├── tests/
└── research/    (result artifacts and experiment scripts)

The strongest honest framing for this project is:

A serious research and systems exploration of extreme LLM quantization, including both promising methods and rigorous evidence about where naive sub-1-bit claims break down.
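The ~1.2-1.35 bpp figures quoted above can be understood with simple storage accounting for vector quantization: index bits per weight plus an amortized codebook. The group size, codebook size, and model size below are hypothetical, chosen only to show the arithmetic.

```python
import math

def effective_bpp(d, K, n_weights, codebook_bits=16):
    """Effective bits per parameter for grouped VQ:
    log2(K) index bits per d-weight group, plus the shared
    codebook amortized over the model (illustrative accounting)."""
    index_bpp = math.log2(K) / d
    codebook_bpp = (K * d * codebook_bits) / n_weights
    return index_bpp + codebook_bpp

# e.g. 8-dim groups, 1024-entry fp16 codebook, a GPT-2-scale model
bpp = effective_bpp(d=8, K=1024, n_weights=124_000_000)
```

With these (assumed) settings the index cost alone is log2(1024)/8 = 1.25 bpp, and the shared codebook adds almost nothing, landing in the 1.2-1.35 bpp range the repo identifies as practically credible.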

Some filenames, especially under onebit/research/, preserve the chronology of the work rather than an ideal public taxonomy. Names like novel_ideas_v*.py are intentionally kept as part of the research trail. Public-facing readers should prioritize the curated documents and validated test paths over historical experiment filenames.

Recommended Reading Order

  1. README.md
  2. RESEARCH.md
  3. docs/HONEST_ASSESSMENT.md
  4. docs/PROJECT_ANALYSIS_SUMMARY.md
  5. docs/REPOSITORY_GUIDE.md

If you want the corrected, defensible story of the repo, read in that order before opening the historical paper drafts.
