Show HN: TurboQuant-WASM – Google's vector quantization in the browser


Original link: https://github.com/teamchong/turboquant-wasm

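## TurboQuant-WASM: Efficient vector quantization in the browser and Node.js

TurboQuant-WASM uses WebAssembly (WASM) to bring state-of-the-art vector quantization, based on Google Research's "TurboQuant" paper, to web browsers and Node.js. The implementation achieves roughly 6x compression (~4.5 bits per dimension) while preserving inner-product accuracy, verified by rigorous golden-value tests. Key features include a TypeScript API for easy integration (`TurboQuant.init()`, `encode()`, `decode()`, `dot()`), relaxed SIMD optimizations for performance (using FMA instructions), and a compact npm package (`turboquant-wasm`). A live demo shows vector search, image similarity, and 3D Gaussian Splatting compression running directly in the browser. Building the WASM binary requires Zig 0.15.2 and Bun, and the package is compatible with modern browsers (Chrome 114+, Firefox 128+, Safari 18+) and Node.js 20+. The project is MIT-licensed and produces bit-identical output to the original Zig implementation.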

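## TurboQuant-WASM: Vector quantization in the browser

A new project called TurboQuant-WASM (TQ) uses WebAssembly to bring Google's vector quantization technique to the browser. It promises smaller downloads, a key advantage in browser environments where gzip compression gains are limited, but early user testing shows performance trade-offs. One user found that TQ delivers search quality similar to 32-bit floats when using 8-bit quantization, but is noticeably *slower* without GPU acceleration. Another user implemented TQ as a SQLite extension and confirmed the space savings, but saw slower query times. Despite interest in its potential, some commenters were skeptical, citing a slow demo (800 ms vs 2.6 ms) and questioning the value of the space savings (1.2 MB vs 7.2 MB of memory) when search speed suffers significantly. Concerns were also raised about the project's origins and the quality of the initial discussion.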

Original text

Experimental WASM + relaxed SIMD build of botirk38/turboquant for browsers and Node.js.

Based on the paper "TurboQuant: Online Vector Quantization with Near-optimal Distortion Rate" (Google Research, ICLR 2026).

Live Demo — vector search, image similarity, and 3D Gaussian Splatting compression running in the browser.

- npm package with embedded WASM: `npm install turboquant-wasm`
- Relaxed SIMD: `@mulAdd` FMA maps to `f32x4.relaxed_madd`
- SIMD-vectorized QJL sign packing/unpacking and scaling
- TypeScript API: `TurboQuant.init()` / `encode()` / `decode()` / `dot()`
- Golden-value tests: byte-identical output with the reference Zig implementation

The WASM binary uses relaxed SIMD instructions:

| Runtime | Minimum Version |
| --- | --- |
| Chrome | 114+ |
| Firefox | 128+ |
| Safari | 18+ |
| Node.js | 20+ |
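
Because the binary depends on relaxed SIMD, a page may want to check for support before loading it. The following is a minimal sketch and not part of the package: `supportsRelaxedSimd` is a hypothetical helper, and the probe bytes are a hand-assembled module (a single function ending in `i8x16.relaxed_swizzle`) in the style used by feature-detection libraries.

```typescript
// Tiny WASM module whose only function uses a relaxed SIMD instruction.
// Engines without relaxed SIMD reject it during validation.
const RELAXED_SIMD_PROBE = new Uint8Array([
  0, 97, 115, 109, 1, 0, 0, 0, // "\0asm" magic + version 1
  1, 5, 1, 96, 0, 1, 123,      // type section: () -> v128
  3, 2, 1, 0,                  // function section: one function of type 0
  10, 15, 1, 13, 0,            // code section: one body, 13 bytes, no locals
  65, 1, 253, 15,              // i32.const 1; i8x16.splat
  65, 2, 253, 15,              // i32.const 2; i8x16.splat
  253, 128, 2,                 // i8x16.relaxed_swizzle
  11,                          // end
]);

// Hypothetical helper: true if the current runtime validates relaxed SIMD code.
function supportsRelaxedSimd(): boolean {
  return typeof WebAssembly === "object" && WebAssembly.validate(RELAXED_SIMD_PROBE);
}
```

A caller could use the result to show a fallback message instead of failing at instantiation time.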

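```typescript
import { TurboQuant } from "turboquant-wasm";

const tq = await TurboQuant.init({ dim: 1024, seed: 42 });

// Placeholder inputs for illustration; substitute your own embeddings
// (length assumed to match `dim`)
const myFloat32Array = Float32Array.from({ length: 1024 }, () => Math.random() - 0.5);
const queryVector = Float32Array.from({ length: 1024 }, () => Math.random() - 0.5);

// Compress a vector (~4.5 bits/dim, ~6x compression)
const compressed = tq.encode(myFloat32Array);

// Decode back
const decoded = tq.decode(compressed);

// Fast dot product without decoding
const score = tq.dot(queryVector, compressed);

// Release the instance when finished
tq.destroy();
```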

```typescript
// API surface (declaration only)
declare class TurboQuant {
  // Asynchronous: returns a Promise
  static init(config: { dim: number; seed: number }): Promise<TurboQuant>;
  encode(vector: Float32Array): Uint8Array;
  decode(compressed: Uint8Array): Float32Array;
  dot(query: Float32Array, compressed: Uint8Array): number;
  destroy(): void;
}
```
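
To illustrate how `dot()` might drive a brute-force search over compressed vectors, here is a minimal sketch. `Scorer`, `Hit`, and `topK` are hypothetical names and not part of the package. `Scorer` only mirrors the `dot()` signature listed above, on the assumption that a `TurboQuant` instance could be passed in directly.

```typescript
// Anything exposing dot(query, compressed), e.g. a TurboQuant instance (assumption).
interface Scorer {
  dot(query: Float32Array, compressed: Uint8Array): number;
}

interface Hit {
  index: number; // position of the vector in the corpus array
  score: number; // estimated inner product with the query
}

// Score every compressed vector against the query and keep the k best.
function topK(scorer: Scorer, query: Float32Array, corpus: Uint8Array[], k: number): Hit[] {
  const hits: Hit[] = corpus.map((compressed, index) => ({
    index,
    score: scorer.dot(query, compressed),
  }));
  hits.sort((a, b) => b.score - a.score); // highest inner product first
  return hits.slice(0, k);
}
```

A call might look like `topK(tq, queryVector, compressedVectors, 10)`. This is a linear scan, so its cost grows with corpus size.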

```sh
# Run tests
zig test -target aarch64-macos src/turboquant.zig

# Full npm build (zig -> wasm-opt -> base64 embed -> bun + tsc)
bun run build

# Build WASM only
bun run build:zig
```

Requires Zig 0.15.2 and Bun.

Encoding preserves inner products, verified by golden-value tests and distortion bounds:

- MSE decreases with dimension (unit vectors)
- Bits/dim is ~4.5 (payload only, excluding 22-byte header)
- Dot product preservation: mean absolute error < 1.0 for unit vectors at dim=128
- Bit-identical output with botirk38/turboquant for same input + seed
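
As a rough worked example of what these figures imply for storage, the sketch below estimates the encoded size from the ~4.5 bits/dim payload and the 22-byte header quoted above. `estimateCompressedBytes` and `compressionRatio` are hypothetical helpers. The real encoder's exact byte count may differ (for instance through rounding or padding), so the results are estimates only.

```typescript
const BITS_PER_DIM = 4.5; // approximate payload rate quoted above
const HEADER_BYTES = 22;  // header size quoted above

// Estimated bytes for one encoded vector: header plus payload rounded up to whole bytes.
function estimateCompressedBytes(dim: number): number {
  return HEADER_BYTES + Math.ceil((dim * BITS_PER_DIM) / 8);
}

// Ratio of raw float32 storage to the estimated compressed size.
function compressionRatio(dim: number): number {
  const rawBytes = dim * 4; // float32 = 4 bytes per dimension
  return rawBytes / estimateCompressedBytes(dim);
}
```

Under these assumptions a 1024-dimensional vector comes to about 598 bytes instead of 4096, a ratio of roughly 6.8x. At dim=128 the header weighs more (94 bytes instead of 512) and the ratio drops to about 5.4x.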

License: MIT
