How an inference provider can prove they're not serving a quantized model

Original link: https://tinfoil.sh/blog/2026-02-03-proving-model-identity

## Verifiable Inference with Tinfoil's Modelwrap

Tinfoil built **Modelwrap**, a system that guarantees the integrity of the model weights used behind an inference API. Rather than merely specifying a model name, Modelwrap cryptographically verifies that the *exact*, untampered weights are being served, a guarantee often missing even for open-source models.

Modelwrap achieves this with a combination of **Merkle trees** (for efficient verification of large data) and **dm-verity** (a kernel-level system that enforces cryptographic checks on every disk read). It works by creating an "attested disk" that validates the model weights against a publicly committed hash at *runtime*, preventing silent downgrades or modifications.

Secure enclaves already attest the code loaded at launch; Modelwrap extends that guarantee to data loaded *after* boot, ensuring the weights have not been altered. This is valuable for public and private models alike: users can independently verify public models, while private models benefit from consistent, verifiable performance.

The performance overhead is small: initial model loading is roughly 80% slower, but inference speed is unaffected once loading completes. Modelwrap is open source and available on GitHub, offering a new standard of trust and transparency for AI inference.



Feb 3, 2026 · 12 min read

Tinfoil Team

Updated Feb 18, 2026

When you call an inference API, how do you know which model is actually serving your request? Sure, you can specify the name of the model you expect to process your request, but ultimately you have no guarantee that the provider is actually serving it.

When talking to an open-source model, are you being served the exact weights that the model publisher released on Hugging Face? Or is it a silently quantized version, or a version with a smaller context window that changes based on how much traffic the provider is experiencing?

The situation gets even murkier when using a closed-source model provider. How do you know that you are getting the same model each time?

People routinely report wide variation in evals across providers (and sometimes across time within the same provider, see Claude Opus performance tracking). And sometimes, accidental misconfigurations can silently degrade quality.

Verifiable Inference with Modelwrap

At Tinfoil, we built Modelwrap, a system that cryptographically guarantees we are serving a specific, untampered set of weights, which our clients can verify on each request. This is a very strong guarantee that we have so far not seen any other inference provider offer. At its core, Modelwrap consists of the following components:

  1. A public commitment to model weights
  2. A mechanism to bind the public commitment to the inference server
  3. A process to verify, client-side, that the inference server is using the committed model weights

In this post, we go into the technical details of how we built Modelwrap.

Why is this harder than just vanilla attestation?

We run models in secure hardware enclaves, which already allow us to prove which code we're running inside the enclave through attestation. In a nutshell, the way that attestation works is by measuring the initial state of the machine at launch time. When you boot an enclave with a given binary, the hardware produces a signed report that proves exactly what binary was loaded.

But attestation measures launch state, not runtime state. Weights are loaded from disk storage after the enclave has already booted, so a basic attestation report says nothing about this post-boot load. The real problem becomes:

How do we make attestation meaningfully bind to data that is fetched later?

The trick is to attest that the enclave includes both the expected hash of the data and some code that will check that hash after it's loaded. With Modelwrap we end up attesting two things:

  1. The cryptographic commitment to the model weights (a single root hash via a Merkle tree)
  2. The enforcement mechanism that checks the commitment (kernel-level verification on every read via dm-verity)

We're still using boot-time attestation, but now the attestation process proves that "the system is configured so that it cannot read bytes that don't match the committed hash," which yields a runtime guarantee. The enforcement mechanism we use under the hood is called dm-verity, a kernel-level system for verifying cryptographic commitments of read-only filesystems.
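The binding described above can be sketched in a few lines. This is a toy illustration with made-up values, not Tinfoil's actual measurement scheme: the point is only that the launch-time measurement covers both the verification code and the committed root hash, so swapping the weights changes the attestation report.

```python
import hashlib

def launch_measurement(kernel_image: bytes, cmdline: str) -> str:
    # The enclave hardware hashes everything present at boot; here we
    # model that as a hash over the kernel plus its command line, which
    # carries the dm-verity root hash (the commitment to the weights).
    return hashlib.sha384(kernel_image + cmdline.encode()).hexdigest()

kernel = b"linux-with-dm-verity"            # placeholder launch code
root_hash = "a" * 64                        # placeholder weight commitment
m1 = launch_measurement(kernel, f"roothash={root_hash}")

# Different weights -> different root hash -> different measurement,
# so the client's verification of the attestation report fails.
m2 = launch_measurement(kernel, f"roothash={'b' * 64}")
assert m1 != m2
```

The design point is that nothing new needs to be measured at runtime: the boot-time measurement already pins down the only hash the kernel will ever accept on reads.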

Building Blocks

Before diving in, we need to explain the building blocks that make up Modelwrap.

Merkle Tree

Suppose the model weights are 140GB. We would like to prove that "these are exactly the weights we publicly committed to". A Merkle tree lets you authenticate large amounts of data (140GB) with a small commitment (32 bytes). The idea is to split the data into fixed-size blocks of 4KB each, hash each block, then hash pairs of those hashes together, and keep going until you're left with a single root hash. If any part of the data changes anywhere, the root hash also changes. Such a hash tree makes it easy to verify that any single block corresponds to the commitment at read time: all we need is to check hashes along the path from the block to that root.
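The construction can be sketched as follows. This is a simplified binary tree: real dm-verity trees are wider (many hashes per node) and salted, and the choice of SHA-256 here is an assumption, but the shape of the argument is the same.

```python
import hashlib

BLOCK_SIZE = 4096  # fixed-size 4KB blocks, as in the text

def merkle_root(data: bytes) -> bytes:
    # 1. Hash each fixed-size block: these are the leaves.
    level = [hashlib.sha256(data[i:i + BLOCK_SIZE]).digest()
             for i in range(0, max(len(data), 1), BLOCK_SIZE)]
    # 2. Repeatedly hash pairs together until one root hash remains.
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate an odd leftover node
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0]

weights = b"\x00" * (4 * BLOCK_SIZE)   # stand-in for 140GB of weights
root = merkle_root(weights)            # the 32-byte public commitment
assert len(root) == 32
# Flipping a single byte anywhere changes the root.
assert merkle_root(b"\x01" + weights[1:]) != root
```

Verifying one block at read time only needs the hashes along its path to the root, which is logarithmic in the number of blocks rather than a rescan of all 140GB.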

Merkle tree structure for model weights

Merkle trees are commonly used in situations where large amounts of data must be verified piecewise, such as in Certificate Transparency and Sigstore.

dm-verity

The Merkle tree gives us a way to verify any block with one root hash. But that alone won't stop anyone from reading bad data. Something needs to actually enforce verification on every read. This is where dm-verity comes in. dm-verity is a Linux kernel subsystem that verifies disk reads using a Merkle tree. When the inference server (such as vLLM) calls read() to load model weights, dm-verity automatically intercepts the request, fetches the block, hashes it, walks the Merkle tree to the root, and compares against the provided root hash. If there's a match, it returns the data. If there is no match, it returns an I/O error and the application never receives the corrupted block.

We want to emphasize that the magic of dm-verity is that vLLM (or any application doing the reads) has no idea this verification is happening! This means there is no need for code changes or special APIs. Everything happens transparently at the kernel level. If the weights don't match the committed root hash, it simply becomes impossible to read from disk and the application gets an I/O error.
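The enforcement model can be illustrated with a toy read path. Real dm-verity runs inside the kernel and walks the full tree up to the root hash; this sketch checks only per-block hashes, but shows the key behavior: a tampered block produces an I/O error instead of bad data.

```python
import hashlib

BLOCK_SIZE = 4096

class VerifiedDisk:
    """Toy model of a dm-verity device: reads are checked before return."""

    def __init__(self, blocks: list[bytes]):
        # Expected hashes are computed once, at "format" time, from
        # trusted data (in dm-verity, this is the Merkle tree on disk).
        self.expected = [hashlib.sha256(b).digest() for b in blocks]
        self.blocks = blocks  # stands in for the untrusted disk contents

    def read(self, index: int) -> bytes:
        block = self.blocks[index]
        if hashlib.sha256(block).digest() != self.expected[index]:
            # The application never receives the corrupted block.
            raise IOError(f"verity: block {index} failed verification")
        return block

disk = VerifiedDisk([b"w" * BLOCK_SIZE, b"x" * BLOCK_SIZE])
assert disk.read(0) == b"w" * BLOCK_SIZE   # untampered block reads fine

disk.blocks[1] = b"tampered".ljust(BLOCK_SIZE, b"\x00")  # silent swap
try:
    disk.read(1)
except IOError:
    pass  # the read fails rather than returning modified weights
```

From the application's point of view (vLLM in our case), this is just a `read()` that either succeeds with correct bytes or fails with `EIO`; no code changes are needed.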

dm-verity comparison

This is how Android verified boot has worked since 2013. The bootloader passes a trusted root hash to the kernel, and every block read from the system partition gets checked against the hash tree. Billions of devices use this today to catch disk tampering.

Modelwrap Architecture

Modelwrap uses a combination of dm-verity and read-only filesystems to create what's called an "attested disk". This is a standard technique used in confidential computing for OS image integrity. For instance, this pattern has been used by Confidential Containers for the OS image and by Amazon Linux attestable AMIs.

The main insight behind Modelwrap is that model weights have the same properties as OS images: (1) they're large, (2) they're read-only at inference time, and (3) in our case they need strong integrity guarantees.

Modelwrap Architecture

Modelwrap proceeds in three phases:

1. Computing the model weight commitment

Modelwrap first downloads the model weights at a specific version. It then computes a Merkle hash tree (using dm-verity) and outputs the root hash as the public commitment. Anyone can run Modelwrap themselves to ensure that the root hash corresponds to the right model weights.

This commitment is provided to the kernel when booting a new secure enclave.
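As a purely illustrative fragment (the exact parameter name depends on how the verified device is assembled, and the hash value below is made up), such a commitment can travel on the kernel command line, where it becomes part of the attested launch state:

```
# Hypothetical kernel command line fragment: the dm-verity root hash is
# passed at boot, so it is covered by the launch-time measurement.
roothash=6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b
```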

Deterministic download

Modelwrap takes a Hugging Face model with an explicit commit hash:
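Pinning to an explicit commit hash works because Hugging Face serves files at revision-addressed URLs: a revision may be a mutable branch name like `main` or an immutable full git commit hash. A rough sketch (the repo, commit, and filename below are placeholders, not taken from the post):

```python
def resolve_url(repo: str, commit: str, filename: str) -> str:
    # huggingface.co/<repo>/resolve/<revision>/<file>; fetching by a
    # full commit hash yields the same bytes on every download.
    return f"https://huggingface.co/{repo}/resolve/{commit}/{filename}"

url = resolve_url(
    "example-org/example-model",               # placeholder repo
    "3f5c1d2e" + "0" * 32,                     # placeholder 40-char commit
    "model-00001-of-00002.safetensors",
)
assert url.startswith(
    "https://huggingface.co/example-org/example-model/resolve/3f5c1d2e"
)
```

Downloading at a pinned commit rather than a branch is what makes the root-hash computation reproducible: anyone re-running Modelwrap against the same commit should arrive at the same commitment.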