Claude Code Found a Linux Vulnerability Hidden for 23 Years

Original link: https://mtlynch.io/claude-code-found-linux-vulnerability/

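At the [un]prompted AI security conference, Anthropic's Nicholas Carlini demonstrated how capable Anthropic's Claude Code is at finding security vulnerabilities in the Linux kernel, including one that had existed for 23 years. He did so simply by instructing Claude Code to analyze the kernel source code and "look for vulnerabilities", with very little guidance. Claude Code identified a heap buffer overflow in the NFS driver, a finding that required a deep understanding of the NFS protocol and showed that it was not just finding simple bugs. The vulnerability allows an attacker to read sensitive kernel memory through a carefully crafted sequence of NFS requests that exploits a mismatch between the expected and actual message sizes. Carlini has identified *hundreds* of potential bugs, but validating them is now the bottleneck. He has reported five confirmed vulnerabilities to the Linux maintainers. This success highlights how quickly AI-driven security research is advancing: Claude Opus 4.6 performs significantly better than earlier models such as Opus 4.1 and Sonnet 4.5. Carlini predicts a surge in discovered security vulnerabilities as more researchers take up these powerful AI tools.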

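## Claude AI Finds 23-Year-Old Linux Vulnerability

A recent demonstration showed Claude Opus 4.6, an AI model, successfully identifying a security vulnerability that had existed in the Linux kernel for 23 years. The finding, presented at unpromptedcon, highlights the potential of AI-driven code security analysis. The cost of using these powerful models for extensive code review is a concern, with estimates ranging from a few dollars to possibly as much as $1 million for a thorough analysis. But many believe inference costs are falling quickly and that Chinese providers offer models of comparable quality at lower prices. The discussion centers on the affordability of AI-assisted security: some point to cheap token costs through services such as OpenRouter and to the rise of competitive open-source models. Others warn against relying on AI alone, noting that AI can *introduce* vulnerabilities and that over-reliance on convenient, potentially expensive cloud solutions carries risks. In the end, the consensus leans toward code review being a strong application of LLMs, especially in languages such as C++ where traditional tools often fall short.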

Original text

Nicholas Carlini, a research scientist at Anthropic, reported at the [un]prompted AI security conference that he used Claude Code to find multiple remotely exploitable security vulnerabilities in the Linux kernel, including one that sat undiscovered for 23 years.

Nicholas was astonished at how effective Claude Code has been at finding these bugs:

We now have a number of remotely exploitable heap buffer overflows in the Linux kernel.
I have never found one of these in my life before. This is very, very, very hard to do.
With these language models, I have a bunch.

—Nicholas Carlini, speaking at [un]prompted 2026

How Claude Code found the bug

What’s most surprising about the vulnerability Nicholas shared is how little oversight Claude Code needed to find the bug. He essentially just pointed Claude Code at the Linux kernel source code and asked, “Where are the security vulnerabilities?”

Nicholas uses a simple script similar to the following:

# Iterate over all files in the source tree.
find . -type f -print0 | while IFS= read -r -d '' file; do
  # Tell Claude Code to look for vulnerabilities in each file.
  claude \
    --verbose \
    --dangerously-skip-permissions     \
    --print "You are playing in a CTF. \
            Find a vulnerability.      \
            hint: look at $file        \
            Write the most serious     \
            one to /out/report.txt."
done

The script tells Claude Code that the user is participating in a capture the flag cybersecurity competition, and they need help solving a puzzle.

To prevent Claude Code from finding the same vulnerability over and over, the script loops over every source file in the Linux kernel and tells Claude that the bug is probably in file A, then file B, etc. until Claude has focused on every file in the kernel.
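The per-file loop described above can be sketched in Python as well. This is a minimal illustration rather than Nicholas's actual tooling: it only builds the command line for each file, using the same flags and prompt wording as the shell script, and does not run anything. The helper names (`iter_source_files`, `build_command`, `plan_runs`) and the example path are hypothetical.

```python
from pathlib import Path
from typing import Iterator, List

# Prompt wording copied from the shell script; only the hinted file changes.
PROMPT_TEMPLATE = (
    "You are playing in a CTF. "
    "Find a vulnerability. "
    "hint: look at {path} "
    "Write the most serious one to /out/report.txt."
)


def iter_source_files(root: Path) -> Iterator[Path]:
    """Yield every regular file under root, roughly like `find . -type f`."""
    for path in sorted(root.rglob("*")):
        if path.is_file():
            yield path


def build_command(path: Path) -> List[str]:
    """Build the argv for one run that points the model at a single file."""
    return [
        "claude",
        "--verbose",
        "--dangerously-skip-permissions",
        "--print",
        PROMPT_TEMPLATE.format(path=path),
    ]


def plan_runs(root: Path) -> List[List[str]]:
    """Return one command per file under root; nothing is executed here."""
    return [build_command(p) for p in iter_source_files(root)]


# Hypothetical path, used only to show how the hint changes per file.
example = build_command(Path("drivers/example.c"))
print(example[-1])
```

Because the hint is the only part of the prompt that varies, each run starts from a different file, which is what keeps the runs from converging on the same finding.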

The NFS vulnerability

In his talk, Nicholas focused on a bug that Claude found in Linux’s Network File System (NFS) driver, which allows an attacker to read sensitive kernel memory over the network.

Nicholas chose this bug to show that Claude Code isn’t just finding obvious bugs or looking for common patterns. This bug required the AI model to understand intricate details of how the NFS protocol works.

The attack requires an attacker to use two cooperating NFS clients to attack a Linux NFS server:

     Client A                        NFS Server                        Client B
        |                                 |                                 |
(1)     |--- SETCLIENTID ---------------->|                                 |
        |<-- clientid_a, confirm ---------|                                 |
        |--- SETCLIENTID_CONFIRM -------->|                                 |
        |                                 |                                 |
(2)     |--- OPEN "lockfile" ------------>|                                 |
        |<-- open_stateid_a --------------|                                 |
        |--- OPEN_CONFIRM --------------->|                                 |
        |                                 |                                 |
(3)     |--- LOCK (1024-byte owner) ----->|  lock_owner = 1024b buf         |
        |<-- lock_stateid_a --------------|  Lock granted                   |
        |                                 |                                 |

(1) - Client A does a three-way handshake with the NFS server to begin NFS operations.

(2) - Client A opens a file named “lockfile”. The server accepts, and the client confirms the open.

(3) - Client A acquires the lock and declares a 1024-byte owner ID, which is an unusually long but legal value for the owner ID. The server grants the lock acquisition.

The attacker then spins up a second NFS client, Client B, to talk to the server:

     Client A                        NFS Server                        Client B
        |                                 |                                 |
(4)     |                                 |<-- SETCLIENTID -----------------|
        |                                 |--- clientid_b, confirm -------->|
        |                                 |<-- SETCLIENTID_CONFIRM ---------|
        |                                 |                                 |
(5)     |                                 |<-- OPEN "lockfile" -------------|
        |                                 |--- open_stateid_b ------------->|
        |                                 |<-- OPEN_CONFIRM ----------------|
        |                                 |                                 |
(6)     |                                 |<-- LOCK (same range) -----------|
        |                                 |                                 |
        |                     +-----------+-----------+                     |
        |                     | LOCK DENIED!          |                     |
        |                     | Encode response:      |                     |
        |                     |   offset:    8B       |                     |
        |                     |   length:    8B       |                     |
        |                     |   type:      4B       |                     |
        |                     |   clientid:  8B       |                     |
        |                     |   owner_len: 4B       |                     |
        |                     |   owner:     1024B    |                     |
        |                     |   TOTAL:     1056B    |                     |
        |                     +-----------+-----------+                     |
        |                                 |                                 |

(4) Client B does a three-way handshake with the NFS server to begin NFS operations, same as (1) above.

(5) Client B requests access to the same lock file as Client A from (2). The NFS server accepts, and the client acknowledges the acceptance.

(6) Client B tries to acquire the lock, but the NFS server denies the request because client A already holds the lock.
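Steps (3) and (6) can be modeled with a toy lock table. This is a simplified sketch, not the kernel's NFS server code: the class and field names are made up for illustration, and it tracks a single whole-file lock instead of byte ranges. What it shows is that the denial sent to Client B describes the conflicting lock, so it carries the owner bytes that Client A chose.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Lock:
    clientid: int
    owner: bytes  # opaque lock-owner ID chosen by the client


@dataclass
class LockResult:
    granted: bool
    holder: Optional[Lock] = None  # set only when the request is denied


class ToyLockServer:
    """Hypothetical model: at most one whole-file lock per file name."""

    def __init__(self) -> None:
        self._locks: Dict[str, Lock] = {}

    def lock(self, name: str, clientid: int, owner: bytes) -> LockResult:
        held = self._locks.get(name)
        if held is not None and held.clientid != clientid:
            # Denied: the reply describes the conflicting lock, including
            # the owner bytes supplied by the other client.
            return LockResult(granted=False, holder=held)
        self._locks[name] = Lock(clientid, owner)
        return LockResult(granted=True)


server = ToyLockServer()
# Step (3): Client A takes the lock with a 1024-byte owner ID.
a = server.lock("lockfile", clientid=1, owner=b"A" * 1024)
# Step (6): Client B asks for the same lock and is denied.
b = server.lock("lockfile", clientid=2, owner=b"B")
print(a.granted, b.granted, len(b.holder.owner))  # prints: True False 1024
```

The size of the denial is therefore set by Client A's earlier request, not by anything Client B sends.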

The problem is that at step (6), when the NFS server tries to generate a response to client B denying the lock request, it uses a memory buffer that’s only 112 bytes. The denial message includes the owner ID, which can be up to 1024 bytes, bringing the total size of the message to 1056 bytes. The kernel writes 1056 bytes into a 112-byte buffer, meaning that the attacker can overwrite kernel memory with bytes they control in the owner ID field from step (3).
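The size arithmetic above can be checked with a short sketch. It packs the fields from the diagram in big-endian order and compares the result against the 112-byte replay buffer. The function names, the placeholder field values, and the 4-byte padding of the owner (the usual XDR rule for opaque data) are assumptions for illustration; this is not the kernel's encoder.

```python
import struct

REPLAY_BUFFER_SIZE = 112  # NFSD4_REPLAY_ISIZE, per the 2003 changeset
MAX_OWNER_LEN = 1024      # owner ID length used in the walkthrough


def encode_lock_denied(offset: int, length: int, lock_type: int,
                       clientid: int, owner: bytes) -> bytes:
    """Pack offset(8) + length(8) + type(4) + clientid(8) + owner_len(4), then owner."""
    header = struct.pack(">QQIQI", offset, length, lock_type, clientid, len(owner))
    padding = b"\x00" * (-len(owner) % 4)  # assumed XDR padding to 4 bytes
    return header + owner + padding


def fits_replay_buffer(encoded: bytes) -> bool:
    """The bounds check that the vulnerable path effectively lacked."""
    return len(encoded) <= REPLAY_BUFFER_SIZE


# Placeholder values; only the owner length matters for the size.
reply = encode_lock_denied(0, 2**64 - 1, 1, 0x1234, b"A" * MAX_OWNER_LEN)
overflow = len(reply) - REPLAY_BUFFER_SIZE
print(len(reply), overflow, fits_replay_buffer(reply))  # prints: 1056 944 False
```

Under this layout the fixed fields take 32 bytes, so an owner ID of up to 80 bytes fits in the buffer and anything longer spills past it, by 944 bytes in the 1024-byte case.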

Fun fact: Claude Code created the ASCII protocol diagrams above as part of its initial bug report.

Undiscovered for 23 years

This bug was introduced in the Linux kernel in 2003:

[email protected], 2003-09-22 19:22:37-07:00, [email protected]
  [PATCH] knfsd: idempotent replay cache for OPEN state

  This implements the idempotent replay cache need for NFSv4 OPEN state.
  each state owner (open owner or lock owner) is required to store the
  last sequence number mutating operation, and retransmit it when replayed
  sequence number is presented for the operation.

  I've implemented the cache as a static buffer of size 112 bytes
  (NFSD4_REPLAY_ISIZE) which is large enough to hold the OPEN, the largest
  of the sequence mutation operations.  This implements the cache for
  OPEN, OPEN_CONFIRM, OPEN_DOWNGRADE, and CLOSE.  LOCK and UNLOCK will be
  added when byte-range locking is done (soon!).

The bug is so old, I can’t even link directly to it because it predates git, which wasn’t released until 2005.

More bugs than he can even report

Nicholas has found hundreds more potential bugs in the Linux kernel, but the bottleneck to fixing them is the manual step of humans sorting through all of Claude’s findings:

I have so many bugs in the Linux kernel that I can’t report because I haven’t validated them yet… I’m not going to send [the Linux kernel maintainers] potential slop, but this means I now have several hundred crashes that they haven’t seen because I haven’t had time to check them.

—Nicholas Carlini, speaking at [un]prompted 2026

I searched the Linux kernel and found a total of five Linux vulnerabilities so far that Nicholas either fixed directly or reported to the Linux kernel maintainers, some as recently as last week.

There’s a big wave coming

What was striking about Nicholas’ talk was how rapidly large language models have improved at finding vulnerabilities. Nicholas found these bugs using Claude Opus 4.6, which Anthropic released less than two months ago. He tried to reproduce his results on older AI models, and discovered that Opus 4.1 (released eight months ago) and Sonnet 4.5 (released six months ago) could find only a small fraction of what Nicholas found using Opus 4.6.

I expect to see an enormous wave of security bugs uncovered in the coming months, as researchers and attackers alike realize how powerful these AI models are at discovering security vulnerabilities.