Copy-fail-destroyer: K8s remediation for CVE-2026-31431

Original link: https://github.com/NorskHelsenett/copy-fail-destroyer

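This Kubernetes DaemonSet agent proactively addresses CVE-2026-31431 ("Copy Fail"), a Linux kernel vulnerability that allows unprivileged users to write to the page cache. It runs on every node and, every 5 minutes, checks whether the kernel has been patched and probes for the vulnerability through the AF_ALG socket interface.

The agent offers three remediation modes: `unload` (the default, which unloads the vulnerable module), `blacklist` (which unloads the module and prevents it from being reloaded), and `disabled` (detection only). It exposes Prometheus metrics covering kernel patch status, whether the vulnerability is present, whether remediation succeeded, and whether the module is reachable.

Deployment is done with Kubernetes manifests or a Helm chart and requires a privileged security context. The provided PrometheusRule defines three alerts, one critical and two warning. A GitHub Actions workflow tests, builds, and publishes the container image and Helm chart whenever a version tag is pushed. The agent is designed to integrate easily with existing monitoring systems and to mitigate this kernel vulnerability.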

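Copy-fail-destroyer, a new Kubernetes remediation tool for CVE-2026-31431, is drawing criticism on Hacker News. Although the tool is meant to help mitigate the kernel vulnerability, commenters raised serious concerns about its effectiveness and safety.

One user pointed out that blacklisting the module is not enough, because the module can still be loaded by other means. Another commenter questioned the author's qualifications, arguing that relying on AI tools such as Copilot is no substitute for security expertise. Running a privileged workload from an unknown source was also considered risky.

A key point is that the tool is ineffective in environments such as Google Kubernetes Engine (GKE), where the vulnerable module may be compiled directly into the kernel. Overall, the discussion reflects skepticism about the tool's usefulness and its security implications.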

Original text

A Kubernetes DaemonSet agent that detects and remediates CVE-2026-31431 ("Copy Fail") — an algif_aead in-place logic flaw in the Linux kernel allowing unprivileged page-cache writes via the AF_ALG socket interface.

On each node the agent runs a loop every 5 minutes that:

  1. Checks the kernel version against all known patched stable branches.
  2. Probes the AF_ALG module by attempting to create and bind an AF_ALG socket to aead / authenc(hmac(sha256),cbc(aes)) — the exact algorithm the exploit targets. This is safe and non-destructive.
  3. Remediates based on the configured REMEDIATION_MODE (see below).
  4. Exposes Prometheus metrics so you can alert and track status across the fleet.

The remediation mode is set via the REMEDIATION_MODE environment variable (or remediationMode in the Helm chart):

Mode              Behaviour
unload (default)  Unloads the algif_aead kernel module via delete_module
blacklist         Unloads the module and writes a modprobe blacklist rule to prevent auto-reload
disabled          Detect and report only; no remediation is performed
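
As a rough illustration of how the three modes might be read from the environment, the sketch below parses REMEDIATION_MODE and falls back to unload when it is unset. The Mode type and parseMode function are hypothetical, and rejecting unknown values with an error is an assumption; the README does not say how the agent treats invalid input.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// Mode is a hypothetical type for the three documented remediation modes.
type Mode string

const (
	ModeUnload    Mode = "unload"
	ModeBlacklist Mode = "blacklist"
	ModeDisabled  Mode = "disabled"
)

// parseMode maps a raw value to a Mode. An empty value selects the
// documented default (unload); anything unrecognised is an error,
// which is an assumption made for this sketch.
func parseMode(raw string) (Mode, error) {
	switch m := Mode(strings.ToLower(strings.TrimSpace(raw))); m {
	case "":
		return ModeUnload, nil
	case ModeUnload, ModeBlacklist, ModeDisabled:
		return m, nil
	default:
		return "", fmt.Errorf("unknown REMEDIATION_MODE %q", raw)
	}
}

func main() {
	mode, err := parseMode(os.Getenv("REMEDIATION_MODE"))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("remediation mode:", mode)
}
```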

All metrics are exposed on :9100/metrics.

Metric                                Description
cve_2026_31431_kernel_needs_patching  1 if the kernel version is not patched for CVE-2026-31431
cve_2026_31431_vulnerable             1 if the kernel is vulnerable to CVE-2026-31431 and the module is reachable
cve_2026_31431_module_reachable       1 if the AF_ALG aead algorithm can be bound
cve_2026_31431_remediation_applied    1 if the algif_aead module was successfully unloaded
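
To make the relationship between the four gauges concrete, here is a small sketch that renders them in the Prometheus text exposition format using only the standard library. The status struct, gauge, render, and metricsHandler names are hypothetical, and deriving the vulnerable gauge as "needs patching and module reachable" is an assumption based on the descriptions in the table. The project itself may well use a Prometheus client library instead.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// status holds the results of one check cycle (hypothetical type).
type status struct {
	KernelNeedsPatching bool
	ModuleReachable     bool
	RemediationApplied  bool
}

// vulnerable is assumed to mean: unpatched kernel AND reachable module.
func (s status) vulnerable() bool {
	return s.KernelNeedsPatching && s.ModuleReachable
}

// gauge renders one 0/1 gauge in the text exposition format.
func gauge(name string, on bool) string {
	v := 0
	if on {
		v = 1
	}
	return fmt.Sprintf("# TYPE %s gauge\n%s %d\n", name, name, v)
}

// render produces the body served at /metrics.
func render(s status) string {
	var b strings.Builder
	b.WriteString(gauge("cve_2026_31431_kernel_needs_patching", s.KernelNeedsPatching))
	b.WriteString(gauge("cve_2026_31431_vulnerable", s.vulnerable()))
	b.WriteString(gauge("cve_2026_31431_module_reachable", s.ModuleReachable))
	b.WriteString(gauge("cve_2026_31431_remediation_applied", s.RemediationApplied))
	return b.String()
}

// metricsHandler wraps render in an HTTP handler.
func metricsHandler(s status) http.HandlerFunc {
	return func(w http.ResponseWriter, _ *http.Request) {
		w.Header().Set("Content-Type", "text/plain; version=0.0.4")
		fmt.Fprint(w, render(s))
	}
}

func main() {
	s := status{KernelNeedsPatching: true, ModuleReachable: true}
	_ = metricsHandler(s) // would be registered on an HTTP mux at /metrics
	fmt.Print(render(s))
}
```

Serving this on the documented port would amount to registering metricsHandler at /metrics and calling http.ListenAndServe(":9100", nil); main only prints the rendered output so that the example terminates.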

CVE-2026-31431 (Copy Fail)

Kernel versions treated as patched:

  • 7.0+ (mainline)
  • 6.19.12+, 6.18.22+
  • Kernels before 4.14 are not affected (bug introduced in 4.14)
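
The version rules above can be expressed as a short comparison function. This is a sketch and not the project's cve202631431.go: the version type, parseRelease, needsPatching, and the fixedIn table are hypothetical names, and only the branches listed above are encoded. Treating every other affected branch as unpatched is an assumption of the sketch; the real detector may know about more stable branches.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

type version struct{ major, minor, patch int }

// fixedIn maps a stable branch (major, minor) to its first patched release.
// Only the branches named in the list above are included.
var fixedIn = map[[2]int]int{
	{6, 19}: 12,
	{6, 18}: 22,
}

// parseRelease extracts major.minor.patch from a string such as
// "6.18.22-generic", ignoring any distribution suffix.
func parseRelease(release string) (version, error) {
	core := release
	cut := strings.IndexFunc(core, func(r rune) bool {
		return r != '.' && (r < '0' || r > '9')
	})
	if cut >= 0 {
		core = core[:cut]
	}
	parts := strings.Split(core, ".")
	if len(parts) < 2 {
		return version{}, fmt.Errorf("unrecognised kernel release %q", release)
	}
	var n [3]int
	for i := 0; i < len(parts) && i < 3; i++ {
		v, err := strconv.Atoi(parts[i])
		if err != nil {
			return version{}, fmt.Errorf("unrecognised kernel release %q: %w", release, err)
		}
		n[i] = v
	}
	return version{n[0], n[1], n[2]}, nil
}

// needsPatching applies the rules from the list above.
func needsPatching(v version) bool {
	if v.major < 4 || (v.major == 4 && v.minor < 14) {
		return false // bug introduced in 4.14
	}
	if v.major >= 7 {
		return false // 7.0+ (mainline) is patched
	}
	if first, ok := fixedIn[[2]int{v.major, v.minor}]; ok {
		return v.patch < first
	}
	return true // assumption: unlisted affected branches are unpatched
}

func main() {
	for _, r := range []string{"6.18.21", "6.18.22-generic", "7.0.1", "4.9.0"} {
		v, err := parseRelease(r)
		if err != nil {
			fmt.Println(err)
			continue
		}
		fmt.Printf("%s needs patching: %v\n", r, needsPatching(v))
	}
}
```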

cmd/destroyer/main.go          # Entry point: metrics server, check loop, remediation
pkg/detector/
  cve202631431.go              # CVE-2026-31431 (Copy Fail) detection
  probe_linux.go               # AF_ALG module probe (Linux)
  probe_other.go               # Probe stub (non-Linux)
  remediate_linux.go           # Module unload via delete_module (Linux)
  remediate_other.go           # Remediation stub (non-Linux)
deploy/namespace.yaml          # Namespace with Pod Security Admission policy
deploy/daemonset.yaml          # Kubernetes DaemonSet manifest
Dockerfile                     # Multi-stage build (scratch final image)

# Native
go build ./cmd/destroyer

# Linux cross-compile (for container image)
CGO_ENABLED=0 GOOS=linux go build -o destroyer ./cmd/destroyer

docker build -t copy-fail-destroyer .

The agent requires a privileged security context to unload kernel modules and probe AF_ALG sockets. The root filesystem is read-only.

# Raw manifests
kubectl apply -f deploy/namespace.yaml
kubectl apply -f deploy/daemonset.yaml

# Or via Helm
helm install copy-fail-destroyer oci://ghcr.io/norskhelsenett/helm/copy-fail-destroyer \
  --namespace copy-fail-destroyer --create-namespace

Override the remediation mode:

helm install copy-fail-destroyer oci://ghcr.io/norskhelsenett/helm/copy-fail-destroyer \
  --namespace copy-fail-destroyer --create-namespace \
  --set remediationMode=disabled

An Argo CD Application manifest is provided at deploy/argocd-application.yaml. Edit targetRevision to pin a chart version:

kubectl apply -f deploy/argocd-application.yaml

The DaemonSet includes Prometheus scrape annotations (prometheus.io/scrape: "true", port 9100).

If you use the Prometheus Operator, deploy the PodMonitor to have metrics scraped automatically:

# Raw manifest
kubectl apply -f deploy/podmonitor.yaml

# Or via Helm
helm install copy-fail-destroyer oci://ghcr.io/norskhelsenett/helm/copy-fail-destroyer \
  --namespace copy-fail-destroyer --create-namespace \
  --set metrics.podMonitor.enabled=true

Alert rules (PrometheusRule) for Alertmanager are also available:

# Raw manifest
kubectl apply -f deploy/prometheusrule.yaml

# Or via Helm with extra alert labels
helm install copy-fail-destroyer oci://ghcr.io/norskhelsenett/helm/copy-fail-destroyer \
  --namespace copy-fail-destroyer --create-namespace \
  --set metrics.prometheusRule.enabled=true \
  --set metrics.prometheusRule.extraAlertLabels.team=platform

Three alerts are defined:

Alert                        Severity  Description
CopyFailVulnerable           critical  Kernel is vulnerable and AF_ALG module is reachable
CopyFailKernelNeedsPatching  warning   Kernel version is unpatched (module may be mitigated)
CopyFailRemediationFailed    warning   Module still reachable after remediation attempt

A GitHub Actions workflow (.github/workflows/build.yaml) triggers on versioned tags (v*). It:

  1. Runs go test ./...
  2. Builds the Linux binary
  3. Builds and pushes a container image to ghcr.io/norskhelsenett/copy-fail-destroyer
  4. Packages and pushes the Helm chart to oci://ghcr.io/norskhelsenett/helm/copy-fail-destroyer

Image tags are derived from the Git tag. For example, pushing v1.2.3 produces image tags 1.2.3 and 1.2.

git tag v1.0.0
git push origin v1.0.0
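
The tag derivation described above (v1.2.3 producing 1.2.3 and 1.2) can be illustrated with a few lines of Go. The real workflow runs in GitHub Actions and most likely relies on an existing action for this step, so the imageTags function below is purely hypothetical. It also assumes tags always have the form vMAJOR.MINOR.PATCH, which is stricter than the v* trigger pattern.

```go
package main

import (
	"fmt"
	"strings"
)

// imageTags derives the full and major.minor image tags from a Git tag.
// It assumes a tag of the form vMAJOR.MINOR.PATCH and rejects anything else.
func imageTags(gitTag string) ([]string, error) {
	v := strings.TrimPrefix(gitTag, "v")
	parts := strings.Split(v, ".")
	if v == gitTag || len(parts) != 3 {
		return nil, fmt.Errorf("expected a tag like v1.2.3, got %q", gitTag)
	}
	return []string{v, parts[0] + "." + parts[1]}, nil
}

func main() {
	tags, err := imageTags("v1.2.3")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(tags)
}
```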