Show HN: PII-Shield – Log Sanitization Sidecar with JSON Integrity (Go, Entropy)

Original link: https://github.com/aragossa/pii-shield

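## PII-Shield: Zero-Code Log Sanitization for Kubernetes

PII-Shield is a high-performance, zero-code Kubernetes sidecar container designed to prevent data leaks by masking personally identifiable information (PII) in logs *before* they leave the application. It is written in Go for low-latency processing, and is presented as faster and lighter on resources than traditional log filtering such as regex rules in Fluentd/Logstash.

It uses context-aware entropy analysis to detect secrets, even when no known key name accompanies them, and replaces them with deterministic hashes (e.g., `[HIDDEN:a1b2c]`) so that QA can correlate events without exposing sensitive data.

PII-Shield requires no code changes, works with any language, and deploys as a sidecar by using an initContainer to copy the binary. The project reports high accuracy across its test scenarios, including multilingual logs and complex data structures. Configuration is managed through environment variables, including the HMAC salt required in production.

Find it on Docker Hub: `docker pull thelisdeep/pii-shield:latest`.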
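To make the deterministic placeholder idea concrete, here is a minimal Go sketch of salted HMAC redaction. It is not taken from the PII-Shield source: the function name `redact`, the choice of HMAC-SHA256, the 6-character truncation, and the sample salt are all assumptions chosen to resemble the `[HIDDEN:...]` format shown above.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// redact returns a deterministic placeholder for secret.
// Assumptions (not confirmed by the project): HMAC-SHA256 as the hash,
// a 6-hex-character truncation, and the "[HIDDEN:...]" layout.
func redact(salt []byte, secret string) string {
	mac := hmac.New(sha256.New, salt)
	mac.Write([]byte(secret))
	digest := hex.EncodeToString(mac.Sum(nil))
	return "[HIDDEN:" + digest[:6] + "]"
}

func main() {
	salt := []byte("example-salt") // hypothetical value; keep the real salt private

	a := redact(salt, "MySecretPass123!")
	b := redact(salt, "MySecretPass123!")

	fmt.Println(a)
	fmt.Println(a == b) // true: the same salt and input always give the same placeholder
}
```

Keying the hash with a private salt is what keeps short or guessable secrets from being recovered by hashing candidate values, which is presumably why the salt is described as required in production.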

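## PII-Shield: Kubernetes Log Sanitization

PII-Shield is an open-source (Apache 2.0) tool that automatically detects and redacts sensitive information in application logs. It runs as a Kubernetes sidecar or as a CLI tool and uses Shannon entropy analysis to identify likely secrets, such as API keys ("sk-live-…"), even without predefined patterns. It then redacts them deterministically using HMAC.

A key feature is its ability to parse and rebuild JSON logs, so that the output stays valid for SIEM tools such as ELK or Datadog. Deterministic redaction means the same secret *always* hashes to the same placeholder, which helps debugging without exposing the raw data.

For now, PII-Shield focuses on high-entropy secrets rather than PII such as names. The developer welcomes feedback, particularly on the entropy threshold logic, which is currently tuned to avoid censoring UUIDs while still catching API keys.

[https://github.com/aragossa/pii-shield](https://github.com/aragossa/pii-shield)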
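One way to keep JSON valid while redacting is to decode the log line, rewrite string values in the decoded tree, and re-encode it. The Go sketch below illustrates that approach under stated assumptions: `looksSecret` is a hypothetical stand-in detector (a simple prefix check, not entropy analysis), the `[HIDDEN]` placeholder is simplified, and none of this is PII-Shield's actual code.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"strings"
)

// looksSecret is a hypothetical stand-in detector for this sketch only.
func looksSecret(s string) bool {
	return strings.HasPrefix(s, "sk-live-")
}

// scrub walks a decoded JSON value and replaces flagged strings.
func scrub(v any) any {
	switch t := v.(type) {
	case map[string]any:
		for k, val := range t {
			t[k] = scrub(val)
		}
		return t
	case []any:
		for i, val := range t {
			t[i] = scrub(val)
		}
		return t
	case string:
		if looksSecret(t) {
			return "[HIDDEN]"
		}
		return t
	default:
		return v // numbers, booleans, and null pass through unchanged
	}
}

// sanitizeJSON parses one JSON log line, scrubs it, and re-encodes it.
func sanitizeJSON(line string) (string, error) {
	dec := json.NewDecoder(strings.NewReader(line))
	dec.UseNumber() // keep numeric literals as written instead of float64
	var v any
	if err := dec.Decode(&v); err != nil {
		return "", err
	}
	var buf bytes.Buffer
	enc := json.NewEncoder(&buf)
	enc.SetEscapeHTML(false)
	if err := enc.Encode(scrub(v)); err != nil {
		return "", err
	}
	return strings.TrimSpace(buf.String()), nil
}

func main() {
	// The token value is a made-up sample.
	out, err := sanitizeJSON(`{"level":"error","ctx":{"token":"sk-live-abc123","retries":3}}`)
	if err != nil {
		fmt.Println("not JSON:", err)
		return
	}
	fmt.Println(out) // {"ctx":{"retries":3,"token":"[HIDDEN]"},"level":"error"}
}
```

Because `encoding/json` sorts map keys when encoding, this sketch does not preserve the original key order; a tool that needs byte-for-byte layout would have to rewrite tokens in place instead.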

Original text

Zero-code log sanitization sidecar for Kubernetes. Prevents data leaks (GDPR/SOC2) by redacting PII from logs before they leave the pod.


Developers often forget to mask sensitive data. Traditional regex filters in Fluentd/Logstash are slow, hard to maintain, and consume expensive CPU on log aggregators.

PII-Shield sits right next to your app container:

  • High Performance: Written in Go, designed for low-latency log processing.
  • Context-Aware Entropy Analysis: Detects high-entropy secrets even without keys (e.g. Error: ... 44saCk9...) by analyzing context keywords.
  • 100% Accuracy: Reported on the project's "Wild" stress tests, which include binary garbage, nested JSON, and multilingual logs.
  • Deterministic Hashing: Replaces each secret with a deterministic hash (e.g., [HIDDEN:a1b2c]), so QA can correlate errors without seeing the raw data.
  • Drop-in: No code changes required. Works with any language (Node, Python, Java, Go).
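As a rough illustration of entropy-based detection with a context hint, the Go sketch below computes Shannon entropy in bits per character and lowers the decision threshold when the token follows a suspicious keyword. The function names, the minimum length of 8, and the thresholds 4.0 and 3.0 are illustrative guesses, not PII-Shield's tuned values or its actual algorithm.

```go
package main

import (
	"fmt"
	"math"
)

// shannonEntropy returns the entropy of s in bits per character.
func shannonEntropy(s string) float64 {
	counts := make(map[rune]int)
	n := 0
	for _, r := range s {
		counts[r]++
		n++
	}
	h := 0.0
	for _, c := range counts {
		p := float64(c) / float64(n)
		h -= p * math.Log2(p)
	}
	return h
}

// isSuspicious reports whether token should be treated as a secret.
// afterKeyword is true when the token follows a context keyword such as
// "password" or "token"; in that case a lower threshold applies.
// All numeric values here are assumptions made for this sketch.
func isSuspicious(token string, afterKeyword bool) bool {
	threshold := 4.0
	if afterKeyword {
		threshold = 3.0
	}
	return len(token) >= 8 && shannonEntropy(token) > threshold
}

func main() {
	fmt.Println(shannonEntropy("aaaa")) // 0: a single repeated symbol carries no information
	fmt.Println(shannonEntropy("abcd")) // 2: four equally likely symbols

	hex := "0123456789abcdef" // 16 distinct symbols, exactly 4 bits per character
	fmt.Println(isSuspicious(hex, false)) // false: does not exceed the default threshold
	fmt.Println(isSuspicious(hex, true))  // true: the context keyword lowers the bar
}
```

A pure-hex string can never exceed 4 bits per character, so a default threshold at or above that level leaves hex identifiers alone unless the surrounding context suggests a secret.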

Get the latest lightweight image from Docker Hub:

docker pull thelisdeep/pii-shield:latest

See CONFIGURATION.md for a full list of environment variables, including:

  • PII_SALT: Custom HMAC salt (Required for production).
  • PII_ADAPTIVE_THRESHOLD: Enable dynamic entropy baselines.
  • PII_DISABLE_BIGRAM_CHECK: Optimize for non-English logs.
  1. Test Locally (CLI): You can pipe any log output through PII-Shield to see it in action immediately:
# Emulate a log with a sensitive password
echo "Error: User password=MySecretPass123! failed login" | docker run -i --rm thelisdeep/pii-shield:latest

# Output: Error: User password=[HIDDEN:8f3a11] failed login
  2. Kubernetes (Sidecar Pattern): To use PII-Shield as a pipe wrapper for your application, use an initContainer to copy the binary into a shared volume.
apiVersion: v1
kind: Pod
metadata:
  name: secure-app
spec:
  volumes:
  - name: bin-dir
    emptyDir: {}
  
  # 1. Copy the PII-Shield binary to a shared volume
  initContainers:
  - name: install-shield
    image: thelisdeep/pii-shield:latest
    command: ["cp", "/bin/pii-shield", "/opt/bin/pii-shield"]
    volumeMounts:
    - name: bin-dir
      mountPath: /opt/bin

  # 2. Run your app and pipe logs through PII-Shield
  containers:
  - name: my-app
    image: my-app:1.0
    command: ["/bin/sh", "-c"]
    # Pipe stderr/stdout through the sanitizer
    args: ["./start-app.sh 2>&1 | /opt/bin/pii-shield"] 
    volumeMounts:
    - name: bin-dir
      mountPath: /opt/bin
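The pipe-wrapper pattern in the manifest above relies on a program that reads log lines from stdin and writes sanitized lines to stdout. A minimal Go sketch of such a filter follows; the regex-based `sanitizeLine` is a hypothetical placeholder detector used only to keep the example short, and the 1 MiB line limit is an arbitrary choice. It does not reflect how PII-Shield itself is implemented.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"os"
	"regexp"
)

// passwordRe is a placeholder detector for this sketch only.
var passwordRe = regexp.MustCompile(`(password=)\S+`)

// sanitizeLine masks the value that follows "password=".
func sanitizeLine(line string) string {
	return passwordRe.ReplaceAllString(line, "${1}[HIDDEN]")
}

// filter copies r to w line by line, sanitizing each line on the way.
func filter(r io.Reader, w io.Writer) error {
	sc := bufio.NewScanner(r)
	// Raise the default 64 KiB token limit so long log lines are not rejected.
	sc.Buffer(make([]byte, 0, 64*1024), 1024*1024)
	for sc.Scan() {
		if _, err := fmt.Fprintln(w, sanitizeLine(sc.Text())); err != nil {
			return err
		}
	}
	return sc.Err()
}

func main() {
	if err := filter(os.Stdin, os.Stdout); err != nil {
		fmt.Fprintln(os.Stderr, "filter error:", err)
		os.Exit(1)
	}
}
```

Built as a static binary, a filter like this can be copied by an initContainer and placed at the end of a shell pipeline in the same way the manifest uses `/opt/bin/pii-shield`.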

This project is verified with a comprehensive suite:

  1. Unit Tests: Cover edge cases, multilingual support, and JSON integrity.
  2. Fuzzing: Native Go fuzzing ensures crash safety against invalid inputs.
  3. Stress Testing: ./full_stress_test.sh validates 100% detection accuracy on mixed workloads.

Distributed under the Apache 2.0 License. See LICENSE for more information.
