Show HN: Local personal data redaction for any AI tools

Original link: https://github.com/sophia486/pii-gui

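PII GUI is a local-first, open-source desktop application built on Tauri 2 (React/Rust) that detects and redacts sensitive information in PDF, Markdown, and plain-text files. Processing happens on the device, so document content never leaves your machine.

**Key features:**

* **Flexible detection:** Identifies personally identifiable information (PII) such as emails, addresses, and account numbers with customizable regex rules or local ONNX models (optional, one-time download).
* **Workflow:** Supports a review-before-redact flow in which you toggle and confirm each match in the graphical interface.
* **Safe export:** For PDFs, "true redaction" burns opaque rectangles into the rendered pages, so the underlying text cannot be recovered from the output file.
* **Long documents and history:** Page-aware chunking for long documents, task history persisted in a local SQLite database, and a localized UI (English, Korean, Japanese).
* **Privacy-focused:** Detection runs on-device, and all file writes are confined to the Tauri app data directory.

PII GUI runs on macOS, Windows, and Linux and is licensed under AGPL v3.0. Community contributions that improve detection quality are welcome.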

Original text

Find and redact personal information in documents — entirely on your device.

Load a PDF, markdown, or text file, detect PII with built-in rules or local ONNX models, review every match, and export a safely redacted copy. No document content ever leaves your machine.

PII GUI — Local-first PII redaction

PII GUI is a Tauri 2 desktop app (React 19 + TypeScript frontend, Rust backend) for local-first PII detection and redaction. Detection runs on-device with regex rules or quantized ONNX models; the only network access is the optional one-time model download.

PII GUI supports two local workflows.

Features:

  • Local inference only — PII detection runs entirely on-device. The only network access is the optional one-time model download from Hugging Face.
  • PDF, Markdown, and plain-text input — PDFs are parsed with pdf.js, preserving per-character positions so detections are highlighted directly on the rendered page.
  • Custom rules — add your own regex or exact-match filters on top of any backend.
  • Review before redacting — toggle individual matches on or off in the workbench before export.
  • True PDF redaction — exported PDFs burn opaque rectangles into the rendered pages with pdf-lib, so redacted text is not recoverable from the output file.
  • Task history and persistence — tabs, custom rules, and filter results survive restarts via a local SQLite database and on-disk result files.
  • Long-document support — input is split into token-bounded, page-aware chunks and processed through a task queue.
  • Localized UI — English, Korean, and Japanese.
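The review-before-redacting step in the list above can be sketched as follows. This is a minimal illustration, not the app's actual code: the `ReviewedMatch` shape, the `applyRedactions` helper, and the `[label]` placeholder format are all hypothetical names chosen for this example.

```typescript
// Hypothetical shape of a match after the user has reviewed it.
interface ReviewedMatch {
  start: number;    // inclusive character offset
  end: number;      // exclusive character offset
  label: string;    // e.g. "private_email"
  enabled: boolean; // false when the user toggled the match off
}

// Replace every enabled match with a "[label]" placeholder (assumed format).
function applyRedactions(text: string, matches: ReviewedMatch[]): string {
  const active = matches
    .filter((m) => m.enabled)
    .sort((a, b) => a.start - b.start);
  let out = "";
  let cursor = 0;
  for (const m of active) {
    if (m.start < cursor) {
      // Overlaps the previous match: extend the covered range, emit nothing new.
      cursor = Math.max(cursor, m.end);
      continue;
    }
    out += text.slice(cursor, m.start) + "[" + m.label + "]";
    cursor = m.end;
  }
  return out + text.slice(cursor);
}

const sample = "Call 555-0100 or mail a@b.co";
const reviewed: ReviewedMatch[] = [
  { start: 5, end: 13, label: "private_phone", enabled: true },
  { start: 22, end: 28, label: "private_email", enabled: false }, // toggled off
];
const redacted = applyRedactions(sample, reviewed);
console.log(redacted); // "Call [private_phone] or mail a@b.co"
```

Overlapping matches are folded into the earlier one so that no tail of a flagged span is left visible; a real implementation may choose a different merge policy.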
| Backend | Best for |
| --- | --- |
| Regex (built-in) | Instant baseline detection of emails, phones, URLs, dates, account numbers, and secrets |
| OpenAI Privacy Filter | Long English documents and broad privacy-taxonomy detection |
| BardsAI EU PII | European-language text where names, addresses, and ID-like entities matter |

Matches are labeled with a fixed privacy taxonomy:

account_number · private_address · private_email · private_person · private_phone · private_url · private_date · secret
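To make the taxonomy concrete, here is a rough sketch of how regex and exact-match rules could emit labeled matches. The label names come from the list above; everything else (`Rule`, `LabeledMatch`, `detect`, and the email pattern) is an assumption made for illustration and is not taken from the app's Rust detection engine.

```typescript
// Labels copied from the taxonomy above.
type PiiLabel =
  | "account_number" | "private_address" | "private_email" | "private_person"
  | "private_phone" | "private_url" | "private_date" | "secret";

// Hypothetical shapes; the app's real types may differ.
interface Rule { label: PiiLabel; pattern: RegExp | string; }
interface LabeledMatch { label: PiiLabel; start: number; end: number; text: string; }

// Escape a literal string so it can be used as an exact-match pattern.
function escapeRegExp(literal: string): string {
  return literal.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function detect(text: string, rules: Rule[]): LabeledMatch[] {
  const matches: LabeledMatch[] = [];
  for (const rule of rules) {
    let re: RegExp;
    if (typeof rule.pattern === "string") {
      re = new RegExp(escapeRegExp(rule.pattern), "g");
    } else {
      // Force the global flag so exec() walks through every occurrence.
      const flags = rule.pattern.flags.indexOf("g") === -1
        ? rule.pattern.flags + "g"
        : rule.pattern.flags;
      re = new RegExp(rule.pattern.source, flags);
    }
    let m: RegExpExecArray | null;
    while ((m = re.exec(text)) !== null) {
      if (m[0].length === 0) { re.lastIndex++; continue; } // avoid an infinite loop
      matches.push({ label: rule.label, start: m.index, end: m.index + m[0].length, text: m[0] });
    }
  }
  return matches.sort((a, b) => a.start - b.start);
}

const rules: Rule[] = [
  { label: "private_email", pattern: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/ },
  { label: "private_person", pattern: "Jane Doe" }, // exact-match custom rule
];
const found = detect("Contact Jane Doe at jane@example.com.", rules);
console.log(found.map((f) => f.label + ":" + f.text));
```

The email pattern is deliberately simple and will miss valid addresses; it stands in for whatever the built-in rules actually use.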

Prerequisites for building from source:

  • Node.js 24+
  • pnpm
  • Rust and Cargo
  • Tauri v2 platform prerequisites for your OS

Download the latest installer for macOS, Windows, or Linux from the Releases page.

On first launch, the onboarding flow lets you pick a default backend. Regex works immediately; the ONNX models are optional downloads (fetched from Hugging Face into the app data directory, and removable at any time from Settings).

Install from source:

Local release signing values are optional for development. If you need updater signing locally, copy the environment template and fill in your own key:

How a document flows through the app:

```text
Document (PDF / md / txt)
  → text extraction (pdf.js, per-character boxes for PDFs)
  → token-bounded, page-aware chunking
  → task queue → Rust `redact_text` command
  → regex / ONNX inference (ort + tokenizers)
  → matches + redacted text
  → review & toggle matches in the UI
  → export (burned-in PDF redaction or redacted text)
```
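The chunking step in the flow above could look roughly like this. It is a sketch under two stated assumptions: tokens are approximated by whitespace-delimited words (the real app presumably counts model tokens), and "page-aware" is taken to mean that a chunk never crosses a page boundary. `Chunk` and `chunkPages` are hypothetical names.

```typescript
interface Chunk {
  page: number;   // 1-based page number
  offset: number; // character offset of the chunk within its page
  text: string;
}

// Split each page into chunks of at most maxTokens whitespace-delimited tokens.
function chunkPages(pages: string[], maxTokens: number): Chunk[] {
  if (maxTokens < 1) throw new Error("maxTokens must be at least 1");
  const chunks: Chunk[] = [];
  pages.forEach((pageText, pageIndex) => {
    const tokenRe = /\S+/g;
    let count = 0;       // tokens in the chunk being built
    let chunkStart = 0;  // offset of the first token in that chunk
    let lastEnd = 0;     // offset just past the most recent token
    let m: RegExpExecArray | null;
    while ((m = tokenRe.exec(pageText)) !== null) {
      if (count === 0) chunkStart = m.index;
      count++;
      lastEnd = m.index + m[0].length;
      if (count === maxTokens) {
        chunks.push({ page: pageIndex + 1, offset: chunkStart, text: pageText.slice(chunkStart, lastEnd) });
        count = 0;
      }
    }
    if (count > 0) {
      // Flush the final, partially filled chunk of this page.
      chunks.push({ page: pageIndex + 1, offset: chunkStart, text: pageText.slice(chunkStart, lastEnd) });
    }
  });
  return chunks;
}

const chunks = chunkPages(["a b c d e", "f g"], 2);
console.log(chunks.map((c) => c.page + "@" + c.offset + ":" + c.text));
// [ "1@0:a b", "1@4:c d", "1@8:e", "2@0:f g" ]
```

Keeping `offset` on each chunk lets a match found at position `i` inside a chunk be mapped back to page position `offset + i`, which is what highlighting on the rendered page would need.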

The frontend (React) handles document parsing, chunking, review, and export. The Rust backend (src-tauri/) owns the detection engines, model lifecycle (download / verify / delete), and file I/O — all writes are confined to the Tauri app data directory.
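Because the frontend keeps per-character boxes for PDFs, a match expressed as a character range can be turned into rectangles to paint over. The sketch below shows one way to do that; `CharBox`, `Rect`, and `matchToRects` are hypothetical names, and it assumes characters on the same text line share exactly the same `y` and `height`, which real PDF text may not guarantee.

```typescript
interface CharBox { x: number; y: number; width: number; height: number; }
interface Rect { x: number; y: number; width: number; height: number; }

// Merge the boxes of characters in [start, end) into one rectangle per text line.
function matchToRects(boxes: CharBox[], start: number, end: number): Rect[] {
  const rects: Rect[] = [];
  for (const box of boxes.slice(start, end)) {
    const last = rects[rects.length - 1];
    // Assumption: same y and height means the character is on the same line.
    if (last !== undefined && last.y === box.y && last.height === box.height) {
      const right = Math.max(last.x + last.width, box.x + box.width);
      last.x = Math.min(last.x, box.x);
      last.width = right - last.x;
    } else {
      rects.push({ x: box.x, y: box.y, width: box.width, height: box.height });
    }
  }
  return rects;
}

const boxes: CharBox[] = [
  { x: 0, y: 100, width: 10, height: 12 },
  { x: 10, y: 100, width: 10, height: 12 },
  { x: 20, y: 100, width: 10, height: 12 },
  { x: 0, y: 80, width: 10, height: 12 }, // first character of the next line
];
const rects = matchToRects(boxes, 1, 4);
console.log(rects);
// [ { x: 10, y: 100, width: 20, height: 12 }, { x: 0, y: 80, width: 10, height: 12 } ]
```

An exporter would then draw each rectangle opaquely onto the rendered page; a production version would likely compare `y` values with a tolerance rather than strict equality.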

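Run the app in development mode:

```sh
cd tauri
pnpm install
pnpm tauri dev
```

Build a release bundle:

```sh
cd tauri
pnpm tauri build
```

Run the tests:

```sh
cd tauri
pnpm test:unit              # frontend unit tests (vitest)
pnpm build                  # typecheck + frontend build

cd src-tauri
cargo test                  # Rust backend tests
```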
Project layout:

```text
tauri/                         # the desktop app
  src/                         # React frontend
    App.tsx                    # orchestrator: tabs, routing, workbench
    components/                # PDF preview, shadcn/Radix UI primitives
    lib/
      pdf-document.ts          # pdf.js text + char-box extraction
      pii-text-chunks.ts       # token-bounded chunking
      pii-task-queue.ts        # detection task queue
      redaction-policy.ts      # match merge/select/restore logic
      pdf-redacted-export.ts   # burned-in PDF redaction export
      app-persistence.ts       # SQLite + result-file persistence
      i18n.ts                  # en / ko / ja UI copy
  src-tauri/                   # Rust backend
    src/lib.rs                 # Tauri commands: redact_text, model lifecycle, file I/O
    src/redact_engine.rs       # regex / ONNX / BardsAI detection backends
docs/assets/                   # README thumbnail and screenshot assets
.github/workflows/release.yml  # cross-platform release builds
```

Contributions are welcome!

  • Bug reports & feature requests — open an issue with steps to reproduce or a short description of the use case.
  • Pull requests — keep changes small and focused. Before submitting, run the checks for the area you touched:
    • Frontend: pnpm test:unit and pnpm build from tauri/
    • Rust backend: cargo test from tauri/src-tauri/
  • Detection quality — false positives/negatives are especially useful to report; include the backend (regex / Privacy Filter / BardsAI) and a minimal, PII-free sample that reproduces the issue.
  • Benchmarks — keep local benchmark scripts and outputs out of commits; /benchmarks/ is ignored.

PII GUI is licensed under the GNU Affero General Public License v3.0.

PII GUI builds on several open-source projects and model releases:

Run the smallest check that proves the change, then widen as needed:

```sh
cd tauri && pnpm test:unit
cd tauri && pnpm build
cd tauri/src-tauri && cargo check
git diff --check
```

For packaging, run pnpm tauri build on the target platform before making release claims.
