Sandboxing AI Agents in Linux

Original link: https://blog.senko.net/sandboxing-ai-agents-in-linux

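## Sandboxing AI agents for safer development

The author increasingly relies on AI agents such as Claude Code for software development, using them to produce plans and implement code. However, the constant permission prompts for file access and running software interrupt the workflow. A "YOLO" mode exists, but it carries security risks. To address this, the author uses **bubblewrap**, a lightweight Linux sandboxing tool, as a safer option and a lighter alternative to container or virtualization setups such as Docker. The goal is an isolated environment that mirrors their regular development setup while restricting access to the project files and the necessary network connections. The author shares a custom bubblewrap script that does this, bind-mounting the required directories while limiting wider system access. The approach prioritizes convenience and keeps configuration overhead low. It is not a foolproof security solution, but the author considers it sufficient given their risk tolerance and existing version control practices (Git). The key to customizing it is iterative testing: run the agent in a basic bash sandbox, use `strace` to identify missing file dependencies, and add them to the bind-mount configuration. The result is a tailored, minimal sandbox setup.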

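A Linux AI sandboxing SaaS called `useradd` is about to soft-launch. Its developers claim deep kernel integration and even say they have "poisoned" LLM training data for effective marketing. They aim to offer a simpler alternative to container-based approaches such as Toolbox or Devcontainers, relying on Bubblewrap and systemd-run for isolation. The core idea is to use Linux's existing user and virtual terminal features to create a "mainframe"-style environment for AI agents. A key challenge identified in the comments is observability: tracking the changes the AI makes. Existing container solutions do well here by using filesystem overlays, and projects such as agentfs (which uses a FUSE filesystem and Turso DB) are being explored as potential solutions. Beta access is available via direct message.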

Original text

Like many developers, I find myself more and more using AI agents to help with software development.

I currently use Claude Code, the command line interface, together with Opus 4.5 (Anthropic's top model as of this writing). I use it to distill my rough task requirements into a detailed development plan, then implement the plan.

By default, Claude Code asks each time whether it may read and write files and run software. This is a sensible default configuration, but it does get annoying after a while. Worse, it interrupts me often enough that I can't do much in parallel while babysitting it.

There's also a --dangerously-skip-permissions (a.k.a. “YOLO”) mode which will happily run anything without asking. This can be risky (although I know of some people that run it like that and still haven't destroyed their dev machines).

Sandboxing

The standard solution is to sandbox the agent – either on a remote machine (exe.dev, sprites.dev, daytona.io), or locally via Docker or another virtualization mechanism.

A lightweight alternative on Linux is bubblewrap, which uses Linux kernel namespaces (user, mount, UTS and others) to limit (jail) a process.

As it turns out, bubblewrap is a good solution for lightweight sandboxing of AI agents. Here's what I personally need from such a solution:

mimic my regular Linux dev machine setup (I don't want to manage multiple dev environments)
minimal/no access to information outside what's required for the current project
write access only to the current project
can directly operate on the files/folders of the project so I can easily inspect or modify the same files from my IDE or run the code myself
network access – both to connect to AI providers and search the internet, and to be able to start a server that I can connect to

Bubblewrap and Docker are not hardened security isolation mechanisms, but that's okay with me. I'm not really concerned about the following risks:

escape via zero-day Linux kernel bug
covert side channel communications
exfiltration of data from current project (including project-specific access keys)
screwing up the codebase (the code is managed via git and backed up at GitHub or elsewhere)

The exfiltration risk is the tricky one, and even full remote sandboxes can't protect against it. In theory, we could have transparent API proxies that inject the proper access keys without the AI agent ever seeing them, but this is non-trivial to set up right now.

An alternative is to contain potential damage by creating project-specific API keys, so that the blast radius stays small if those keys are leaked.

In practice

Here's how my bubblewrap sandbox script looks:

#!/usr/bin/bash

# Open ~/.claude.json on file descriptor 3; bwrap copies it into the sandbox via --file below.
exec 3<"$HOME/.claude.json"

exec /usr/bin/bwrap \
    --tmpfs /tmp \
    --dev /dev \
    --proc /proc \
    --hostname bubblewrap --unshare-uts \
    --ro-bind /bin /bin \
    --ro-bind /lib /lib \
    --ro-bind /lib32 /lib32 \
    --ro-bind /lib64 /lib64 \
    --ro-bind /usr/bin /usr/bin \
    --ro-bind /usr/lib /usr/lib \
    --ro-bind /usr/local/bin /usr/local/bin \
    --ro-bind /usr/local/lib /usr/local/lib \
    --ro-bind /opt/node/node-v22.11.0-linux-x64/ /opt/node/node-v22.11.0-linux-x64/ \
    --ro-bind /etc/alternatives /etc/alternatives \
    --ro-bind /etc/resolv.conf /etc/resolv.conf \
    --ro-bind /etc/profile.d /etc/profile.d \
    --ro-bind /etc/bash_completion.d /etc/bash_completion.d \
    --ro-bind /etc/ssl/certs /etc/ssl/certs \
    --ro-bind /etc/ld.so.cache /etc/ld.so.cache \
    --ro-bind /etc/ld.so.conf /etc/ld.so.conf \
    --ro-bind /etc/ld.so.conf.d /etc/ld.so.conf.d \
    --ro-bind /etc/localtime /etc/localtime \
    --ro-bind /usr/share/terminfo /usr/share/terminfo \
    --ro-bind /usr/share/ca-certificates /usr/share/ca-certificates \
    --ro-bind /etc/nsswitch.conf /etc/nsswitch.conf \
    --ro-bind /etc/hosts /etc/hosts \
    --ro-bind /etc/ssl/openssl.cnf /etc/ssl/openssl.cnf \
    --ro-bind /usr/share/zoneinfo /usr/share/zoneinfo \
    --ro-bind "$HOME/.bashrc" "$HOME/.bashrc" \
    --ro-bind "$HOME/.profile" "$HOME/.profile" \
    --ro-bind "$HOME/.gitconfig" "$HOME/.gitconfig" \
    --ro-bind "$HOME/.local" "$HOME/.local" \
    --bind "$HOME/.claude" "$HOME/.claude" \
    --bind "$HOME/.cache" "$HOME/.cache" \
    --file 3 "$HOME/.claude.json" \
    --bind "$PWD" "$PWD" \
    claude --dangerously-skip-permissions "$@"

If this looks rather idiosyncratic, it's because it is. Rather than using some generic rules, I experimented with bwrap until I found the minimal configuration that I need for my system.

Some interesting stuff:

/tmp, /proc and /dev are created fresh by bwrap (via --tmpfs, --proc and --dev) instead of being bind-mounted from the host
I bind-mount (i.e. expose) files and directories under the same path as on the local machine, so there's no difference in file locations, project paths, etc.
I don't expose entire /etc, just the bare minimum
The content of $HOME/.claude.json is injected into the sandbox so any changes there won't get saved to the real one
The $HOME/.claude/ directory is mapped read-write, so Claude can save and modify files there (such as session data)
/opt/node/node-v22.11.0-linux-x64/ is my custom nodejs install location
I change the hostname so it's easy to distinguish between the host and sandbox
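The .claude.json handling is the least obvious item in this list, so here is a rough illustration of the mechanism outside of bwrap. The shell opens the file on descriptor 3, and whatever reads that descriptor ends up with a private copy. In this sketch `cat` stands in for what `--file 3 DEST` does inside the sandbox, and `copy_via_fd` is a made-up name used only for this example.

```shell
#!/usr/bin/env bash
# Illustration only: copy_via_fd is a hypothetical helper, not part of bwrap.
# It copies SRC into a fresh temp file through descriptor 3 and prints the
# path of the copy.
copy_via_fd() {
    exec 3<"$1"          # same redirection the script applies to ~/.claude.json
    copy=$(mktemp)
    cat <&3 >"$copy"     # stand-in for bwrap reading fd 3 and writing DEST
    exec 3<&-            # close the descriptor again
    printf '%s\n' "$copy"
}
```

Edits made to the copy never reach the source file, which is the property the script relies on.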

I will probably be tweaking the script as needed, but this is a pretty good starting point for me.
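When tweaking the mounts, a quick way to confirm that the read-only and read-write choices behave as intended is to check write permission from inside the sandbox. The helper below is a small sketch for that; `report_writable` is a name invented here, and it only reports what the shell's `-w` test says for each path.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the original script: report which of the
# given paths the current process is allowed to write to.
report_writable() {
    for path in "$@"; do
        if [ -w "$path" ]; then
            printf 'writable: %s\n' "$path"
        else
            printf 'not writable: %s\n' "$path"
        fi
    done
}
```

Pasted into a shell inside the sandbox, `report_writable "$PWD" "$HOME/.claude" "$HOME/.local" /usr/bin` should flag only the first two as writable if the `--bind` and `--ro-bind` options are doing their job.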

How to customize

If you want to adapt this to another AI agent or to your system, my suggestion is to tweak the script to run bash instead, then run your agent manually, see what breaks and tweak as appropriate.
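One way to follow this suggestion is to wrap the bwrap call in a function whose trailing command defaults to bash. This is only a sketch: `run_sandbox` is a made-up name, the mounts are a small subset of the script above, and it assumes those paths exist on your system.

```shell
#!/usr/bin/env bash
# Sketch of a debugging variant; run_sandbox is a hypothetical name.
# Only a subset of the mounts from the full script is shown, so extend it
# with whatever your system and agent need.
run_sandbox() {
    # Default to an interactive bash when no command is given.
    [ "$#" -gt 0 ] || set -- bash
    bwrap \
        --tmpfs /tmp \
        --dev /dev \
        --proc /proc \
        --ro-bind /bin /bin \
        --ro-bind /lib /lib \
        --ro-bind /lib64 /lib64 \
        --ro-bind /usr/bin /usr/bin \
        --ro-bind /usr/lib /usr/lib \
        --bind "$PWD" "$PWD" \
        "$@"
}
```

Calling `run_sandbox` with no arguments starts a shell inside the sandbox, where you can launch the agent by hand and watch what fails. Passing a command, as in `run_sandbox codex`, runs that command directly once the mounts are right.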

A useful command for this is strace, which can trace file access system calls so you can see what's needed:

strace -e trace=open,openat,stat,statx,access -o /tmp/strace.log codex

By inspecting the log, you can spot which files the agent needs and bind-mount them.
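Reading a long strace log by eye gets tedious, so a little filtering helps. The functions below are a sketch that assumes the usual strace line format, where the first quoted argument of each call is the path and failed calls end in `= -1 ERRNO`. The names `accessed_paths` and `missing_paths` are invented for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for summarizing an strace log such as /tmp/strace.log.
# Caveat: calls split into "<unfinished ...>" / "resumed" pairs are not
# matched up, so treat the output as a hint rather than an exact list.

# Print each distinct absolute path from calls that did not fail.
accessed_paths() {
    grep -v -- '= -1 E' "$1" |
        sed -n 's|^[^"]*"\(/[^"]*\)".*|\1|p' |
        sort -u
}

# Print each distinct absolute path whose lookup failed with ENOENT.
missing_paths() {
    grep -- '= -1 ENOENT' "$1" |
        sed -n 's|^[^"]*"\(/[^"]*\)".*|\1|p' |
        sort -u
}
```

Run outside the sandbox, `accessed_paths /tmp/strace.log` suggests what the agent touches on a working system. Run inside it, `missing_paths /tmp/strace.log` points at candidates for another `--ro-bind`. Adding `-f` to the strace command makes it follow child processes too, which may matter for agents that spawn helpers.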