Running Out of Disk Space in Production

Original link: https://alt-romes.github.io/posts/2026-04-01-running-out-of-disk-space-on-launch.html

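## Kanjideck Server Launch and Troubleshooting

A server was launched on a small Hetzner machine (4GB RAM, 40GB disk, NixOS) to distribute the Kanjideck files, including one large 2.2GB file. As soon as availability was announced, the server was flooded with traffic and quickly ran out of disk space. The initial problem came from the Plausible Analytics database (8.5GB) and the Nix store (15GB) taking up most of the disk. Emergency measures included clearing logs and trying to clean up the Nix store, but there was too little free space to work with. In the end the Nix store was moved to a separate volume, which resolved the immediate crisis.

Users then reported that downloads of the large file were incomplete. This was traced to Nginx's buffering configuration; increasing `proxy_max_temp_file_size` allowed the file to be delivered in full. A later spike in disk usage showed that Nginx was holding on to 14.5GB of deleted temporary files. Disabling Nginx buffering (`proxy_buffering off`) and setting `proxy_max_temp_file_size 0` finally stabilized the system. The server went through roughly 2 hours of downtime and partial functionality during the initial launch, which underlines the importance of careful configuration and calm troubleshooting under pressure.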

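Hacker News discussion: Running Out of Disk Space in Production (alt-romes.github.io), 18 points, submitted by romes, 2 comments

flanfly: One trick is to always keep some ballast files on the system, a few GiB of zeros, that can be deleted in a situation like this. It does not solve the problem, but it buys you time and frees up space for lock files and the like, so that you get a usable system back.

jaapz (in reply to flanfly): Love the simplicity and practicality of this solution.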

Original article

Last night I put up a simple server which allowed customers to download the digital Kanjideck files. This server is hosted on a small Hetzner machine running NixOS, with 4GB of RAM and 40GB of disk space. One of these downloadable files weighs 2.2GB.

The matter at hand boils down to a simple Haskell program which serves static files (with some extra steps regarding authorization) plus an nginx reverse proxy which forwards requests for a certain “virtual host” to the Haskell program.

Fig 1. Simplified server architecture
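
For readers unfamiliar with how such a reverse proxy is declared on NixOS, here is a minimal sketch of the architecture described above. It is not taken from the article: the host name `downloads.example.com` and the backend address `127.0.0.1:8080` are placeholders I am assuming for illustration, and the real configuration may well look different.

```nix
{ ... }:
{
  services.nginx = {
    enable = true;
    # Hypothetical virtual host name; the article does not state the real one.
    virtualHosts."downloads.example.com" = {
      locations."/" = {
        # Assumption: the Haskell file server listens on localhost port 8080.
        proxyPass = "http://127.0.0.1:8080";
      };
    };
  };
}
```

With a setup like this, nginx terminates the client connection and relays each request to the Haskell program, which means nginx's own proxy settings apply to every large download.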

Not even minutes after I announced that the files were finally available, hundreds of customers visited my server all at once. As the logs started flying off of my screen with all the accesses, I started noticing a particularly interesting message, repeated over and over again:

Hetzner didn’t have an available cloud instance with more space for me to upgrade to.

Plan B: I could still buy more space as a separate Volume.

The /nix/store is an immutable store and I had heard of people setting up their nix stores on separate drives before. It was also the largest system component at 12GB now. A perfect candidate.

Luckily (rather, due to NixOS) everything went smoothly with this transition. Following the instructions on “Moving the store” in the NixOS Wiki worked flawlessly. The new Volume was formatted and labeled `nix` with `mkfs.ext4 -L nix /dev/sdb`, and the migration was first done by mounting it manually, but at the end of the day we have a final declarative configuration of the system:
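
The configuration itself did not survive in this copy of the article. As a rough sketch of what a declarative mount for a volume labeled `nix` can look like, assuming an ext4 filesystem as created by the `mkfs.ext4 -L nix` command above, something along these lines would work. The `noatime` option is my own assumption, not something the article confirms.

```nix
{ ... }:
{
  fileSystems."/nix" = {
    # Refers to the volume by the label given with `mkfs.ext4 -L nix`.
    device = "/dev/disk/by-label/nix";
    fsType = "ext4";
    # The store must be mounted early, since the system boots from it.
    neededForBoot = true;
    # Assumed option: skips access-time updates on a mostly read-only store.
    options = [ "noatime" ];
  };
}
```

Referring to the volume by label rather than by `/dev/sdb` keeps the mount stable if the device name changes between boots.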

`proxy_max_temp_file_size`? Let’s read the documentation more carefully this time:

When buffering of responses from the proxied server is enabled, […], a part of the response can be saved to a temporary file.

The zero value disables buffering of responses to temporary files.

Nginx is buffering the 2.2GB file my program is serving to temporary files. Oh dear. Let’s fix that:
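
The exact snippet is missing from this copy, so the following is only a sketch of how the two directives named in the summary (`proxy_buffering off` and `proxy_max_temp_file_size 0`) could be set on NixOS. The virtual host name and backend address are hypothetical placeholders, not values from the article.

```nix
{ ... }:
{
  # Hypothetical virtual host name and backend address.
  services.nginx.virtualHosts."downloads.example.com".locations."/" = {
    proxyPass = "http://127.0.0.1:8080";
    extraConfig = ''
      # Stream the response to the client instead of buffering it in nginx.
      proxy_buffering off;
      # Never spill proxied responses to temporary files on disk.
      proxy_max_temp_file_size 0;
    '';
  };
}
```

With buffering disabled, nginx passes data to the client as it arrives from the backend, so a 2.2GB download no longer produces a temporary file of the same size on the server's disk.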