Jennifer Aniston and Friends Cost Us 377GB and Broke Ext4 Hardlinks

原始链接: https://blog.discourse.org/2026/04/how-jennifer-aniston-and-friends-cost-us-377gb-and-broke-ext4-hardlinks/

## Backup Optimization and Deduplication Limits

A recent effort to optimize backups for a platform with heavy user uploads uncovered a surprising problem: massive file duplication. Some sites stored the same file over and over under different filenames; one site had duplicated a single reaction GIF 246,000 times, inflating its backup 16x.

The solution was content-based deduplication using hardlinks, with the goal of storing each unique file only once. Initial tests passed, but a production backup exposed a hidden constraint: the ext4 filesystem allows roughly 65,000 hardlinks per file.

When that limit is reached, hardlink creation fails and the process falls back to a full file download. While still a significant improvement over downloading *all* the duplicates, it wasn't ideal. The team updated the code to handle the `Errno::EMLINK` error (filesystem hardlink limit reached) gracefully by copying the file instead of hardlinking, ensuring compatibility across filesystems.

The experience underscores the importance of testing failure modes, respecting filesystem-level constraints, and building graceful degradation into optimizations, even ones with huge potential payoffs. In the end, the fix cut the transfer size for that problematic GIF from 377GB to roughly 6.4MB.


## Original article

It started with backup issues. Sites with hundreds of gigabytes of uploads were running out of disk space during backup generation. One site had 600+ GB of uploads and the backup process kept dying.

While looking into reliable large backups, we discovered something wild in one of those sites: the actual unique content was a fraction of the reported size. They were storing the same files over and over again, each with a different filename. The duplication was absurd.

So we shipped an optimization: detect duplicate files by their content hash and use hardlinks instead of downloading each copy. I wrote new tests, they all passed, and the change was approved and merged. Unfortunately, a fix like this is hard to fully exercise outside production.

Then someone ran it on a real production backup and hit a filesystem limit I didn't know existed. The culprit? A single reaction GIF, duplicated 246,173 times...

Discourse has a feature called secure uploads. When a file moves between security contexts (say, from a private message to a public post), the system creates a new copy with a randomized SHA1. The original content is identical, but Discourse treats it as a new file.

This happens constantly with reaction GIFs and popular images. Users share them across posts, embed them in PMs, repost in different categories. Each context creates another copy.

This is mostly fine for normal operation. But for backups, it's a disaster.

One customer had 432 GB of uploads. Unique content? 26 GB. The rest was duplicates. A 16x inflation factor, all going into the backup archive.

The fix seemed straightforward. Discourse tracks the original content hash in original_sha1. During backup:

  1. Group uploads by original_sha1
  2. Download the first file in each group
  3. Create hardlinks for the duplicates

Hardlinks point multiple filenames to the same data on disk. GNU tar preserves them, so the archive stores the data once. Download 26 GB, archive 26 GB, everyone wins.

```ruby
def process_upload_group(upload_group)
  primary = upload_group.first
  primary_filename = upload_path_in_archive(primary)

  return if !download_upload_to_file(primary, primary_filename)

  # Create hardlinks for all duplicates in this group
  upload_group.drop(1).each do |duplicate|
    duplicate_filename = upload_path_in_archive(duplicate)
    hardlink_or_download(primary_filename, duplicate, duplicate_filename)
  end
end
```

The hardlink_or_download method falls back to downloading if the hardlink fails:

```ruby
def hardlink_or_download(source_filename, upload_data, target_filename)
  FileUtils.mkdir_p(File.dirname(target_filename))
  FileUtils.ln(source_filename, target_filename)  # Create hardlink
  increment_and_log_progress(:hardlinked)
rescue StandardError => ex
  # Fallback: download if hardlink fails
  log "Failed to create hardlink, downloading instead", ex
  download_upload_to_file(upload_data, target_filename)
end
```

We shipped it and got positive feedback...

A colleague then used the new version to run a backup on a large site. The logs looked great:

```
53000 files processed (25 downloaded, 52975 hardlinked). Still processing...
54000 files processed (25 downloaded, 53975 hardlinked). Still processing...
...
64000 files processed (25 downloaded, 63975 hardlinked). Still processing...
65000 files processed (25 downloaded, 64975 hardlinked). Still processing...
Failed to create hardlink for upload ID 482897, downloading instead
Failed to create hardlink for upload ID 457497, downloading instead
Failed to create hardlink for upload ID 867574, downloading instead
```

At 65,000 hardlinks, it started failing. Turns out ext4 has a limit: roughly 65,000 hardlinks per inode. One file can only have 65,000 names pointing to it.
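The limit is easy to observe from Ruby: `File.stat` exposes the inode's link count via `nlink`, and each `FileUtils.ln` bumps it. A minimal sketch using a scratch file in a tmpdir (nothing Discourse-specific):

```ruby
require "fileutils"
require "tmpdir"

Dir.mktmpdir do |dir|
  primary = File.join(dir, "reaction.gif")
  File.write(primary, "stub-gif-bytes")

  # Each hardlink adds another name for the same inode; ext4 caps
  # the link count at roughly 65,000 names per inode.
  2.times { |i| FileUtils.ln(primary, File.join(dir, "copy-#{i}.gif")) }

  puts File.stat(primary).nlink  # => 3 (the original name plus two links)
end
```

On ext4, pushing that loop past the cap raises `Errno::EMLINK` instead of creating link number 65,001.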

The fallback worked, so the job didn't fail outright: the backup finished. But instead of one download covering all 246,173 duplicates, we got one download plus ~181,000 fallback downloads after hitting the limit.

Still better than 246,173 downloads. But not the win I expected.

So what file had 246,173 copies?

```ruby
Upload.where(original_sha1: '27b7a62e34...').count
=> 246173

Upload.where(original_sha1: '27b7a62e34...').first.filesize
=> 1643869
```

1.6 MB. Duplicated a quarter million times. That's 377 GB of backup bloat from a single image.
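A quick sanity check on that arithmetic, using the `filesize` from the query above:

```ruby
filesize = 1_643_869   # bytes, from Upload#filesize
copies   = 246_173

total_bytes = filesize * copies
total_gib   = total_bytes / (1024.0**3)

puts total_gib.round(1)  # => 376.9, i.e. the ~377 GB of backup bloat
```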

And then I saw what it was...

A reaction GIF. Used constantly in posts, PMs, everywhere. Each use in a different security context creates a new copy. 246,173 copies of Rachel from Friends doing a happy dance.

One GIF broke the hardlink limit.

Without deduplication: 246,173 downloads, 377 GB transferred.

With deduplication (hitting limit): ~4 downloads, ~6.4 MB transferred.

The filesystem limit turned my "download once" into "download four times." I can live with that.
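Back-of-envelope for those numbers, assuming a fresh primary (one download) is needed each time the ~65,000-per-inode limit is hit:

```ruby
copies     = 246_173
link_limit = 65_000  # ext4's per-inode hardlink cap (approximate)
file_mb    = 1.6

primaries = (copies.to_f / link_limit).ceil
puts primaries                        # => 4 downloads
puts (primaries * file_mb).round(1)   # => 6.4 MB transferred
```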

My first instinct was to track hardlink counts and proactively rotate before hitting the limit. But a colleague pointed out the flaw: we have no idea what filesystem is being used. ext4 has one limit, XFS another, ZFS another. Picking a magic number is fragile.

Better approach: let the filesystem tell us when we've hit the limit.

```ruby
def create_hardlink(source_filename, upload_data, target_filename)
  FileUtils.mkdir_p(File.dirname(target_filename))
  FileUtils.ln(source_filename, target_filename)
  source_filename
rescue Errno::EMLINK
  # Filesystem hardlink limit reached - copy and use as new primary
  FileUtils.cp(source_filename, target_filename)
  target_filename
rescue StandardError => ex
  # Any other failure: fall back to downloading, keep the old primary
  log "Failed to create hardlink, downloading instead", ex
  download_upload_to_file(upload_data, target_filename)
  source_filename
end
```

When Errno::EMLINK fires, we already have the file locally. No need to re-download. Just copy it and use the copy as the new primary for subsequent hardlinks. Works on any filesystem, no configuration needed.
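To make the rotation concrete, here is a self-contained sketch of how a caller can use the return value. `link_or_copy` is a hypothetical, simplified stand-in for `create_hardlink` (no download fallback); whatever it returns becomes the primary for the next duplicate:

```ruby
require "fileutils"
require "tmpdir"

# Simplified rotation logic: on success, keep linking against the same
# primary; on EMLINK, copy the file and let the copy take over as the
# link source for subsequent duplicates.
def link_or_copy(primary, target)
  FileUtils.ln(primary, target)
  primary
rescue Errno::EMLINK
  FileUtils.cp(primary, target)
  target
end

Dir.mktmpdir do |dir|
  primary = File.join(dir, "0.gif")
  File.write(primary, "gif-bytes")

  (1..5).each do |i|
    primary = link_or_copy(primary, File.join(dir, "#{i}.gif"))
  end

  puts File.stat(primary).nlink  # => 6 (six names sharing one inode)
end
```

Since the limit never fires in this tiny run, all six names end up on one inode; on ext4 the copy branch would kick in around name 65,001 and the counter would reset.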

Filesystems have opinions. ext4's hardlink limit exists to prevent certain classes of bugs and attacks. It's not arbitrary.

The fallback saved the feature. Without graceful degradation, that backup would have failed entirely. Instead it completed, just slower than optimal.

Production always finds the edge cases. 246,000 copies of one file is absurd. But absurd things happen at scale.

A few concrete takeaways:

  • Test for failure modes, not just success paths. The hardlink fallback was built-in from the start, but I never expected to actually need it.
  • Optimizations that reduce work by 16x still need to handle edge cases. A 99.998% improvement with graceful degradation beats a 100% improvement that crashes.
  • Track filesystem-level constraints early. Hardlink limits, inode counts, path lengths - these are real operational boundaries, not theoretical concerns.

And now I know Jennifer Aniston can stress-test infrastructure.
