Libbbf: Bound Book Format, a high-performance container for comics and manga

Original link: https://github.com/ef1500/libbbf

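## Bound Book Format (BBF) Summary

BBF is a new, high-performance binary container format designed for digital comics and manga, intended to surpass CBR/CBZ and PDF in efficiency and features. It prioritizes speed through **DirectStorage/mmap** compatibility, alignment of data to 4096-byte boundaries for fast access, and **parallel XXH3 integrity checks** (up to 10x faster than traditional CRC checks).

BBF's footer-indexed structure enables fast random access and append-only creation. It supports **mixed codecs** (such as AVIF and PNG) with explicit codec flags, **data deduplication**, and **arbitrary UTF-8 metadata**. A key tool, **bbfmux**, handles creating, managing, and verifying BBF files.

Features include hierarchical sections (volumes and chapters), custom page ordering, and robust extraction. BBF offers advantages such as bit-rot detection and a simpler parser than formats like ZIP/RAR. It is distributed under the MIT License and aims to be a robust, future-proof archival format.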


Warning

Official Source Notice: Please only download releases from this repository (ef1500/libbbf). External mirrors or forks may contain malware.

Bound Book Format (.bbf) is a high-performance binary container designed specifically for digital comic books and manga. Unlike CBR/CBZ, BBF is built for DirectStorage/mmap, easy integrity checks, and mixed-codec containerization.


  • C++17 compliant compiler (GCC/Clang/MSVC), and optionally CMake
  • xxHash library

cmake -B build
cmake --build build
sudo cmake --install build

Linux

g++ -std=c++17 bbfenc.cpp libbbf.cpp xxhash.c -o bbfmux -pthread

Windows

g++ -std=c++17 bbfenc.cpp libbbf.cpp xxhash.c -o bbfmux -municode

Alternatively, if you need Python support, use libbbf-python.


BBF is designed as a Footer-indexed binary format. This allows for rapid append-only creation and immediate random access to any page without scanning the entire file.
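To see why an index at the end of the file allows both append-only writing and random access, here is a minimal Python sketch. The layout it uses (a 16-byte footer holding an index offset and a page count, plus 16-byte index entries) is a toy invented for illustration and is not the real BBF footer; `write_container` and `read_page` are hypothetical helper names.

```python
import io
import struct

FOOTER = struct.Struct("<QQ")  # toy footer: index offset, page count (NOT the BBF footer)
ENTRY = struct.Struct("<QQ")   # toy index entry: payload offset, payload length

def write_container(out, pages):
    """Append payloads first, then the index, then the footer."""
    entries = []
    for payload in pages:
        entries.append((out.tell(), len(payload)))
        out.write(payload)
    index_offset = out.tell()
    for offset, length in entries:
        out.write(ENTRY.pack(offset, length))
    out.write(FOOTER.pack(index_offset, len(entries)))

def read_page(f, page_no):
    """Fetch one page with three small reads, never scanning the payloads."""
    f.seek(-FOOTER.size, io.SEEK_END)
    index_offset, count = FOOTER.unpack(f.read(FOOTER.size))
    if not 0 <= page_no < count:
        raise IndexError(page_no)
    f.seek(index_offset + page_no * ENTRY.size)
    offset, length = ENTRY.unpack(f.read(ENTRY.size))
    f.seek(offset)
    return f.read(length)

buf = io.BytesIO()
write_container(buf, [b"page-0", b"page-one", b"p2"])
third = read_page(buf, 2)
```

The writer never has to seek backwards, which is what makes creation append-only; the reader pays one seek to the footer, one to the index entry, and one to the payload.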

The bbfmux reference implementation utilizes Memory Mapping (mmap/MapViewOfFile). Instead of reading file data into intermediate buffers, the tool maps the container directly into the process address space. Because there is no extra copy into user-space buffers, reads of image data can approach the throughput limit of the underlying storage, such as an NVMe drive.
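As a rough illustration of the memory-mapping approach (not the bbfmux code itself), the Python sketch below maps a file read-only and pulls out one byte range. The helper name `map_slice` and the demo file contents are assumptions made for this example.

```python
import mmap
import os
import tempfile

def map_slice(path, offset, length):
    """Return the bytes at [offset, offset + length) through a read-only mapping."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # Only the pages that are actually touched get faulted in by the OS.
            return mm[offset:offset + length]

# Demo file: 4096 bytes of padding followed by a fake page payload.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\x00" * 4096 + b"PAGEDATA")
    path = tmp.name

page = map_slice(path, 4096, 8)
os.remove(path)
```

Python's `mmap` module wraps `mmap` on POSIX systems and the file-mapping API on Windows, so the same sketch covers both platforms named above.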

High-Speed Parallel Verification

Integrity checks utilize Parallel XXH3. On multi-core systems, the verifier splits the asset table into chunks and validates multiple pages simultaneously. This makes BBF verification up to 10x faster than ZIP/RAR CRC checks.
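The idea of checking many assets at once can be sketched in Python as follows. This is an illustration under stated assumptions: the standard library has no xxHash binding, so `hashlib.blake2b` with an 8-byte digest stands in for XXH3-64, and the asset table is a plain list of `(offset, length, expected_digest)` tuples rather than the real BBF asset table.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

def digest64(data):
    # Stand-in for XXH3-64 (assumption): a 64-bit BLAKE2b digest from the stdlib.
    return hashlib.blake2b(data, digest_size=8).digest()

def verify_assets(blob, table, workers=4):
    """table holds (offset, length, expected_digest); returns the indexes that fail."""
    def check(item):
        index, (offset, length, expected) = item
        ok = digest64(blob[offset:offset + length]) == expected
        return None if ok else index

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return [i for i in pool.map(check, enumerate(table)) if i is not None]

pages = [b"a" * 5000, b"b" * 5000, b"c" * 5000]
blob = bytearray(b"".join(pages))
table = [(i * 5000, 5000, digest64(p)) for i, p in enumerate(pages)]

clean = verify_assets(bytes(blob), table)
blob[7500] ^= 0x01  # flip one bit inside the second page
damaged = verify_assets(bytes(blob), table)
```

Because each asset carries its own hash, a failure identifies the damaged page by index instead of invalidating the whole file.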

Every asset in a BBF file starts on a 4096-byte boundary. This alignment is critical for modern hardware, allowing for DirectStorage transfers directly from disk to GPU memory, bypassing CPU bottlenecks entirely.
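The alignment arithmetic is simple enough to show directly. The Python sketch below rounds offsets up to the next 4096-byte boundary and reports how much padding follows each payload; starting the first asset at the first boundary after a 13-byte header is an assumption made for the example, and `layout` is a hypothetical helper.

```python
ALIGN = 4096

def align_up(offset, align=ALIGN):
    """Round offset up to the next multiple of align (align must be a power of two)."""
    return (offset + align - 1) & ~(align - 1)

def layout(sizes, start=0):
    """Return (offset, padding_after) for each payload placed on aligned boundaries."""
    placed, cursor = [], align_up(start)
    for size in sizes:
        end = cursor + size
        nxt = align_up(end)
        placed.append((cursor, nxt - end))
        cursor = nxt
    return placed

# Three payloads of 5000, 4096, and 1 bytes placed after a 13-byte header.
placed = layout([5000, 4096, 1], start=13)
# placed == [(4096, 3192), (12288, 0), (16384, 4095)]
```

The worst case wastes just under one block per asset, which is small next to typical page images.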

  1. Header (13 bytes): Magic BBF1, versioning, and initial padding.
  2. Page Data: The raw image payloads (AVIF, PNG, etc.), each padded to 4096-byte boundaries.
  3. String Pool: A deduplicated pool of null-terminated strings for metadata and section titles.
  4. Asset Table: A registry of physical data blobs with XXH3 hashes.
  5. Page Table: The logical reading order, mapping logical pages to assets.
  6. Section Table: Markers for chapters, volumes, or gallery sections.
  7. Metadata Table: Key-Value pairs for archival data (Author, Scanlation team, etc.).
  8. Footer (76 bytes): Table offsets and a final integrity hash.
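A deduplicated pool of null-terminated strings, as in item 3 above, can be sketched in a few lines of Python. The class and function names are hypothetical, and the sketch makes no claim about how libbbf encodes references into its pool; it simply hands back byte offsets.

```python
class StringPool:
    """Toy deduplicated pool of null-terminated UTF-8 strings."""

    def __init__(self):
        self.data = bytearray()
        self._offsets = {}

    def add(self, text):
        """Store text once and return its byte offset in the pool."""
        encoded = text.encode("utf-8")
        if b"\x00" in encoded:
            raise ValueError("embedded NUL cannot be stored in a null-terminated pool")
        if encoded not in self._offsets:
            self._offsets[encoded] = len(self.data)
            self.data += encoded + b"\x00"
        return self._offsets[encoded]

def read_string(pool_bytes, offset):
    """Decode the string that starts at offset and runs to the next NUL byte."""
    end = pool_bytes.index(b"\x00", offset)
    return bytes(pool_bytes[offset:end]).decode("utf-8")

pool = StringPool()
a = pool.add("Volume 1")
b = pool.add("Chapter 1")
c = pool.add("Volume 1")  # same string again: no new bytes are written
```

Tables can then refer to strings by offset, so a title that appears in both a section entry and a metadata entry costs its bytes only once.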

NOTE: libbbf.h includes a flags field, as well as extra padding for each asset entry. This lets libbbf accommodate future technical advancements in both readers and image storage.

Feature Comparison: Digital Comic & Archival Formats

Formats compared: BBF, CBZ (Zip), CBR (Rar), PDF, EPUB, Folder.

Features compared (bracketed numbers refer to the notes below):

  • Random Page Access
  • Native Data Deduplication [1]
  • Per-Asset Integrity (XXH3)
  • 4KB Sector Alignment
  • Native Sections/Chapters
  • Arbitrary Metadata (UTF-8) [2]
  • Mixed-Codec Support
  • DirectStorage/mmap Ready [3]
  • Low Parser Complexity [4]
  • Bit-Rot Detection [5]
  • Streaming-Friendly Index [6] [7]
  • Wide Software Support

[1] - PDF supports XObjects to reuse resources, but lacks native content-hash deduplication; identical images must be manually referenced.
[2] - CBZ does not support metadata natively in the ZIP spec; it relies on unofficial sidecar files like ComicInfo.xml.
[3] - While folders allow memory mapping, individual images within them are rarely sector-aligned for optimized DirectStorage throughput.
[4] - ZIP/RAR require large, complex libraries (zlib/libarchive); BBF is a "Plain Old Data" (POD) format requiring only a few lines of C++ to parse.
[5] - ZIP/RAR use CRC32, which is aging, collision-prone, and significantly slower to verify than XXH3 for large archival collections.
[6] - Because the index is at the end (Footer), web-based streaming requires a "Range Request" to the end of the file before reading pages.
[7] - PDF supports "Linearization" (Fast Web View), allowing the header and first pages to be read before the rest of the file is downloaded.
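Note [6] describes a two-step fetch for web streaming: first the footer, then the pages. The Python sketch below only builds the HTTP requests and sends nothing. It assumes the 76-byte footer size from the layout list; the URL and both helper names are placeholders.

```python
import urllib.request

FOOTER_SIZE = 76  # footer size as given in the layout list

def footer_request(url):
    """Build (but do not send) a request for the last FOOTER_SIZE bytes of the file."""
    return urllib.request.Request(url, headers={"Range": f"bytes=-{FOOTER_SIZE}"})

def page_request(url, offset, length):
    """Build a request for one payload once its offset and length are known."""
    last = offset + length - 1  # HTTP byte ranges are inclusive
    return urllib.request.Request(url, headers={"Range": f"bytes={offset}-{last}"})

url = "https://example.invalid/book.bbf"  # placeholder URL
first = footer_request(url)
second = page_request(url, 4096, 100)
```

A suffix range such as `bytes=-76` asks the server for the final 76 bytes without the client knowing the file size in advance.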

Graphical Comparison (BBF vs. CBZ)

[Figure: performance_grid]

BBF uses XXH3_64 hashing to identify identical pages. If a book contains duplicate pages, the data is stored exactly once on disk while being referenced multiple times in the Page Table.
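Content-hash deduplication of this kind can be sketched in Python as below. Two assumptions apply: `hashlib.blake2b` with an 8-byte digest stands in for XXH3_64, which the standard library does not provide, and the tables are plain Python lists rather than the on-disk structures.

```python
import hashlib

def build_tables(pages):
    """Return (assets, page_table): unique payloads plus one asset index per page."""
    assets, page_table, seen = [], [], {}
    for payload in pages:
        key = hashlib.blake2b(payload, digest_size=8).digest()  # stand-in for XXH3_64
        if key not in seen:
            seen[key] = len(assets)
            assets.append(payload)
        page_table.append(seen[key])
    return assets, page_table

assets, page_table = build_tables([b"cover", b"blank", b"story", b"blank"])
# assets == [b"cover", b"blank", b"story"]; page_table == [0, 1, 2, 1]
```

The repeated blank page occupies one asset slot and is referenced twice from the page table.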

BBF stores a 64-bit hash for every individual asset. The bbfmux --verify command can pinpoint exactly which page has been damaged, rather than simply failing to open the entire archive.

Preserve covers in lossless PNG while encoding interior story pages in AVIF, which the project cites as saving about 70% of the space. BBF explicitly flags the codec for every asset, allowing readers to initialize the correct decoder instantly without "guessing" the file type.
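The difference between an explicit codec flag and type guessing can be sketched in Python. The numeric codes in `Codec` are hypothetical and are not the values libbbf uses, and the decoder names are placeholder strings; only the PNG signature and the AVIF `ftyp` brand check reflect the real file formats.

```python
from enum import IntEnum

class Codec(IntEnum):
    # Hypothetical codes for illustration, not the values defined by libbbf.
    PNG = 1
    AVIF = 2

DECODERS = {Codec.PNG: "png-decoder", Codec.AVIF: "avif-decoder"}

def pick_decoder(codec_tag):
    """With an explicit tag, choosing a decoder is a single table lookup."""
    return DECODERS[Codec(codec_tag)]

def sniff(payload):
    """Without a tag, a reader has to inspect the payload's magic bytes."""
    if payload.startswith(b"\x89PNG\r\n\x1a\n"):
        return Codec.PNG
    if payload[4:12] == b"ftypavif":
        return Codec.AVIF
    raise ValueError("unknown image type")

tagged = pick_decoder(2)
guessed = sniff(b"\x89PNG\r\n\x1a\n" + b"rest-of-file")
```

With the tag stored next to each asset, the reader can skip the sniffing path entirely.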


The included bbfmux tool is a reference implementation for creating and managing BBF files.

The bbfmux utility provides a powerful interface for managing Bound Book files:

  • Flexible Ingestion: Create books by passing individual files, entire directories, or a mix of both.
  • Logical Structuring: Add named Sections (Chapters, Volumes, Extras, Galleries) to define the internal hierarchy of the book.
  • Custom Metadata: Embed arbitrary Key:Value pairs into the global string pool for archival indexing.
  • Content-Aware Extraction: Extract the entire book or target specific sections by name.

You can mix individual images and folders. bbfmux sorts inputs alphabetically, deduplicates identical assets, and aligns data to 4096-byte boundaries. See Advanced CLI Usage for how to specify your own custom page orders.

# Basic creation with metadata
bbfmux cover.png ./chapter1/ endcard.png \
  --meta=Title:"Akira" \
  --meta=Author:"Katsuhiro Otomo" \
  --meta=Tags:"[Action, Sci-Fi, Cyberpunk]" \
  akira.bbf

Hierarchical Sections (Volumes & Chapters)

BBF supports nesting sections. By defining a Parent relationship, you can group chapters into volumes. This allows readers to display a nested Table of Contents and enables bulk-extraction of entire volumes.

Syntax: --section="Name":Page[:ParentName]

# Create a book with nested chapters
bbfmux ./manga_folder/ \
  --section="Volume 1":1 \
  --section="Chapter 1":1:"Volume 1" \
  --section="Chapter 2":20:"Volume 1" \
  --section="Volume 2":180 \
  --section="Chapter 3":180:"Volume 2" \
  manga.bbf
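To show how a reader might turn these section definitions into a nested Table of Contents, here is a Python sketch. It parses the `Name:Page[:ParentName]` form as it looks after the shell has stripped the quotes, assumes names contain no colons, and assumes a parent is defined before its children; `parse_section` and `build_toc` are hypothetical names, not part of bbfmux.

```python
def parse_section(spec):
    """Parse 'Name:Page[:ParentName]' into (name, page, parent_or_None)."""
    parts = spec.split(":")
    if len(parts) not in (2, 3):
        raise ValueError(f"bad section spec: {spec!r}")
    name, page = parts[0], int(parts[1])
    parent = parts[2] if len(parts) == 3 else None
    return name, page, parent

def build_toc(specs):
    """Group sections under their parents, preserving input order."""
    toc, children = [], {}
    for name, page, parent in map(parse_section, specs):
        node = {"name": name, "page": page, "children": []}
        children[name] = node["children"]
        (children[parent] if parent else toc).append(node)
    return toc

toc = build_toc([
    "Volume 1:1",
    "Chapter 1:1:Volume 1",
    "Chapter 2:20:Volume 1",
    "Volume 2:180",
    "Chapter 3:180:Volume 2",
])
```

The result has two top-level volumes, each carrying its chapters as children, which is the shape a nested Table of Contents view needs.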

Scan the archive for bit-rot or data corruption. BBF uses XXH3_64 hashes to verify every individual image payload.

bbfmux input.bbf --verify

Extract the entire book, a specific volume, or a single chapter. When extracting a parent section (like a Volume), bbfmux automatically includes all child chapters.

Extract a specific section:

bbfmux input.bbf --extract --section="Volume 1" --outdir="./Volume1"

Extract the entire book:

bbfmux input.bbf --extract --outdir="./unpacked_book"

View Metadata & Structure

View the version, page count, deduplication stats, hierarchical sections, and all embedded metadata.

bbfmux input_book.bbf --info

bbfmux also supports more advanced options, allowing full control over your .bbf files.

Custom Page Ordering (--order)

You can precisely control the reading order using a text file or inline arguments.

  • Positive Integers: Fixed 1-based index (e.g., cover.png:1).
  • Negative Integers: Fixed position from the end (e.g., credits.png:-1 is always the last page).
  • Unspecified: Sorted alphabetically between the fixed pages.

# Using an order file
bbfmux ./images/ --order=pages.txt out.bbf

# pages.txt example:
cover.png:1
page1.png:2
page2.png:3
credits.png:-1
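One way to read the three ordering rules is sketched below in Python. This is an interpretation, not the bbfmux implementation: fixed entries claim their slots first, and the remaining files fill the gaps in alphabetical order. `parse_order_file` and `order_pages` are hypothetical names.

```python
def parse_order_file(text):
    """Parse 'filename:index' lines into {filename: index}."""
    fixed = {}
    for line in text.splitlines():
        line = line.strip()
        if line:
            name, _, index = line.rpartition(":")
            fixed[name] = int(index)
    return fixed

def order_pages(files, fixed):
    """Place fixed pages (positive: from the start, negative: from the end), then fill gaps."""
    files = list(files)
    n = len(files)
    slots = [None] * n
    for name, index in fixed.items():
        pos = index - 1 if index > 0 else n + index
        if name not in files or index == 0 or not 0 <= pos < n or slots[pos] is not None:
            raise ValueError(f"bad or conflicting position for {name!r}: {index}")
        slots[pos] = name
    rest = iter(sorted(f for f in files if f not in fixed))
    return [s if s is not None else next(rest) for s in slots]

fixed = parse_order_file("cover.png:1\npage1.png:2\npage2.png:3\ncredits.png:-1\n")
files = ["page2.png", "credits.png", "cover.png", "page1.png", "extra_b.png", "extra_a.png"]
ordered = order_pages(files, fixed)
```

Here the two unlisted files land between the fixed front pages and the fixed last page, sorted by name.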

Batch Section Import (--sections)

Sections define Chapters or Volumes. You can target a page by its index or filename.

# Target by filename
bbfmux ./folder/ --section="Chapter 1":"001.png" out.bbf

# Using a sections file
bbfmux ./folder/ --sections=sectionexample.txt out.bbf

# sectionexample.txt example (Name:Target[:Parent]):
"Volume 1":"001.png"
"Chapter 1":"001.png":"Volume 1"
"Chapter 2":"050.png":"Volume 1"

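A sections file in the `Name:Target[:Parent]` form shown above can be parsed with Python's `csv` module by treating the colon as the delimiter. The sketch resolves filename targets to 1-based page numbers against an ordered page list. How bbfmux itself tells an index from a filename is not documented here, so the all-digits test is an assumption, and `parse_sections_file` is a hypothetical name.

```python
import csv

def parse_sections_file(text, pages):
    """Return (name, page_number, parent) tuples with filename targets resolved."""
    sections = []
    for row in csv.reader(text.splitlines(), delimiter=":", quotechar='"'):
        if not row:
            continue  # skip blank lines
        name, target = row[0], row[1]
        parent = row[2] if len(row) > 2 else None
        # Assumption: an all-digit target is a page index, anything else is a filename.
        page = int(target) if target.isdigit() else pages.index(target) + 1
        sections.append((name, page, parent))
    return sections

pages = [f"{i:03d}.png" for i in range(1, 61)]  # 001.png through 060.png
text = '"Volume 1":"001.png"\n"Chapter 1":"001.png":"Volume 1"\n"Chapter 2":"050.png":"Volume 1"\n'
sections = parse_sections_file(text, pages)
```

Quoted fields let section names contain spaces, and would also protect a colon inside a name.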
BBF allows for verification of data to detect bit-rot.

# Verify everything (All assets and Directory structure)
bbfmux input.bbf --verify

# Verify only the directory hash (Instant)
bbfmux input.bbf --verify -1

# Verify a specific asset by index
bbfmux input.bbf --verify 42

The --rangekey option allows you to extract a range of sections. The extractor starts at the specified --section and stops when it finds a section whose title matches the rangekey.

# Extract Chapter 2 up until it hits Chapter 4
bbfmux manga.bbf --extract --section="Chapter 2" --rangekey="Chapter 4" --outdir="./Ch2_to_Ch4"

# Extract Volume 2 until it encounters the string "Chapter 60"
bbfmux manga.bbf --extract --section="Volume 2" --rangekey="Chapter 60" --outdir="./Volume_2_to_Chapter_60"
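The range logic described above can be sketched in Python as follows. Two points are assumptions rather than documented behavior: the rangekey is treated as a substring match against later section titles, and extraction runs to the last page when no later title matches. `range_pages` is a hypothetical name.

```python
def range_pages(sections, total_pages, start_title, rangekey):
    """sections: (title, first_page) pairs in reading order, pages 1-based.

    Returns the inclusive (first, last) page range to extract.
    """
    titles = [title for title, _ in sections]
    start = titles.index(start_title)
    first = sections[start][1]
    for title, page in sections[start + 1:]:
        if rangekey in title:  # assumption: substring match
            return first, page - 1
    return first, total_pages  # assumption: no match means "to the end"

sections = [
    ("Chapter 1", 1),
    ("Chapter 2", 20),
    ("Chapter 3", 45),
    ("Chapter 4", 70),
    ("Chapter 5", 90),
]
ch2_to_ch4 = range_pages(sections, 120, "Chapter 2", "Chapter 4")
to_the_end = range_pages(sections, 120, "Chapter 2", "Epilogue")
```

With substring matching, a key such as "Chapter 4" would also stop at "Chapter 40", so a key that is unambiguous within the book is the safer choice.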

Distributed under the MIT License. See LICENSE for more information.
