Watermark segmentation

Original link: https://github.com/Diffusion-Dynamics/watermark-segmentation

Diffusion Dynamics' watermark-segmentation repository showcases the core technology behind clear.photo, focusing on the crucial first step: watermark segmentation. It uses a deep-learning approach to generate masks that highlight logo-based watermarks, drawing inspiration from research on diffusion models and image restoration. The repository provides a functional codebase for training a segmentation model, using the segmentation-models-pytorch library with an ImageNet-pretrained backbone to speed up training. Training is managed with pytorch-lightning and runs on both Apple M-series chips and NVIDIA GPUs. A key element is the dynamic data augmentation in `dataset.py`, which applies random watermarks to background images to improve robustness. The `watermark-segmentation.ipynb` notebook walks users through loading an image, preprocessing it, running the model, and post-processing the resulting mask; a common refinement step is morphological dilation to improve mask coverage. The project is intended as a foundational example for building more sophisticated watermark-removal tools: a production-ready system would require substantial additional engineering, but the repository offers a clear, understandable watermark-segmentation baseline.

A Hacker News thread discusses the project. User xnx asks whether any watermark-removal system supports "training" on multiple images that share the same watermark to improve removal, referencing a 2017 Google paper; they argue that having several examples of the same watermark should make the problem easier. Another user, speerer, questions whether the technique has broadly constructive applications, suggesting potential uses such as removing stains from scanned documents or red-eye correction, while DavidVoid asks whether there is any legitimate use for this particular technology.

Original text


This repository by Diffusion Dynamics showcases the core technology behind the watermark segmentation capabilities of our first product, clear.photo. This work leverages insights from research on diffusion models for image restoration tasks.

Effective watermark removal hinges on accurately identifying the watermark's precise location and shape within the image. This code tackles the first crucial step: watermark segmentation.

We present a deep learning approach trained to generate masks highlighting watermark regions. This repository focuses on segmenting logo-based watermarks, demonstrating a robust technique adaptable to various watermark types. The methodologies employed draw inspiration from advancements in image segmentation.

This repository aims to consolidate key ideas from recent research in visible watermark removal and segmentation, including techniques presented in:

  • arXiv:2108.03581 "Visible Watermark Removal via Self-calibrated Localization and Background Refinement"
  • arXiv:2012.07616 "WDNet: Watermark-Decomposition Network for Visible Watermark Removal"
  • arXiv:2312.14383 "Removing Interference and Recovering Content Imaginatively for Visible Watermark Removal"
  • arXiv:2502.02676 "Blind Visible Watermark Removal with Morphological Dilation"

It distills these concepts into a minimal, functional codebase focused purely on the segmentation task. The goal is to provide a clear, understandable baseline that is easy to modify and build upon, even allowing for fine-tuning on consumer hardware like laptops with Apple M-series chips. It serves as a foundational example demonstrating the core techniques applicable to building more complex tools like clear.photo.

Background: The Role of Segmentation in Watermark Removal

A typical advanced watermark removal pipeline involves:

  1. Segmentation: Generating a precise mask that isolates the watermark pixels from the background image content.
  2. Inpainting/Restoration: Using the mask to guide an algorithm (often generative, like diffusion models) to intelligently 'fill in' the region previously occupied by the watermark, seamlessly reconstructing the underlying image.
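The two stages above can be sketched in a few lines. The snippet below is purely illustrative: it uses a trivial mean-fill as a stand-in for the restoration stage, whereas a real pipeline would use a generative inpainting model.

```python
import numpy as np

def naive_remove_watermark(image: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Toy stand-in for stage 2 (inpainting/restoration).

    image: HxWx3 float array; mask: HxW boolean array, True where
    stage 1 (segmentation) identified watermark pixels. Masked pixels
    are replaced by the mean colour of the unmasked pixels -- a real
    system would reconstruct the region with a generative model.
    """
    out = image.copy()
    out[mask] = image[~mask].mean(axis=0)
    return out
```

This makes the division of labour explicit: the quality of the final result is bounded by how precisely the mask from stage 1 isolates the watermark.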

This project provides the necessary tools to train a watermark segmentation model and use it for inference. Key components include:

  • watermark-segmentation.ipynb: A Jupyter notebook containing the end-to-end workflow, from data preparation and training to inference.
  • dataset.py: A Python script defining the Dataset class responsible for generating training data. It dynamically applies logo watermarks (from the logos/ directory) onto background images with randomized properties (scale, rotation, opacity, position, blend mode) to create diverse training samples and their corresponding ground truth masks. This data augmentation strategy is key to the model's robustness.
  • requirements.txt: Lists all necessary Python dependencies.
  • *.pth: Model weights from different training epochs.
  • logos/: A directory for sample watermark logos. Populate this with logos relevant to your use case.
  • lightning_logs/: Default directory where PyTorch Lightning saves training logs and checkpoints.

Follow these steps to set up and run the project:

1. Prerequisites:

  • Python 3.10 or later.
  • wget and unzip (or equivalent tools for downloading and extracting datasets).
python --version
# Ensure it shows 3.10.x or newer

2. Clone the Repository:

git clone https://github.com/Diffusion-Dynamics/watermark-segmentation
cd watermark-segmentation

3. Install Dependencies:

It's recommended to use a virtual environment:

python -m venv venv
source venv/bin/activate # On Windows use `venv\Scripts\activate`
pip install -r requirements.txt

4. Prepare Datasets:

  • Background Images: You need a dataset of diverse background images. The original setup used Flickr8k. Download and extract it (or your chosen dataset) into a known location.
    # Example using Flickr8k
    wget https://github.com/jbrownlee/Datasets/releases/download/Flickr8k/Flickr8k_Dataset.zip
    unzip Flickr8k_Dataset.zip
    # Ensure the notebook points to the 'Flicker8k_Dataset' directory
  • Watermark Logos: Place your desired watermark logo images (PNG format with transparency is recommended) into the logos/ directory. The dataset.py script will randomly use these.

5. Run the Notebook:

  • Launch Jupyter Lab or Jupyter Notebook:
  • Open watermark-segmentation.ipynb.
  • Crucially, update the paths to your background image dataset and logo directory within the notebook cells if they differ from the defaults.
  • Execute the cells to train the model and perform inference.

Model Architecture and Training

The core segmentation model leverages standard segmentation architectures conveniently provided by the library segmentation-models-pytorch. This library offers pre-implemented models with various backbones. To accelerate training and reduce the need for vast amounts of training data, the model utilizes a backbone pre-trained on the ImageNet dataset.

Training is managed using pytorch-lightning, which simplifies the training loop, facilitates multi-GPU training, and integrates logging. The training process is compatible with both Apple M-series chips (via MPS) and NVIDIA GPUs (via CUDA). While NVIDIA GPUs generally offer faster training, fine-tuning on an Apple M-series chip is feasible and can typically be completed within a few hours.

The key to achieving good performance and generalization lies in the data-generation strategy within dataset.py. It relies on synthetic data augmentation: diverse, randomized watermarks (varying size, position, opacity, rotation, and blend mode) are dynamically composited onto clean background images during training. This forces the model to learn to identify watermarks under a wide range of conditions, making it robust to unseen watermarks and backgrounds. This approach is informed by techniques discussed in related image restoration research (arXiv:2502.02676).
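A stripped-down version of this idea, randomizing only position and opacity (the real dataset.py also randomizes scale, rotation, and blend mode), might look like:

```python
import numpy as np

rng = np.random.default_rng(0)

def composite_watermark(background: np.ndarray, logo_rgba: np.ndarray):
    """Paste logo_rgba (hxwx4, floats in [0,1]) onto background
    (HxWx3, floats in [0,1]) at a random position with random
    opacity. Returns the watermarked image and its ground-truth mask.
    """
    bh, bw = background.shape[:2]
    lh, lw = logo_rgba.shape[:2]
    y = rng.integers(0, bh - lh + 1)          # random position
    x = rng.integers(0, bw - lw + 1)
    opacity = rng.uniform(0.3, 0.9)           # random opacity
    alpha = logo_rgba[..., 3:] * opacity      # per-pixel blend weight

    out = background.copy()
    region = out[y:y + lh, x:x + lw]
    out[y:y + lh, x:x + lw] = alpha * logo_rgba[..., :3] + (1 - alpha) * region

    mask = np.zeros((bh, bw), dtype=np.float32)
    mask[y:y + lh, x:x + lw] = (logo_rgba[..., 3] > 0).astype(np.float32)
    return out, mask
```

Because the mask is produced at the same time as the watermarked image, every synthetic sample comes with pixel-perfect ground truth for free, which is what makes this augmentation strategy so effective for training a segmentation model.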

The notebook (watermark-segmentation.ipynb) demonstrates how to:

  1. Load an input image containing a watermark.
  2. Preprocess the image for the model.
  3. Run the model to obtain a segmentation mask (usually a probability map).
  4. Post-process the mask (e.g., by thresholding) to obtain a binary segmentation. A common refinement is to dilate the binary mask slightly, inspired by the morphological-dilation technique in "Blind Visible Watermark Removal with Morphological Dilation" (arXiv:2502.02676), so that it fully covers the watermark for subsequent removal steps; note that the cited paper applies dilation within the removal stage itself.

The output mask precisely identifies the watermark regions, ready for use in downstream removal tasks.

Building a robust, production-ready watermark segmentation and removal system involves significant engineering challenges beyond the scope of this repository. If you require a fast, scalable, and reliable solution designed for real-world demands, consider using our platform: clear.photo.
