Perceptual Image Codec: What Matters in Practical Learned Image Compression

Original link: https://apple.github.io/ml-pico/

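PICO (Perceptual Image Codec) is a new learned image compression framework optimized for human visual perception and for on-device performance. By searching over millions of model configurations, the researchers jointly optimized perceptual quality and on-device runtime. PICO clearly outperforms current industry standards: it achieves 2.3-3× bitrate savings against codecs such as AV1, VVC, and JPEG-AI, and 20-40% bitrate savings against the best existing learned codecs. Beyond compression efficiency, PICO is also heavily optimized for hardware: on an iPhone 17 Pro Max it encodes a 12-megapixel image in as little as 230 ms and decodes it in 150 ms, faster than most comparable ML-based codecs running on a V100 GPU. The codec also provides cross-platform robustness guarantees, which makes it a viable option for real-world deployment.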

Original text

We introduce PICO (Perceptual Image Codec), the first learned codec that is both practical and optimized directly for the human visual system. To derive it, we perform a comprehensive study of modeling choices for practical learned codecs, and search over millions of model configurations to jointly optimize perceptual quality and on-device runtime.

Based on large-scale subjective user studies, PICO provides 2.3-3× bitrate savings against AV1, AV2, VVC, ECM and JPEG-AI, and 20-40% bitrate savings against the best learned codec alternatives. At the same time, on an iPhone 17 Pro Max, it encodes 12MP images in as little as 230ms and decodes them in 150ms, faster than most top ML-based codecs running on a V100 GPU. Unlike most learned codecs, PICO also comes with cross-platform robustness guarantees.
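
As a rough aid for reading these figures, the sketch below converts the quoted savings factors into percent bitrate reductions and the quoted latencies into pixel throughput. It assumes that "N× bitrate savings" means the baseline codec needs N times PICO's bitrate at matched perceptual quality; the page does not define the term, so treat that reading, and the helper names used here, as illustrative assumptions.

```python
# Back-of-envelope arithmetic on the figures quoted in the abstract.
# ASSUMPTION: "N x bitrate savings" means the baseline needs N times
# PICO's bitrate at equal perceptual quality (not defined on the page).


def savings_factor_to_percent(factor: float) -> float:
    """Convert an 'N times smaller' factor into a percent bitrate reduction."""
    return (1.0 - 1.0 / factor) * 100.0


def throughput_mp_per_s(megapixels: float, latency_ms: float) -> float:
    """Megapixels processed per second for a given per-image latency."""
    return megapixels / (latency_ms / 1000.0)


if __name__ == "__main__":
    # Quoted range against AV1, AV2, VVC, ECM and JPEG-AI.
    for factor in (2.3, 3.0):
        print(f"{factor}x -> {savings_factor_to_percent(factor):.1f}% fewer bits")

    # Quoted iPhone 17 Pro Max latencies for a 12MP image.
    print(f"encode: {throughput_mp_per_s(12, 230):.1f} MP/s")
    print(f"decode: {throughput_mp_per_s(12, 150):.1f} MP/s")
```

Under that reading, 2.3-3× corresponds to roughly 57-67% fewer bits, and the quoted latencies to about 52 MP/s for encoding and 80 MP/s for decoding.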

If you find our work useful, please cite:

@article{tatwawadi2026pico,
  title={What Matters in Practical Learned Image Compression},
  author={Tatwawadi, Kedar and Rahimzadeh, Parisa and Sun, Zhanghao and Chen, Zhiqi and Yang, Ziyun and Nair, Sanjay and Hasteer, Divija and Rippel, Oren},
  journal={arXiv preprint arXiv:2605.05148},
  year={2026}
}