FLUX.2 [Klein]: Towards Interactive Visual Intelligence

Original link: https://bfl.ai/blog/flux2-klein-towards-interactive-visual-intelligence

## FLUX.2 [klein]: Fast, Accessible Image Generation and Editing

FLUX.2 [klein] is a new family of image models built for **real-time performance** without sacrificing quality. The models achieve **sub-second inference**, generating or editing images in under 0.5 seconds, and run on consumer hardware with as little as 13GB of VRAM.

The [klein] family unifies **text-to-image generation, image editing, and multi-reference capabilities** in a single compact architecture. It is offered in 4B and 9B parameter versions, in both distilled and base (undistilled) variants, giving flexibility across use cases from rapid prototyping to research and fine-tuning.

Key features include an **Apache 2.0 license for the 4B models**, open weights for customization, and optimized quantized versions (FP8 and NVFP4) developed in collaboration with NVIDIA for faster performance. FLUX.2 [klein] aims to give developers and creators accessible, interactive visual intelligence for applications such as real-time design and agentic visual reasoning.

**Try the demo and access the API/weights:** [link to demo]


Original

Today, we release the FLUX.2 [klein] model family, our fastest image models to date. FLUX.2 [klein] unifies generation and editing in a single compact architecture, delivering state-of-the-art quality with end-to-end inference in under a second. It is built for applications that require real-time image generation without sacrificing quality, and it runs on consumer hardware with as little as 13GB of VRAM.

Try it now for free here

Demo showing editing with FLUX.2 [klein]

Why go [klein]?

Visual Intelligence is entering a new era. As AI agents become more capable, they need visual generation that can keep up: models that respond in real time, iterate quickly, and run efficiently on accessible hardware.

The klein name comes from the German word for "small", reflecting both the compact model size and the minimal latency. But FLUX.2 [klein] is anything but limited. These models deliver exceptional performance in text-to-image generation, image editing and multi-reference generation, typically reserved for much larger models.

What's New

  • Sub-second inference. Generate or edit images in under 0.5s on modern hardware.
  • Photorealistic outputs and high diversity, especially in the base variants.
  • Unified generation and editing. Text-to-image, image editing, and multi-reference support in a single model while delivering frontier performance.
  • Runs on consumer GPUs. The 4B model fits in ~13GB VRAM (RTX 3090/4070 and above).
  • Developer-friendly and accessible. Apache 2.0 on the 4B models, open weights for the 9B models; full open weights for customization and fine-tuning.
  • API and open weights. Production-ready API or run locally with full weights.

Note: The “FLUX [dev] Non-Commercial License” has been renamed to “FLUX Non-Commercial License” and will apply to the 9B Klein models. No material changes have been made to the license.
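The list above mentions both a production API and local weights. As a hedged illustration of the API path, the sketch below assembles a JSON-serializable payload for an image-edit request. The field names (`prompt`, `input_image`, `steps`) and the 4-step default are illustrative assumptions based on the post's description of step-distilled models, not BFL's documented request schema:

```python
import base64

def build_edit_request(prompt: str, image_bytes: bytes, steps: int = 4) -> dict:
    """Assemble a JSON-serializable payload for a hypothetical edit endpoint.

    Field names are illustrative; consult the provider's API reference
    for the real schema before sending anything.
    """
    return {
        "prompt": prompt,
        "input_image": base64.b64encode(image_bytes).decode("ascii"),
        "steps": steps,
    }

payload = build_edit_request("golden-hour lighting", b"<png bytes here>")
print(sorted(payload))  # → ['input_image', 'prompt', 'steps']
```

Images typically travel as base64 in JSON APIs, which is why the raw bytes are encoded before being placed in the payload.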

Text to Image collage using FLUX.2 [klein]

The FLUX.2 [klein] Model Family

FLUX.2 [klein] 9B

Our flagship small model. Defines the Pareto frontier for quality vs. latency across text-to-image, single-reference editing, and multi-reference generation. Matches or exceeds models 5x its size - in under half a second. Built on a 9B flow model with 8B Qwen3 text embedder, step-distilled to 4 inference steps.

Combine multiple input images, blend concepts, and iterate on complex compositions - all at sub-second speed with frontier-level quality. No model this fast has ever done this well.

License: FLUX NCL

Imagine editing collage using FLUX.2 [klein]


FLUX.2 [klein] 4B

Fully open under Apache 2.0. Our most accessible model, it runs on consumer GPUs like the RTX 3090/4070. Compact but capable: supports T2I, I2I, and multi-reference at quality that punches above its size. Built for local development and edge deployment.

License: Apache 2.0
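The ~13GB VRAM figure for the 4B model is consistent with a back-of-envelope weight count: 4B parameters at 2 bytes each (bf16) is roughly 8 GB before the text embedder, activations, and runtime overhead. A minimal sketch; the 2/1/0.5 bytes-per-parameter values are the standard sizes for bf16/FP8/4-bit formats, not numbers from the post:

```python
def weight_gb(params_billion: float, bytes_per_param: float) -> float:
    """Approximate raw weight memory in GB (1 GB = 1e9 bytes)."""
    return params_billion * bytes_per_param

# Raw weight footprint of a 4B model at common precisions; the gap to the
# quoted ~13GB total is the text embedder, activations, and framework overhead.
for name, bpp in [("bf16", 2.0), ("fp8", 1.0), ("4-bit", 0.5)]:
    print(f"4B @ {name}: {weight_gb(4, bpp):.1f} GB")  # 8.0 / 4.0 / 2.0 GB
```

The same arithmetic explains why the 9B model (roughly 18 GB of bf16 weights alone) sits above typical consumer VRAM budgets without quantization.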

FLUX.2 [klein] Base 9B / 4B

The full-capacity foundation models. Undistilled, preserving complete training signal for maximum flexibility. Ideal for fine-tuning, LoRA training, research, and custom pipelines where control matters more than speed. Higher output diversity than the distilled models.

License: 4B Base under Apache 2.0, 9B Base under FLUX NCL

Output Diversity using FLUX.2 [klein]

Quantized versions

We are also releasing FP8 and NVFP4 versions of all [klein] variants, developed in collaboration with NVIDIA for optimized inference on RTX GPUs. Same capabilities, smaller footprint - compatible with even more hardware.

  • FP8: Up to 1.6x faster, up to 40% less VRAM
  • NVFP4: Up to 2.7x faster, up to 55% less VRAM

Benchmarks on RTX 5080/5090, T2I at 1024×1024
Same licenses apply: Apache 2.0 for 4B variants, FLUX NCL for 9B.
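Taken at face value, the quoted factors translate mechanically into latency and memory estimates. A small sketch; the 0.5s/13GB bf16 baseline is illustrative, taken from the post's sub-second and ~13GB claims rather than from a measurement on any specific GPU:

```python
def apply_quantization(latency_s: float, speedup: float,
                       vram_gb: float, vram_saving: float) -> tuple[float, float]:
    """Scale a baseline latency and VRAM figure by quoted quantization gains."""
    return latency_s / speedup, vram_gb * (1.0 - vram_saving)

base_latency, base_vram = 0.5, 13.0  # illustrative bf16 baseline
for name, speedup, saving in [("FP8", 1.6, 0.40), ("NVFP4", 2.7, 0.55)]:
    lat, vram = apply_quantization(base_latency, speedup, base_vram, saving)
    print(f"{name}: ~{lat:.2f}s, ~{vram:.1f} GB")
```

Under these assumptions FP8 lands around 0.31s and ~7.8 GB, and NVFP4 around 0.19s and ~5.9 GB, which is why the quantized 4B variants reach GPUs well below the 13GB line.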


Performance Analysis

FLUX.2 [klein] Elo vs Latency (top) and VRAM (bottom) across Text-to-Image, Image-to-Image Single Reference, and Multi-Reference tasks. FLUX.2 [klein] matches or exceeds Qwen's quality at a fraction of the latency and VRAM, and outperforms Z-Image while supporting both text-to-image generation and (multi-reference) image editing in a unified model. The base variants trade some speed for full customizability and fine-tuning, making them better suited for research and adaptation to specific use cases. Speed is measured on a GB200 in bf16.

Into the New

FLUX.2 [klein] is more than a faster model. It's a step toward our vision of interactive visual intelligence. We believe the future belongs to creators and developers equipped with AI that can see, create, and iterate in real time. Systems like these enable new categories of applications: real-time design tools, agentic visual reasoning, interactive content creation.

Resources

Try it

Build with it

Learn more
