Create and edit images with Gemini 2.0 in preview

Original link: https://developers.googleblog.com/en/generate-images-gemini-2-0-flash-preview/

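Google is rolling out a preview of image generation in Gemini 2.0 Flash, which developers can access through the Gemini API in Google AI Studio and Vertex AI under the model name “gemini-2.0-flash-preview-image-generation”. This release gives developers higher rate limits and improved pricing. The upgraded model offers better visual quality, more accurate text rendering, and a significantly lower filter block rate than the experimental version. Key features that have excited developers include recontextualizing products, conversationally editing specific parts of an image, and dynamically creating new product SKUs with text and images. Developers can use the API to build applications that generate images from text prompts; the Python code provided is one example. Google encourages developers to explore these new capabilities and expects further improvements in quality, capabilities, and rate limits.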

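The Hacker News discussion thread focuses on Google's newly released Gemini 2.0 image generation capabilities, with users sharing their experiences and comparing it with other models such as OpenAI's 4o and Midjourney. vunderba's genai-showdown.specr.net offers a test suite for image generation models, though some users found Gemini 2.0's aesthetic quality disappointing. The main points include Gemini 2.0's multimodal nature and speed advantage, along with its shortcomings in maintaining image quality and reproducing specific details or styles. Users suggested improvements to the test site, such as adding prompts that highlight failure modes, for example a specific time on a clock or unusual architectural proportions. The thread also touches on the cost of Gemini 2.0 image generation, potential e-commerce use cases, and concerns about a flood of AI-generated content. Overall, the discussion reflects a mix of excitement and skepticism about the current state of AI image generation.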

Based on the enthusiasm from developers, we are excited to announce that Image Generation capabilities are now available in preview with Gemini 2.0 Flash.

Developers can start integrating conversational image generation and editing with higher rate limits via the Gemini API in Google AI Studio and Vertex AI today using the model name “gemini-2.0-flash-preview-image-generation”.

What's new in Gemini 2.0 Flash image generation

In addition to enabling higher rate limits and pricing, we have also improved the model with:

Better visual quality (vs experimental version)

More accurate text rendering (vs experimental version)

Significantly reduced filter block rates (vs experimental version)

Gemini 2.0 Flash image generation in action

We have loved seeing the community reception of Gemini's image generation capabilities. Here’s a closer look at some of the key functionalities developers have been excited about:

1) Recontextualize products in new environments.

3) Edit specific parts of images conversationally, without changing anything else.

4) Dynamically create new product SKUs with text rendering and image.
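The conversational editing described above could be sketched as a sequence of instructions sent over a single chat session, so that each edit builds on the previous result. This is a minimal sketch, not the announced implementation: run_edit_session is a hypothetical helper, the example prompts are invented, and the commented wiring assumes the google-genai SDK's chat interface (client.chats.create and send_message).

```python
def run_edit_session(chat, instructions):
    """Send each instruction in order over one chat session and collect the replies.

    `chat` is any object with a send_message(text) method. Reusing one session
    is what lets a later instruction refer to an image from an earlier turn.
    """
    responses = []
    for instruction in instructions:
        responses.append(chat.send_message(instruction))
    return responses


# Assumed wiring with the google-genai SDK (needs network access and an API key):
#
#   from google import genai
#   from google.genai import types
#
#   client = genai.Client(api_key="GEMINI_API_KEY")
#   chat = client.chats.create(
#       model="gemini-2.0-flash-preview-image-generation",
#       config=types.GenerateContentConfig(response_modalities=["TEXT", "IMAGE"]),
#   )
#   replies = run_edit_session(chat, [
#       "Generate a product photo of a red ceramic mug on a wooden table.",
#       "Change only the mug color to blue and keep everything else the same.",
#   ])
```

Keeping the helper independent of the SDK means the turn-by-turn flow can be exercised with a stand-in chat object before any API calls are made.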


from google import genai
from google.genai import types

# Replace the placeholder string with your own Gemini API key.
client = genai.Client(api_key="GEMINI_API_KEY")

# Ask for both text and image parts in the response.
response = client.models.generate_content(
    model="gemini-2.0-flash-preview-image-generation",
    contents="Show me how to bake a macaron with images.",
    config=types.GenerateContentConfig(
        response_modalities=["TEXT", "IMAGE"],
    ),
)
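The request above returns a response but does not show what to do with it. As a rough sketch of one way to handle the interleaved output, the helper below separates text from images and writes the images to disk. It assumes each part exposes a text attribute and an inline_data attribute carrying raw bytes in data and a type such as "image/png" in mime_type, which is how the google-genai SDK is understood to represent parts; save_response_parts and the file naming scheme are hypothetical.

```python
from pathlib import Path


def save_response_parts(parts, out_dir="."):
    """Collect text parts and write image parts to out_dir.

    Returns (texts, image_paths). Assumes each part has `text` and
    `inline_data` attributes, with `inline_data.data` holding raw bytes.
    """
    texts, image_paths = [], []
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for part in parts:
        text = getattr(part, "text", None)
        inline = getattr(part, "inline_data", None)
        if text:
            texts.append(text)
        elif inline is not None and inline.data:
            # Derive the file extension from the MIME type, e.g. "image/png" -> "png".
            mime = getattr(inline, "mime_type", None) or ""
            ext = mime.split("/")[-1] or "bin"
            path = out / f"image_{len(image_paths)}.{ext}"
            path.write_bytes(inline.data)
            image_paths.append(path)
    return texts, image_paths
```

With the response from the request above, a call might look like save_response_parts(response.candidates[0].content.parts, "macaron_steps"), which would leave the recipe text in the first return value and the step images in the macaron_steps directory.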

You can read more about image generation in our API docs. This preview is available for developers to start building through Google AI Studio and Vertex AI.

We look forward to bringing further quality improvements, new capabilities, and expanded rate limits soon. We can’t wait to see what you build with Gemini 2.0 Flash Image Generation.