(Comments)

Original link: https://news.ycombinator.com/item?id=43474112

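Hacker News users are discussing OpenAI's new GPT-4o image generation capability. Initial reactions are mixed: one commenter was unimpressed by the first results and pointed to garbled text in the generated images, and another said the output looks like what FLUX would produce with a language model attached to enhance the prompt. Others noted that generation is slow, estimated at roughly 30 seconds per image, and speculated that it generates and decodes image tokens in the manner of the original DALL-E rather than using diffusion. By contrast, Google's Gemini can generate and edit images in seconds. The lack of an API at launch and the expected higher cost per image were also raised as concerns. Some suspected the release was timed to coincide with Google's Gemini 2.5 launch, which one commenter counted as about the fourth such occurrence. One user argued that ChatGPT's new, widely available image generation will severely hurt small AI image generation startups and digital artists by providing a widely accessible "meme generator", and hoped for a free, fast open-weight model to compete with it. Another commenter found the coherence and text quality of the generated images very good, while expecting the showcased examples to be cherry-picked.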

Related articles
  • (Comments) 2025-03-25
  • (Comments) 2025-02-28
  • (Comments) 2023-11-07
  • (Comments) 2024-08-03
  • Flux: an open-source text-to-image model with 12B parameters 2024-08-02

  • Original text
    4o Image Generation (openai.com)
    40 points by meetpateltech 23 minutes ago | 8 comments

    Tried it, the "compise armporressed" and "Pros: made bord reqotons" didn't impress me in the slightest.


    Looks about like what you'd get with FLUX plus some language model attached to enhance your prompt with, e.g., more text


    Flux doesn't do text that well


    OpenAI's livestream of GPT-4o Image Generation shows that it is slowwwwwwwwww (maybe 30 seconds per image, which Sam Altman had to spin as "it's slow but the generated images are worth it"). Instead of using a diffusion approach, it appears to be generating the image tokens and decoding them akin to the original DALL-E (https://openai.com/index/dall-e/), which allows for streaming partial generations from top to bottom. In contrast, Google's Gemini can generate images and make edits in seconds.

    No API yet, and given the slowness I imagine it will cost much more than the $0.03+/image of competitors.



    Did they time it with the Gemini 2.5 launch? https://news.ycombinator.com/item?id=43473489

    Was it public information when Google was going to launch their new models? Interesting timing.



    "Interesting timing" It's like the 4th time by my counting they've done this


    > ChatGPT’s new image generation in GPT‑4o rolls out starting today to Plus, Pro, Team, and Free users as the default image generator in ChatGPT, with access coming soon to Enterprise and Edu. For those who hold a special place in their hearts for DALL·E, it can still be accessed through a dedicated DALL·E GPT.

    > Developers will soon be able to generate images with GPT‑4o via the API, with access rolling out in the next few weeks.

    That's it, folks. Tens of thousands of so-called "AI" image generator startups have been obliterated, taking digital artists with them, all reduced to near zero.

    Now you have a widely accessible meme generator with the name "ChatGPT".

    The last task is for an open weight model that competes against this and is faster and all for free.



    Yep. The coherence and text quality are insanely good. Keen to play with it to find its "mangled hands" style deficiencies, because of course they cherry-picked the best examples.





