(Comments)

Original link: https://news.ycombinator.com/item?id=43694546

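Cohere has released Embed 4, a new embedding model that commenters describe as potentially best-in-class. User simonw, however, is reluctant to commit to a proprietary, API-only model when strong open-weight alternatives exist. He praises Nomic's approach: its most recent models are available through an API or as open weights for non-commercial use, and its older models are later relicensed under Apache 2.0, so previously computed vectors stay usable in the long term. Another user, lukebuehler, points out how few multimodal embedding options exist, particularly for combined text and images, and calls Embed 4 a welcome addition. He also notes a limitation of Google's model, which supports only 30 text tokens. A third user, moojacob, observes that Embed 4 seems to under-perform voyage-3-large on the same benchmark and questions how useful benchmarks are for embeddings at all. moralestapia finds the model a bit expensive but says the benchmark results look good.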


  • Original text
    Cohere Launches Embed 4 (cohere.com)
    11 points by rekovacs 42 minutes ago | 4 comments

    I have huge respect for Cohere and this embedding model looks like it could be best-in-class, but I find it hard to commit to a proprietary embedding model that's only available via an API when there are such good open weight models available.

    I really like the approach Nomic take: their most recent models are available via their API or as open weights for non-commercial use only (unless you buy a license). They later relicense their older models under Apache 2.0 licenses.

    This gives me confidence that I can continue to use my calculated vectors in the future even if Nomic's model is no longer available because I can run the local one instead.

    Nomic Embed Vision 1.5 for example started out as CC-BY-NC-4.0 but was later relicensed to Apache 2.0: https://www.nomic.ai/blog/posts/nomic-embed-vision



    I just started to look into multi-modal embedding models recently, and I was surprised how few options there are.

    For example, Google's model only supports 30 text tokens [1]!!

    This is definitely a welcome addition.

    Any pointers to similarly powerful embedding models? I'm looking specifically for text and images. I wish there were also one that could do audio and video, but I don't think that exists.

    [1] https://cloud.google.com/vertex-ai/generative-ai/docs/embedd...



    Seems to under-perform voyage-3-large on the same benchmark. At the same time, I'm unsure how useful benchmarks are for embeddings.


    A bit expensive but the benchmarks look quite good!





