Cerebras 代码现已支持 GLM 4.6，速度达 1000 个 token/秒。
Cerebras Code now supports GLM 4.6 at 1000 tokens/sec

原始链接: https://www.cerebras.ai/code

Cerebras是一家最近估值81亿美元的人工智能技术公司，在完成11亿美元融资后，正通过其“Cerebras Code Pro”服务提供访问GLM-4.6的权限，GLM-4.6是一款性能卓越的开源编码模型。 GLM-4.6在代码生成方面表现出色，在工具调用方面取得了领先的分数，并且在Web开发性能方面与Sonnet 4.5相当，速度超过每秒1000个token。它被设计为通过API密钥与流行的AI代码编辑器（如Cline和RooCode）无缝集成。 Cerebras提供三个等级：免费选项（访问受限）、“Pro”等级（50美元/月，用于中等使用量，每天最多2400万token）和“Max”等级（200美元/月，用于繁重开发工作流程，每天最多1.2亿token）。这使得开发者能够在不破坏现有工具和工作流程的情况下，利用强大的AI编码辅助。

Hacker News 新闻 | 过去 | 评论 | 提问 | 展示 | 招聘 | 提交登录 Cerebras Code 现在支持 GLM 4.6，速度达到 1000 tokens/秒 (cerebras.ai) 10 分，nathabonfim59 发表于 3 小时前 | 隐藏 | 过去 | 收藏 | 1 条评论 alyxya 发表于 3 分钟前 [–] 如果该页面能提供更多信息就更好了。我假设这只是输出 token 生成速度。它是否使用推测解码来达到 1000 tokens/秒？是否使用了有损量化来加速？我认为模型每秒生成的 token 数是我关心的事情中相对靠后的，模型/推理质量和利用率在很大程度上影响了我对使用编码代理的感受。回复考虑申请 YC 2026 冬季批次！申请截止日期为 11 月 10 日指南 | 常见问题 | 列表 | API | 安全 | 法律 | 申请 YC | 联系方式搜索：

Cerebras Raises $1.1B Series G at $8.1B Valuation: Read the Press Release

now upgraded with glm 4.6

THE FASTEST WAY TO CODE WITH AI

Stop waiting on your model. Cerebras runs GLM 4.6 — the best-in-class model for code generation, at 1,000 tokens+ per second — so you can stay in flow.

Try Now

State of the Art Frontier Model

GLM-4.6 is one of the world’s top open coding models: #1 for tool calling on the Berkeley Function Calling Leaderboard and on par with Sonnet 4.5 in web-dev performance.

Bring Your Own AI Code Editor

Use Cerebras Code Pro with any AI-friendly editor or agent that accepts your API key. Works out of the box with Cline, RooCode, OpenCode, Crush, and more. Integrate instantly and code without switching tools.

Free

$0

GLM 4.6 access with limited tokens and requests.

Great for trying out Cerebras inference or building a small demo in your favorite AI Code Editor.

Coming Soon

Pro

$50

GLM4.6 access with fast, high-context completions. Send up to 24million tokens per day, enough for 3–4 hours of uninterrupted vibe coding.

Ideal for indie devs, simple agentic workflows, and weekend projects.

get started

Max

$200

GLM4.6 access for heavy coding workflows. Send up to 120 million tokens/day.

Ideal for full-time development, IDE integrations, code refactoring, and multi-agent systems.

get started

Follow

Get Updates

Newsletter signup

Company

News

Insights

[email protected]

1237 E. Arques Ave  Sunnyvale, CA 94085

Cerebras 代码现已支持 GLM 4.6，速度达 1000 个 token/秒。 Cerebras Code now supports GLM 4.6 at 1000 tokens/sec