Nvidia Launches Vera CPU, Purpose-Built for Agentic AI

原始链接: https://nvidianews.nvidia.com/news/nvidia-launches-vera-cpu-purpose-built-for-agentic-ai

## NVIDIA Launches Vera CPU for the Age of Agentic AI

NVIDIA has launched the Vera CPU, a processor designed for the demands of agentic AI and reinforcement learning, promising **twice the efficiency and 50% higher performance** than traditional CPUs. The new CPU focuses on powering AI models that can plan, interact with data and execute tasks: in essence, AI "factories."

Vera features 88 custom cores and a next-generation low-power memory subsystem, delivering the high bandwidth and responsiveness critical for large-scale AI services such as coding assistants and intelligent agents. It is built for scalability, with new rack configurations supporting thousands of concurrent CPU environments.

Major players including **Alibaba, Meta, Oracle Cloud Infrastructure and CoreWeave** have already begun adopting Vera, alongside hardware partners such as **Dell, HPE, Lenovo and Supermicro**. Early testing shows notable performance gains in areas such as coding and streaming data, with deployments also planned by national laboratories and cloud providers.

NVIDIA Vera is now in production and will be available through partners in the second half of this year, aiming to democratize access to advanced AI capabilities.

NVIDIA recently launched the Vera CPU, designed specifically for "agentic AI" workloads. The announcement, however, sparked discussion on Hacker News about its architecture and connectivity. While the Vera CPU boasts impressive bandwidth, reportedly 7x that of PCIe Gen 6, commenters pointed out that data still has to travel *through* PCIe and then over the network (even fast 800GbE) to reach other systems, which introduces latency and overhead. The conversation voiced a desire for more direct chip-to-chip communication fabrics, citing Google's high-fanout systems and Microsoft's Maia 200 (with 2.8 Tbps of on-chip Ethernet) as examples. Some hope that technologies like CXL can reduce interconnect overhead and latency, simplifying communication for bandwidth-intensive applications beyond just AI. Many large AI clusters already use InfiniBand networking, rather than standard PCIe-attached Ethernet, for inter-node communication.
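The gap the commenters describe can be roughed out with a quick transfer-time estimate. The link rates below are assumptions for illustration only: NVLink-C2C at the 1.8 TB/s quoted in the press release, PCIe Gen 6 x16 at roughly 256 GB/s of aggregate bandwidth, and 800GbE at its 100 GB/s line rate. Protocol overhead and latency, which are the commenters' real concern, are ignored here.

```python
# Rough time to move a 10 GB payload over each link type.
# Rates are illustrative assumptions: 1.8 TB/s NVLink-C2C (from the
# release), ~256 GB/s for a PCIe Gen 6 x16 link, and 100 GB/s for
# 800GbE line rate. Protocol overhead and latency are ignored.
links_gb_per_s = {
    "NVLink-C2C": 1800,
    "PCIe Gen 6 x16": 256,
    "800GbE": 100,
}
payload_gb = 10

for name, rate in links_gb_per_s.items():
    ms = payload_gb / rate * 1000
    print(f"{name}: {ms:.1f} ms")
```

Even this crude sketch shows an order-of-magnitude spread between on-package and network links, which is why the discussion centered on avoiding extra hops rather than raw link speed alone.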

Original article

NVIDIA Vera CPU Delivers the Highest Performance and Energy Efficiency for Data Processing, AI Training and Agentic Inference at Scale

News Summary:

  • The NVIDIA Vera CPU delivers twice the efficiency of traditional CPUs and 50% higher performance.
  • Customers collaborating with NVIDIA to deploy Vera CPU include Alibaba, ByteDance, Meta and Oracle Cloud Infrastructure, along with CoreWeave, Lambda, Nebius and Nscale.
  • Manufacturing partners already adopting the Vera CPU include Dell Technologies, HPE, Lenovo and Supermicro, along with ASUS, Compal, Foxconn, GIGABYTE, Pegatron, Quanta Cloud Technology (QCT), Wistron and Wiwynn.

GTC—NVIDIA today launched the NVIDIA Vera CPU, the world’s first processor purpose-built for the age of agentic AI and reinforcement learning — delivering twice the efficiency of traditional rack-scale CPUs and 50% higher performance.

As reasoning and agentic AI advances, scale, performance and cost are increasingly driven by the infrastructure supporting the models that plan tasks, run tools, interact with data, run code and validate results.

The NVIDIA Vera CPU builds on the success of the NVIDIA Grace™ CPU, enabling organizations of all sizes and across industries to build AI factories that unlock agentic AI at scale. With the highest single-thread performance and bandwidth per core, Vera is a new class of CPU that delivers higher AI throughput, responsiveness and efficiency for large-scale AI services such as coding assistants, as well as consumer and enterprise agents.

Leading hyperscalers collaborating with NVIDIA to deploy Vera include Alibaba, CoreWeave, Meta and Oracle Cloud Infrastructure, as well as global system makers Dell Technologies, HPE, Lenovo, Supermicro and others. This broad adoption establishes Vera as the new CPU standard for the AI workloads that matter most for developers, startups, public-private institutions and enterprises — helping democratize access to AI and accelerating innovation.

“Vera is arriving at a turning point for AI. As intelligence becomes agentic — capable of reasoning and acting — the importance of the systems orchestrating that work is elevated,” said Jensen Huang, founder and CEO of NVIDIA. “The CPU is no longer simply supporting the model; it’s driving it. With breakthrough performance and energy efficiency, Vera unlocks AI systems that think faster and scale further.”

Configurable for Every Data Center
NVIDIA announced a new Vera CPU rack integrating 256 liquid-cooled Vera CPUs to sustain more than 22,500 concurrent CPU environments, each running independently at full performance. AI factories can quickly deploy and scale to tens of thousands of simultaneous instances and agentic tools in a single rack.
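The "more than 22,500 concurrent CPU environments" figure can be checked against the other numbers in the release. Assuming one environment per core (an assumption; the release does not state the mapping explicitly), 256 CPUs at 88 cores each lands just above that threshold:

```python
# Back-of-the-envelope check of the rack figures quoted in the release:
# 256 liquid-cooled Vera CPUs per rack, 88 Olympus cores per CPU.
# Mapping one environment per core is an assumption for illustration.
CPUS_PER_RACK = 256
CORES_PER_CPU = 88

environments = CPUS_PER_RACK * CORES_PER_CPU
print(environments)  # 22528, consistent with "more than 22,500"
```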

The new Vera rack is built using the NVIDIA MGX™ modular reference architecture, supported by 80 ecosystem partners worldwide.

As part of the NVIDIA Vera Rubin NVL72 platform, Vera CPUs are paired with NVIDIA GPUs through NVIDIA NVLink™-C2C interconnect technology, with 1.8 TB/s of coherent bandwidth — 7x the bandwidth of PCIe Gen 6 — for high-speed data sharing between CPUs and GPUs. Additionally, NVIDIA introduced new reference designs that use Vera as the host CPU for NVIDIA HGX™ Rubin NVL8 systems, coordinating data movement and system control for GPU-accelerated workloads.
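The "7x the bandwidth of PCIe Gen 6" claim is consistent with the quoted 1.8 TB/s figure if the comparison baseline is a Gen 6 x16 link at roughly 256 GB/s of aggregate (bidirectional) bandwidth. That baseline is an assumption here, since the release does not specify the lane count:

```python
# Sanity-check the "7x PCIe Gen 6" claim against the 1.8 TB/s
# NVLink-C2C figure from the release. The PCIe baseline is an
# assumption: a Gen 6 x16 link at ~256 GB/s bidirectional aggregate.
NVLINK_C2C_GB_S = 1800    # 1.8 TB/s, from the release
PCIE_GEN6_X16_GB_S = 256  # assumed baseline

ratio = NVLINK_C2C_GB_S / PCIE_GEN6_X16_GB_S
print(round(ratio, 1))  # ~7.0
```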

Vera systems partners are providing both dual and single-socket CPU server configurations, optimal for workloads such as reinforcement learning, agentic inference, data processing, orchestration, storage management, cloud applications and high-performance computing.

Across all configurations, Vera systems integrate NVIDIA ConnectX® SuperNIC cards and NVIDIA BlueField®-4 DPUs for accelerated networking, storage and security, which are critical for agentic AI. This enables customers to optimize for their specific workloads while maintaining a single software stack across the NVIDIA platform.

Designed for Agentic Scaling
By combining high-performance, energy-efficient CPU cores, a high-bandwidth memory subsystem and the second-generation NVIDIA Scalable Coherency Fabric, Vera enables faster agentic responses under the extreme utilization conditions common for agentic AI and reinforcement learning.

Vera features 88 custom NVIDIA-designed Olympus cores, delivering high performance for compilers, runtime engines, analytics pipelines, agentic tooling and orchestration services. Each core can run two tasks, using NVIDIA Spatial Multithreading, to deliver consistent, predictable performance — ideal for multi-tenant AI factories running many jobs at once.

To further enhance energy efficiency, Vera introduces the second generation of NVIDIA’s low-power memory subsystem, now built on LPDDR5X memory and delivering up to 1.2 TB/s of bandwidth — twice the bandwidth and at half the power compared with general-purpose CPUs.
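Taken together, "twice the bandwidth and at half the power" implies roughly a 4x improvement in bandwidth per watt over the general-purpose baseline, since the two ratios compound:

```python
# Combining the two quoted ratios: 2x bandwidth at 0.5x power
# implies bandwidth-per-watt improves by their quotient.
bandwidth_ratio = 2.0  # vs. general-purpose CPUs, from the release
power_ratio = 0.5      # vs. general-purpose CPUs, from the release

efficiency_gain = bandwidth_ratio / power_ratio
print(efficiency_gain)  # 4.0x bandwidth per watt
```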

Widespread Ecosystem Support
Cursor, an innovator in AI-native software development, is adopting NVIDIA Vera to boost performance for its AI coding agents.

“We’re excited to use NVIDIA Vera CPUs to improve overall throughput and efficiency so we can deliver faster, more responsive coding agent experiences for our customers,” said Michael Truell, cofounder and CEO of Cursor. 

Redpanda, a leading streaming data and AI platform, is using Vera to dramatically boost performance.

“Redpanda recently tested NVIDIA Vera running Apache Kafka-compatible workloads and saw dramatically better performance than other systems we’ve benchmarked, delivering up to 5.5x lower latency,” said Alex Gallego, founder and CEO of Redpanda. “Vera represents a new direction in CPU architecture, with more memory and less overhead per core, enabling our customers to scale real-time streaming workloads further than ever and unlock new AI and agentic applications.”

National laboratories planning to deploy Vera CPUs include Leibniz Supercomputing Centre, Los Alamos National Laboratory, Lawrence Berkeley National Laboratory's National Energy Research Scientific Computing Center and the Texas Advanced Computing Center (TACC).

“At TACC, we recently tested NVIDIA’s Vera CPU platform as we prepare for deployment in our upcoming Horizon system — and running six of our scientific applications, we saw impressive early results,” said John Cazes, director of high-performance computing at TACC. “Vera’s per-core performance and memory bandwidth represent a giant step forward for scientific computing, and we look forward to bringing Vera-based nodes to our CPU users on Horizon later this year.”

Leading cloud service providers planning to deploy Vera CPUs include Alibaba, ByteDance, Cloudflare, CoreWeave, Crusoe, Lambda, Nebius, Nscale, Oracle Cloud Infrastructure, Together.AI and Vultr.

Leading infrastructure providers adopting Vera CPUs include Aivres, ASRock Rack, ASUS, Compal, Cisco, Dell, Foxconn, GIGABYTE, HPE, Hyve, Inventec, Lenovo, MiTAC, MSI, Pegatron, Quanta Cloud Technology (QCT), Supermicro, Wistron and Wiwynn.

Availability
NVIDIA Vera is in full production and will be available from partners in the second half of this year.

Watch the GTC keynote from Huang and explore sessions.
