Nvidia Launches Vera CPU, Purpose-Built for Agentic AI

Original link: https://nvidianews.nvidia.com/news/nvidia-launches-vera-cpu-purpose-built-for-agentic-ai

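## NVIDIA Launches Vera CPU for the Age of AI Agents

NVIDIA has launched the Vera CPU, a processor designed for the demands of agentic AI and reinforcement learning, promising **twice the efficiency and 50% higher performance** than traditional CPUs. The new CPU is aimed at powering AI models that can plan, interact with data and execute tasks: in essence, AI "factories."

Vera has 88 custom cores and a next-generation low-power memory subsystem, providing the high bandwidth and responsiveness that matter for large-scale AI services such as coding assistants and intelligent agents. It is designed for scalability, with a new rack configuration that supports more than 22,500 concurrent CPU environments.

Major players including **Alibaba, Meta, Oracle Cloud Infrastructure and CoreWeave** are working with NVIDIA to deploy Vera, alongside hardware partners such as **Dell, HPE, Lenovo and Supermicro**. Early tests show significant performance gains in areas such as coding and streaming data, and national laboratories and cloud providers also plan to deploy it.

NVIDIA Vera is now in production and will be available through partners in the second half of this year, with the aim of broadening access to advanced AI capabilities.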

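## Nvidia Launches Vera CPU for AI: Summary

Nvidia has launched the Vera CPU, designed for demanding AI workloads, prompting discussion about its novelty and target use cases. Although it is marketed for "agentic AI," many commenters noted that it is more accurately described as a CPU optimized for AI *clusters*, building on earlier designs such as the Grace CPU with improved performance and bandwidth.

Key features include significantly lower latency than traditional CPUs (EPYC/Xeon), which is critical for streaming AI agents, and a high-bandwidth interconnect between CPU and GPU. The chip has 88 Arm v9 cores and hardware FP8 support.

The discussion centered on whether Vera represents a genuine leap or merely clever marketing. Some highlighted the impressive engineering, while others questioned the focus on AI-specific hardware and its potential effect on general-purpose computing. Commenters also raised concerns about Nvidia's broader ethical impact and the growing cost and specialization of AI infrastructure. Ultimately, Vera appears to be intended for large-scale data center deployments rather than individual consumers.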

Original article

NVIDIA Vera CPU Delivers the Highest Performance and Energy Efficiency for Data Processing, AI Training and Agentic Inference at Scale

News Summary:

  • The NVIDIA Vera CPU delivers results 50% faster and with twice the efficiency of traditional rack-scale CPUs.
  • Customers collaborating with NVIDIA to deploy Vera CPU include Alibaba, ByteDance, Meta and Oracle Cloud Infrastructure, along with CoreWeave, Lambda, Nebius and Nscale.
  • Manufacturing partners already adopting the Vera CPU include Dell Technologies, HPE, Lenovo and Supermicro, along with ASUS, Compal, Foxconn, GIGABYTE, Pegatron, Quanta Cloud Technology (QCT), Wistron and Wiwynn.

GTC—NVIDIA today launched the NVIDIA Vera CPU, the world’s first processor purpose-built for the age of agentic AI and reinforcement learning — delivering results with twice the efficiency and 50% faster than traditional rack-scale CPUs.

As reasoning and agentic AI advances, scale, performance and cost are increasingly driven by the infrastructure supporting the models that plan tasks, run tools, interact with data, run code and validate results.

The NVIDIA Vera CPU builds on the success of the NVIDIA Grace™ CPU, enabling organizations of all sizes and across industries to build AI factories that unlock agentic AI at scale. With the highest single-thread performance and bandwidth per core, Vera is a new class of CPU that delivers higher AI throughput, responsiveness and efficiency for large-scale AI services such as coding assistants, as well as consumer and enterprise agents.

Leading hyperscalers collaborating with NVIDIA to deploy Vera include Alibaba, CoreWeave, Meta and Oracle Cloud Infrastructure, as well as global system makers Dell Technologies, HPE, Lenovo, Supermicro and others. This broad adoption establishes Vera as the new CPU standard for the AI workloads that matter most for developers, startups, public-private institutions and enterprises — helping democratize access to AI and accelerating innovation.

“Vera is arriving at a turning point for AI. As intelligence becomes agentic — capable of reasoning and acting — the importance of the systems orchestrating that work is elevated,” said Jensen Huang, founder and CEO of NVIDIA. “The CPU is no longer simply supporting the model; it’s driving it. With breakthrough performance and energy efficiency, Vera unlocks AI systems that think faster and scale further.”

Configurable for Every Data Center
NVIDIA announced a new Vera CPU rack integrating 256 liquid-cooled Vera CPUs to sustain more than 22,500 concurrent CPU environments, each running independently at full performance. AI factories can quickly deploy and scale to tens of thousands of simultaneous instances and agentic tools in a single rack.

The new Vera rack is built using the NVIDIA MGX™ modular reference architecture, supported by 80 ecosystem partners worldwide.

As part of the NVIDIA Vera Rubin NVL72 platform, Vera CPUs are paired with NVIDIA GPUs through NVIDIA NVLink™-C2C interconnect technology, with 1.8 TB/s of coherent bandwidth — 7x the bandwidth of PCIe Gen 6 — for high-speed data sharing between CPUs and GPUs. Additionally, NVIDIA introduced new reference designs that use Vera as the host CPU for NVIDIA HGX™ Rubin NVL8 systems, coordinating data movement and system control for GPU-accelerated workloads.
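The "7x the bandwidth of PCIe Gen 6" comparison can be sanity-checked with a little arithmetic. The sketch below is only illustrative: the announcement does not say which PCIe configuration it uses as the baseline, so the x16 link width and the choice to count both directions are assumptions made here, and the variable names are hypothetical.

```python
# Figures quoted in the announcement.
NVLINK_C2C_GBPS = 1800   # 1.8 TB/s of coherent bandwidth
QUOTED_SPEEDUP = 7       # "7x the bandwidth of PCIe Gen 6"

# Assumption (not from the announcement): a PCIe 6.0 x16 link at
# 64 GT/s per lane, counting raw bandwidth in both directions.
PCIE6_GT_PER_LANE = 64
PCIE6_LANES = 16


def pcie6_x16_bidirectional_gbps() -> float:
    """Raw PCIe 6.0 x16 bandwidth in GB/s, summed over both directions."""
    per_direction = PCIE6_GT_PER_LANE * PCIE6_LANES / 8  # Gb/s -> GB/s
    return 2 * per_direction


# Baseline implied by the quoted figures versus the assumed x16 link.
implied_baseline = NVLINK_C2C_GBPS / QUOTED_SPEEDUP
assumed_baseline = pcie6_x16_bidirectional_gbps()
ratio = NVLINK_C2C_GBPS / assumed_baseline

print(f"Implied PCIe Gen 6 baseline: {implied_baseline:.0f} GB/s")
print(f"Assumed x16 bidirectional:   {assumed_baseline:.0f} GB/s")
print(f"NVLink-C2C / assumed PCIe:   {ratio:.2f}x")
```

Under these assumptions the implied baseline (about 257 GB/s) lines up with a raw bidirectional x16 link (256 GB/s), which suggests, but does not confirm, that this is the comparison behind the 7x figure.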

Vera systems partners are providing both dual and single-socket CPU server configurations, optimal for workloads such as reinforcement learning, agentic inference, data processing, orchestration, storage management, cloud applications and high-performance computing.

Across all configurations, Vera systems integrate NVIDIA ConnectX® SuperNIC cards and NVIDIA BlueField®-4 DPUs for accelerated networking, storage and security, which are critical for agentic AI. This enables customers to optimize for their specific workloads while maintaining a single software stack across the NVIDIA platform.

Designed for Agentic Scaling
By combining high-performance, energy-efficient CPU cores, a high-bandwidth memory subsystem and the second-generation NVIDIA Scalable Coherency Fabric, Vera enables faster agentic responses under the extreme utilization conditions common for agentic AI and reinforcement learning.

Vera features 88 custom NVIDIA-designed Olympus cores, delivering high performance for compilers, runtime engines, analytics pipelines, agentic tooling and orchestration services. Each core can run two tasks, using NVIDIA Spatial Multithreading, to deliver consistent, predictable performance — ideal for multi-tenant AI factories running many jobs at once.
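The per-CPU core count can be combined with the 256-CPU rack described earlier to see where the "more than 22,500 concurrent CPU environments" figure may come from. This is a back-of-the-envelope sketch: mapping one environment to one physical core is an assumption made here, not something the announcement states, and the function name is hypothetical.

```python
# Figures quoted in the announcement.
CPUS_PER_RACK = 256          # liquid-cooled Vera CPUs per rack
CORES_PER_CPU = 88           # Olympus cores per CPU
TASKS_PER_CORE = 2           # Spatial Multithreading: two tasks per core
QUOTED_ENVIRONMENTS = 22_500 # "more than 22,500 concurrent CPU environments"


def rack_totals(cpus: int = CPUS_PER_RACK,
                cores: int = CORES_PER_CPU,
                tasks: int = TASKS_PER_CORE) -> dict:
    """Total physical cores and hardware threads in one rack."""
    total_cores = cpus * cores
    return {"cores": total_cores, "hardware_threads": total_cores * tasks}


totals = rack_totals()
print(totals)

# Assumption: one environment per physical core. If that holds, the core
# count alone already exceeds the quoted environment figure.
print(totals["cores"] > QUOTED_ENVIRONMENTS)
```

The rack works out to 22,528 physical cores, just above the quoted 22,500, so one environment per core is a plausible reading of the claim. The 45,056 hardware threads would then be headroom rather than the basis of the number.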

To further enhance energy efficiency, Vera introduces the second generation of NVIDIA’s low-power memory subsystem, now built on LPDDR5X memory and delivering up to 1.2 TB/s of bandwidth — twice the bandwidth and at half the power compared with general-purpose CPUs.
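To put the memory figure in per-core terms, the quoted peak bandwidth can be divided across the 88 cores. This is a rough sketch that assumes the 1.2 TB/s figure applies to a single CPU and is shared evenly, which real workloads rarely do; the helper name is hypothetical.

```python
# Figures quoted in the announcement.
MEMORY_BANDWIDTH_GBPS = 1200  # "up to 1.2 TB/s" (assumed to be per CPU)
CORES_PER_CPU = 88
TASKS_PER_CORE = 2


def even_share(total_gbps: float, consumers: int) -> float:
    """Peak bandwidth per consumer if it were split perfectly evenly."""
    return total_gbps / consumers


per_core = even_share(MEMORY_BANDWIDTH_GBPS, CORES_PER_CPU)
per_task = even_share(MEMORY_BANDWIDTH_GBPS, CORES_PER_CPU * TASKS_PER_CORE)

print(f"Per core: {per_core:.1f} GB/s")
print(f"Per task: {per_task:.1f} GB/s")
```

These are upper bounds derived from a peak figure, useful for comparing against other CPUs' bandwidth-per-core ratios rather than for predicting application performance.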

Widespread Ecosystem Support
Cursor, an innovator in AI-native software development, is adopting NVIDIA Vera to boost performance for its AI coding agents.

“We’re excited to use NVIDIA Vera CPUs to improve overall throughput and efficiency so we can deliver faster, more responsive coding agent experiences for our customers,” said Michael Truell, cofounder and CEO of Cursor. 

Redpanda, a leading streaming data and AI platform, is using Vera to dramatically boost performance.

“Redpanda recently tested NVIDIA Vera running Apache Kafka-compatible workloads and saw dramatically better performance than other systems we’ve benchmarked, delivering up to 5.5x lower latency,” said Alex Gallego, founder and CEO of Redpanda. “Vera represents a new direction in CPU architecture, with more memory and less overhead per core, enabling our customers to scale real-time streaming workloads further than ever and unlock new AI and agentic applications.”

National laboratories planning to deploy Vera CPUs include Leibniz Supercomputing Centre, Los Alamos National Laboratory, Lawrence Berkeley National Laboratory's National Energy Research Scientific Computing Center and the Texas Advanced Computing Center (TACC).

“At TACC, we recently tested NVIDIA’s Vera CPU platform as we prepare for deployment in our upcoming Horizon system — and running six of our scientific applications, we saw impressive early results,” said John Cazes, director of high-performance computing at TACC. “Vera’s per-core performance and memory bandwidth represent a giant step forward for scientific computing, and we look forward to bringing Vera-based nodes to our CPU users on Horizon later this year.”

Leading cloud service providers planning to deploy Vera CPUs include Alibaba, ByteDance, Cloudflare, CoreWeave, Crusoe, Lambda, Nebius, Nscale, Oracle Cloud Infrastructure, Together.AI and Vultr.

Leading infrastructure providers adopting Vera CPUs include Aivres, ASRock Rack, ASUS, Compal, Cisco, Dell, Foxconn, GIGABYTE, HPE, Hyve, Inventec, Lenovo, MiTAC, MSI, Pegatron, Quanta Cloud Technology (QCT), Supermicro, Wistron and Wiwynn.

Availability
NVIDIA Vera is in full production and will be available from partners in the second half of this year.

Watch the GTC keynote from Huang and explore sessions.
