The Nvidia Vera CPU is a purpose-built AI data center processor launched at CES 2026, featuring 88 custom Olympus Arm cores designed specifically for agentic AI workloads rather than general-purpose server computing. This is not Nvidia trying to beat AMD’s EPYC or Intel’s Xeon at their own game. It’s something more specific, and arguably more interesting.
TL;DR: The Nvidia Vera CPU brings 88 custom Olympus cores, 1.2 TB/s memory bandwidth, and a 256-chip rack configuration delivering up to 6x CPU throughput gains. It replaces the Grace CPU and is built for agentic AI pipelines, not traditional server workloads. Think AI factory control plane, not general-purpose compute.
What the Nvidia Vera CPU actually is
The Nvidia Vera CPU is not a general-purpose server processor. It’s a control-plane chip built to manage and orchestrate AI factory workloads — handling data preparation, KV-cache management, compiler operations, and agentic AI pipelines that increasingly define how modern AI infrastructure runs. That distinction matters enormously when evaluating it against AMD and Intel alternatives.
Under the hood, Vera runs 88 custom Olympus cores built on full Armv9.2 compatibility, manufactured on TSMC’s 3nm process. Those 88 cores deliver 176 threads through a technique Nvidia calls Spatial Multithreading — physical resource partitioning rather than the time-slicing approach used in conventional simultaneous multithreading. The result is more consistent, predictable performance in multi-tenant environments where dozens of AI agents may be competing for resources simultaneously.
Memory is where Vera pulls ahead of most comparable architectures. The chip supports up to 1.5 TB of LPDDR5X memory with 1.2 TB/s of bandwidth, and it does this while consuming less than 50 watts of memory power — a figure that will matter to data center operators watching their power bills climb.
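To put those figures in perspective, a back-of-the-envelope calculation (using only the numbers quoted above; real-world throughput will sit below these peaks) shows what they imply:

```python
# Back-of-the-envelope math on the quoted Vera memory figures.
# These are peak/spec numbers from the article, not measured throughput.
capacity_tb = 1.5        # LPDDR5X capacity, TB
bandwidth_tb_s = 1.2     # peak memory bandwidth, TB/s
memory_power_w = 50      # quoted memory-subsystem power ceiling, W

# Time to sweep the entire memory pool once at peak bandwidth.
full_sweep_s = capacity_tb / bandwidth_tb_s

# Peak bytes moved per joule spent on the memory subsystem.
bytes_per_joule = (bandwidth_tb_s * 1e12) / memory_power_w

print(f"Full memory sweep at peak bandwidth: {full_sweep_s:.2f} s")
print(f"Peak bytes moved per joule of memory power: {bytes_per_joule:.1e}")
```

At peak, the chip could read its entire 1.5 TB pool in about 1.25 seconds while the memory subsystem stays under 50 W, which is the efficiency story Nvidia is selling to operators.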
How Vera connects to Nvidia’s GPU stack
The Nvidia Vera CPU’s most strategically important feature is its NVLink-C2C connection to Nvidia’s Rubin GPUs, delivering 1.8 TB/s of coherent bandwidth — seven times faster than PCIe Gen 6. That coherence link means the HBM4 on the GPU and the LPDDR5X on the CPU function as a logically unified memory pool, eliminating the traditional bottleneck between host and accelerator memory.
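The "seven times faster" figure is easy to sanity-check, assuming the baseline is a PCIe 6.0 x16 link at roughly 256 GB/s bidirectional (an assumption on my part; Nvidia does not spell out the exact baseline):

```python
# Sanity-check the "seven times faster than PCIe Gen 6" claim.
# Baseline assumption: a PCIe 6.0 x16 link, ~128 GB/s each way,
# ~256 GB/s bidirectional total.
nvlink_c2c_gb_s = 1800   # 1.8 TB/s coherent NVLink-C2C bandwidth
pcie6_x16_gb_s = 256     # assumed PCIe 6.0 x16 bidirectional bandwidth

ratio = nvlink_c2c_gb_s / pcie6_x16_gb_s
print(f"NVLink-C2C vs PCIe 6.0 x16: {ratio:.1f}x")
```

Under that assumption the ratio lands almost exactly at 7x, so the claim is internally consistent with a x16 Gen 6 baseline.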
This is where the AMD and Intel comparison gets complicated. Both companies offer competitive server CPUs with strong memory bandwidth and PCIe connectivity. But neither has an equivalent coherent interconnect tying their CPUs directly to a GPU ecosystem at this bandwidth level. Vera’s advantage isn’t core count or raw clock speed — it’s the depth of integration with Nvidia’s own accelerator stack. For shops already committed to Nvidia’s AI infrastructure, that integration is a genuine lock-in advantage. For everyone else, it’s a reason to think carefully before switching.
The 256-chip rack configuration and what it enables
The Vera CPU rack packs 256 liquid-cooled Vera CPUs into a single unit, sustaining more than 22,500 concurrent CPU environments and delivering up to a 6x gain in CPU throughput compared to previous configurations. That number needs context — the 6x figure applies to specific workload comparisons, not every CPU task universally. But for the agentic AI use cases Nvidia is targeting, the density is genuinely striking.
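A quick division shows how the rack-level numbers decompose per chip, and the result is suggestive — roughly one environment per physical core:

```python
# Density check on the quoted rack figures: 22,500 concurrent CPU
# environments across 256 Vera chips, each with 88 Olympus cores.
environments = 22_500
chips_per_rack = 256
cores_per_chip = 88

per_chip = environments / chips_per_rack   # environments per Vera chip
per_core = per_chip / cores_per_chip       # environments per physical core

print(f"~{per_chip:.0f} environments per chip, ~{per_core:.2f} per core")
```

That works out to about 88 environments per chip — almost exactly one per core — which suggests the 22,500 figure is a core-count-derived ceiling rather than an oversubscription claim.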
A single rack running 22,500 concurrent CPU environments means a data center operator can spin up thousands of simultaneous AI agent instances without the sprawl of traditional server infrastructure. Cursor, the AI-native software development platform, is already adopting Vera for AI coding agents — an early signal of where this hardware lands in practice. The 80 ecosystem partners supporting Nvidia’s MGX modular reference architecture suggest the deployment story is more mature than a typical launch announcement.
Is the Nvidia Vera CPU a real threat to AMD and Intel?
The Nvidia Vera CPU competes with AMD EPYC and Intel Xeon only in the narrow sense that all three chips can run in data centers. Vera is explicitly a control-plane processor for AI factories, not a replacement for the general-purpose server CPUs that power databases, web servers, and enterprise applications. Calling it a direct competitor overstates the case.
Where Vera does apply real pressure is in the emerging category of AI infrastructure procurement. Data centers building out agentic AI capacity now have a credible Nvidia-native CPU option that pairs tightly with Rubin GPUs, rather than sourcing CPUs separately from AMD or Intel. Vera roughly doubles the performance of the Grace CPU it replaces, which means customers who were already in the Nvidia ecosystem have a compelling upgrade path. For greenfield AI factory builds, the integrated stack — Vera CPU plus Rubin GPU plus NVLink-C2C — is a serious proposition that AMD and Intel can’t yet match with equivalent coherent interconnect depth.
It’s also worth noting that Vera is the first CPU to natively support FP8 precision, a lower-precision format increasingly used in AI inference workloads. That’s a small but meaningful differentiator for teams optimizing inference pipelines.
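For readers unfamiliar with the format, here is an illustrative software sketch of FP8 E4M3 rounding — a simplified model (normal numbers only, saturating at the E4M3 maximum of 448, using Python's built-in round-to-nearest), not Nvidia's hardware implementation:

```python
# Simplified model of FP8 E4M3 quantization: 4 exponent bits, 3 mantissa
# bits, largest finite value 448. Normal numbers only; saturates rather
# than producing infinity. Illustrative only -- not Nvidia's hardware path.
import math

E4M3_MAX = 448.0  # largest finite E4M3 value

def quantize_e4m3(x: float) -> float:
    if x == 0.0:
        return 0.0
    sign = math.copysign(1.0, x)
    mag = min(abs(x), E4M3_MAX)        # saturate out-of-range values
    exp = math.floor(math.log2(mag))   # power-of-two bucket
    step = 2.0 ** (exp - 3)            # 3 mantissa bits -> 8 steps per octave
    return sign * min(round(mag / step) * step, E4M3_MAX)

print(quantize_e4m3(1.1))     # 1.1 is not representable; snaps to 1.125
print(quantize_e4m3(1000.0))  # out of range; saturates to 448.0
```

The coarse step size is why FP8 is paired with careful scaling in inference pipelines: values must be kept near the format's usable range before quantizing.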
What are the power and infrastructure requirements for Vera?
Deploying the Vera Rubin platform at scale requires significant data center infrastructure upgrades. Analyst estimates put the Rubin GPU at roughly 2,300W TDP — nearly double the 1,200W of the Blackwell B200 — though Nvidia has not officially confirmed this figure. Nvidia argues that system-level efficiency improvements offset the raw power increase, but that claim comes without detailed supporting data, and absolute power draw is a real constraint for operators with fixed power budgets.
How does Vera compare to the Grace CPU it replaces?
The Vera CPU roughly doubles the performance of the Grace CPU it replaces, according to Nvidia. Grace was Nvidia’s first serious foray into custom ARM-based CPU design for data centers. Vera builds on that foundation with a new generation of Olympus cores, a second-generation Scalable Coherency Fabric for lower-latency interconnects, and full confidential computing support baked in. The architectural jump is meaningful, not incremental.
Which workloads is the Vera CPU actually designed for?
Vera targets agentic AI pipelines, data preparation, KV-cache management, memory-intensive HPC simulations, compiler and runtime engine operations, analytics pipelines, and orchestration services. It is not optimized for general enterprise workloads like relational databases or traditional web serving. If your infrastructure runs those workloads, AMD EPYC and Intel Xeon remain the more practical choice.
The Nvidia Vera CPU is a sharp, specific piece of silicon — purpose-built for a world where AI agents run continuously at scale inside data centers. It won’t replace AMD or Intel in the general server market, and it was never meant to. But for AI factory operators building around Nvidia’s GPU stack, Vera closes the last gap in a tightly integrated compute architecture. The 256-chip rack, the 22,500 concurrent environments, the NVLink-C2C coherence — these aren’t specs for a CPU trying to be everything. They’re specs for a CPU that knows exactly what it’s for.
This article was written with AI assistance and editorially reviewed.
Source: Tom's Hardware


