Nvidia Vera CPU: Architecture, First Customers, and Its Role in the Rubin NVL72 AI Supercomputer
Nvidia’s Vera CPU is a new Arm‑based data‑center processor built for agentic AI workloads and integrated into the Rubin NVL72 rack with 72 GPUs and 36 CPUs. The chip acts as the orchestration and memory engine for AI systems, handling tasks like agent workflows, retrieval, and reinforcement learning while Rubin GPUs handle heavy tensor computation.
Nvidia’s Vera CPU is the company’s first custom processor designed specifically for modern AI data centers. Introduced alongside the Rubin GPU architecture, Vera serves as the CPU backbone of Nvidia’s next generation of rack‑scale AI systems built for what the company calls the era of agentic AI—systems where AI models run complex tool‑using workflows rather than simple prompt‑response inference.
The chip is tightly integrated into Nvidia’s new Vera Rubin platform, which combines CPUs, GPUs, networking, and data‑processing hardware into unified “AI factory” infrastructure for large‑scale training and inference workloads.
First AI Labs and Cloud Providers to Receive Vera CPUs
The first Vera CPU systems were delivered in May 2026 to several leading AI developers and cloud providers. Nvidia reported that the initial shipments went to:
Anthropic in San Francisco
OpenAI in Mission Bay, San Francisco
SpaceXAI in Palo Alto
Oracle Cloud Infrastructure (OCI) in Santa Clara
According to Nvidia’s blog and newsroom reports, these early units were hand‑delivered by Nvidia VP of hyperscale and HPC Ian Buck as part of initial partner deployments.
These organizations are among the first to experiment with the new CPU as part of next‑generation AI infrastructure built around Rubin GPUs.
What the Vera CPU Is Designed For
Unlike general‑purpose server processors, Vera is built specifically for AI infrastructure workloads that require heavy coordination between CPUs and GPUs.
Modern AI systems increasingly run complex workflows involving:
Agent orchestration and planning loops
Retrieval‑augmented generation and database queries
Reinforcement learning and simulation environments
Tool use, sandboxes, and multi‑step reasoning
These tasks require high single‑thread performance, fast memory access, and extremely fast communication with GPUs—areas Nvidia optimized Vera to handle.
Core Technical Architecture
Public specifications describe Vera as a custom Arm‑based data‑center CPU built with Nvidia‑designed cores and optimized memory bandwidth.
Key architectural features include:
88 custom Armv9.2 “Olympus” cores
176 threads via spatial multithreading
Memory bandwidth of roughly 1.2 TB/s
Coherent NVLink‑C2C connection to GPUs at up to ~1.8 TB/s
These features allow the CPU to act as the memory and orchestration engine for AI systems while Rubin GPUs perform the intensive matrix and tensor operations used for training and inference.
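The quoted figures can be turned into a few per-core ratios. The sketch below is only arithmetic on the numbers listed above; it assumes decimal units (1 TB/s = 1000 GB/s) and treats the approximate bandwidth values as exact. The function and constant names are illustrative and do not come from Nvidia.

```python
# Figures quoted in the specification list; bandwidths are approximate ("~") values.
CORES = 88
THREADS = 176
MEMORY_BW_TBPS = 1.2      # memory bandwidth, TB/s
NVLINK_C2C_BW_TBPS = 1.8  # NVLink-C2C bandwidth to GPUs, TB/s


def derived_ratios() -> dict:
    """Return simple ratios derived from the quoted per-CPU figures."""
    return {
        "threads_per_core": THREADS // CORES,
        # Assumes decimal units: 1 TB/s = 1000 GB/s.
        "memory_gbps_per_core": MEMORY_BW_TBPS * 1000 / CORES,
        "c2c_to_memory_ratio": NVLINK_C2C_BW_TBPS / MEMORY_BW_TBPS,
    }


if __name__ == "__main__":
    for name, value in derived_ratios().items():
        print(f"{name}: {value:.2f}")
```

On these numbers each core exposes two hardware threads and has an even share of about 13.6 GB/s of memory bandwidth, while the NVLink‑C2C link is rated at 1.5 times the memory bandwidth.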
The design continues Nvidia’s strategy of using custom Arm‑based silicon for AI infrastructure, building on the earlier Grace CPU used in the Grace Hopper platform.
Performance Claims for Agentic AI Workloads
Nvidia positions Vera as optimized for workloads that involve coordination between many AI components rather than pure GPU math.
Company benchmarks claim that Vera can deliver:
Up to 2× efficiency compared with traditional rack‑scale CPUs
Around 50% faster performance in some workloads
Up to 50% faster agent sandbox execution
Up to 3× faster enterprise data queries
These figures come from Nvidia’s own benchmarks and have limited independent validation so far, so they should be interpreted as vendor‑supplied claims rather than definitive comparisons.
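When reading "up to" claims like these, it can help to convert a speedup into the share of runtime it would save. The sketch below assumes that "N% faster" means a throughput ratio of 1 + N/100, which is one common reading but is not confirmed by the source. The helper names are hypothetical.

```python
def percent_faster_to_speedup(percent: float) -> float:
    """Interpret 'N% faster' as a throughput ratio of 1 + N/100 (an assumption)."""
    return 1.0 + percent / 100.0


def time_fraction(speedup: float) -> float:
    """Fraction of the baseline runtime that remains after a given speedup."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return 1.0 / speedup


# Best-case vendor claims quoted above, expressed as speedup factors.
claims = {
    "agent sandbox execution (up to 50% faster)": percent_faster_to_speedup(50),
    "enterprise data queries (up to 3x faster)": 3.0,
}

if __name__ == "__main__":
    for label, speedup in claims.items():
        saved = 1.0 - time_fraction(speedup)
        print(f"{label}: at most {saved:.0%} less runtime")
```

Under that reading, a 50% speedup removes at most about a third of the runtime, and a 3× speedup removes at most about two thirds. These are best-case bounds on the claimed workloads only.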
How Vera Fits into the Rubin NVL72 Platform
The Vera CPU is designed primarily to be paired with Rubin GPUs: it forms the CPU half of Nvidia’s Vera Rubin AI platform.
The flagship system is the Vera Rubin NVL72 rack, which integrates:
72 Rubin GPUs
36 Vera CPUs
NVLink 6 switching fabric
ConnectX‑9 SuperNIC networking
BlueField‑4 data processing units (DPUs)
Quantum‑X800 InfiniBand or Spectrum‑X Ethernet networking
Together, these components form a rack‑scale AI supercomputer designed to function as a single system optimized for training, inference, and agentic workloads.
Nvidia also described Vera CPU‑only racks containing up to 256 liquid‑cooled CPUs designed for orchestration and reinforcement‑learning workloads within large AI clusters.
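Rack-level totals follow from multiplying the per-CPU figures by the CPU counts given here. The sketch below is a back-of-the-envelope calculation, assuming every Vera CPU in a rack has the 88 cores and 176 threads quoted earlier and that the CPU-only rack is fully populated at 256 CPUs. The `Rack` class is a hypothetical helper for illustration.

```python
from dataclasses import dataclass

CORES_PER_VERA = 88     # per-CPU core count quoted in the specifications
THREADS_PER_VERA = 176  # per-CPU thread count quoted in the specifications


@dataclass(frozen=True)
class Rack:
    """Hypothetical description of a rack by its CPU and GPU counts."""
    name: str
    cpus: int
    gpus: int = 0

    @property
    def cores(self) -> int:
        return self.cpus * CORES_PER_VERA

    @property
    def threads(self) -> int:
        return self.cpus * THREADS_PER_VERA

    @property
    def gpus_per_cpu(self) -> float:
        return self.gpus / self.cpus


nvl72 = Rack("Vera Rubin NVL72", cpus=36, gpus=72)
cpu_rack = Rack("Vera CPU-only rack (maximum configuration)", cpus=256)

if __name__ == "__main__":
    for rack in (nvl72, cpu_rack):
        print(f"{rack.name}: {rack.cores} cores, {rack.threads} threads, "
              f"{rack.gpus_per_cpu:.1f} GPUs per CPU")
```

Under these assumptions an NVL72 rack holds 3,168 Vera cores (6,336 threads) with two GPUs per CPU, and a fully populated CPU-only rack holds 22,528 cores.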
Why the Vera CPU Matters for Nvidia
For years Nvidia’s position in AI infrastructure rested mainly on its GPUs. But large‑scale AI systems increasingly depend on a full stack of compute, networking, storage, and orchestration hardware.
With Vera, Nvidia is expanding its role from GPU supplier to complete AI infrastructure provider.
Instead of relying on host CPUs such as Intel’s Xeon or AMD’s EPYC, Nvidia can supply the CPU that coordinates its GPU clusters. This lets the company tightly integrate:
CPU scheduling
GPU memory access
networking fabric
software frameworks like CUDA
The strategy reflects a broader industry shift toward rack‑scale AI systems designed as unified platforms, rather than clusters assembled from separate components.
The Competitive Context: AMD and Intel
The launch also positions Nvidia more directly in the server CPU market.
That market is currently dominated by x86 processors from Intel and AMD, though competition has intensified. For example, AMD’s EPYC chips captured more than 33% of server CPU unit share and about 46% of server x86 CPU revenue share in early 2026, according to Mercury Research data reported by Tom’s Hardware.
By introducing its own CPU design for AI systems, Nvidia aims to capture more of the data‑center bill of materials and reduce dependence on external CPU vendors.
The Bigger Picture: AI Infrastructure Becoming Vertical
The Vera CPU launch highlights a larger trend in AI computing: infrastructure is becoming vertically integrated.
Rather than selling standalone chips, Nvidia now packages CPUs, GPUs, networking, DPUs, and software as a complete AI factory platform. The Rubin generation is built around this philosophy, combining multiple custom chips into unified rack‑scale systems designed to power the next wave of AI models and autonomous agents.
If the architecture delivers the performance gains Nvidia claims, Vera could play a key role in shaping how future AI data centers are built—where CPUs, GPUs, networking, and software are designed together rather than assembled from independent components.