Vera signals Nvidia’s strategy to sell full AI‑factory infrastructure rather than just GPUs, bringing it into direct competition with AMD EPYC and Intel Xeon in the data‑center CPU market.

studioglobal

Answers. Published 2 months ago. Last edited last month. 26 sources.

Nvidia Vera CPU: Architecture, First Customers, and Its Role in the Rubin NVL72 AI Supercomputer

Nvidia’s Vera CPU is a new Arm‑based data‑center processor built for agentic AI workloads and integrated into the Rubin NVL72 rack with 72 GPUs and 36 CPUs; first units were delivered to Anthropic, OpenAI, SpaceXAI, a... The chip acts as the orchestration and memory engine for AI systems—handling tasks like agent wo...

Illustration: Nvidia’s Vera CPU is designed to work alongside Rubin GPUs in rack‑scale AI systems built for large‑scale agentic AI workloads.
Nvidia’s Vera CPU is a custom Arm‑based processor designed specifically for modern AI data centers, succeeding the company’s earlier Grace CPU. Introduced alongside the Rubin GPU architecture, Vera serves as the CPU backbone of Nvidia’s next generation of rack‑scale AI systems built for what the company calls the era of agentic AI: systems where AI models run complex tool‑using workflows rather than simple prompt‑response inference.

The chip is tightly integrated into Nvidia’s new Vera Rubin platform, which combines CPUs, GPUs, networking, and data‑processing hardware into unified “AI factory” infrastructure for large‑scale training and inference workloads.

First AI Labs and Cloud Providers to Receive Vera CPUs

The first Vera CPU systems were delivered in May 2026 to several leading AI developers and cloud providers. Nvidia reported that the initial shipments went to:

Anthropic in San Francisco
OpenAI in Mission Bay, San Francisco
SpaceXAI in Palo Alto
Oracle Cloud Infrastructure (OCI) in Santa Clara

According to Nvidia’s blog and newsroom reports, these early units were hand‑delivered by Nvidia VP of hyperscale and HPC Ian Buck as part of initial partner deployments.

These organizations are among the first to experiment with the new CPU as part of next‑generation AI infrastructure built around Rubin GPUs.

What the Vera CPU Is Designed For

Unlike general‑purpose server processors, Vera is built specifically for AI infrastructure workloads that require heavy coordination between CPUs and GPUs.

Modern AI systems increasingly run complex workflows involving:

Agent orchestration and planning loops
Retrieval‑augmented generation and database queries
Reinforcement learning and simulation environments
Tool use, sandboxes, and multi‑step reasoning

These tasks require high single‑thread performance, fast memory access, and extremely fast communication with GPUs—areas Nvidia optimized Vera to handle.
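To make the division of labor concrete, here is a minimal Python sketch of an agent loop. It is purely illustrative and is not Nvidia software: `fake_model_call`, `run_tool`, and `agent_loop` are hypothetical names, and the model call is a stub standing in for GPU‑served inference. The point is only that everything between model calls (parsing actions, running tools, rebuilding the prompt) is host‑CPU work.

```python
import time


def fake_model_call(prompt: str) -> str:
    """Hypothetical stand-in for a GPU-served model call."""
    # Ask for one tool call, then stop once a tool result is in the prompt.
    return "search:vera cpu" if "result" not in prompt else "done"


def run_tool(action: str) -> str:
    """Hypothetical stand-in for host-side tool execution."""
    name, _, arg = action.partition(":")
    return f"result of {name}({arg})"


def agent_loop(task: str, max_steps: int = 5) -> dict:
    """Alternate model calls with host-side tool steps and time the latter."""
    prompt = task
    host_seconds = 0.0
    steps = 0
    for _ in range(max_steps):
        action = fake_model_call(prompt)  # accelerator-side in a real system
        steps += 1
        if action == "done":
            break
        start = time.perf_counter()
        observation = run_tool(action)    # host-CPU work between model calls
        host_seconds += time.perf_counter() - start
        prompt = f"{task}\n{observation}"
    return {"steps": steps, "host_seconds": host_seconds}


if __name__ == "__main__":
    print(agent_loop("summarize the Vera CPU"))
```

In a real deployment the tool step might be a database query, a retrieval call, or sandboxed code execution, which is where single‑thread speed and memory access on the host would matter.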

Core Technical Architecture

Public specifications describe Vera as a custom Arm‑based data‑center CPU built with Nvidia‑designed cores and optimized memory bandwidth.

Key architectural features include:

88 custom Armv9.2 “Olympus” cores
176 threads via spatial multithreading
High‑bandwidth memory system (~1.2 TB/s)
Coherent NVLink‑C2C connection to GPUs at up to ~1.8 TB/s

These features allow the CPU to act as the memory and orchestration engine for AI systems while Rubin GPUs perform the intensive matrix and tensor operations used for training and inference.
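As a rough sanity check on the figures listed above, the short Python sketch below derives a few ratios from them. The inputs are the approximate values quoted in this article. The even per‑core split of memory bandwidth and the decimal conversion (1 TB = 1000 GB) are simplifying assumptions, not published specifications.

```python
# Approximate figures quoted in the feature list above.
CORES = 88
THREADS = 176
MEM_BW_TBPS = 1.2      # memory bandwidth, TB/s (approximate)
NVLINK_C2C_TBPS = 1.8  # NVLink-C2C bandwidth, TB/s (quoted as "up to")

threads_per_core = THREADS // CORES

# Assumption: bandwidth divided evenly across cores, 1 TB = 1000 GB.
mem_bw_per_core_gbps = MEM_BW_TBPS * 1000 / CORES

# How the CPU-to-GPU link compares with the CPU's own memory bandwidth.
link_to_mem_ratio = NVLINK_C2C_TBPS / MEM_BW_TBPS

print(f"threads per core: {threads_per_core}")
print(f"memory bandwidth per core: {mem_bw_per_core_gbps:.1f} GB/s")
print(f"NVLink-C2C / memory bandwidth: {link_to_mem_ratio:.2f}")
```

On these numbers each core runs two threads, and the quoted peak of the coherent link to the GPUs is about 1.5 times the CPU's quoted memory bandwidth.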

The design continues Nvidia’s strategy of using custom Arm‑based silicon for AI infrastructure, building on the earlier Grace CPU used in the Grace Hopper platform.

Performance Claims for Agentic AI Workloads

Nvidia positions Vera as optimized for workloads that involve coordination between many AI components rather than pure GPU math.

Company benchmarks claim that Vera can deliver:

Up to 2× efficiency compared with traditional rack‑scale CPUs
Around 50% faster performance in some workloads
Up to 50% faster agent sandbox execution
Up to 3× faster enterprise data queries

These figures come from Nvidia’s own benchmarks and have limited independent validation so far, so they should be interpreted as vendor‑supplied claims rather than definitive comparisons.
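One way to read such claims is to convert them into run‑time savings. The Python sketch below assumes that "N% faster" means a throughput multiplier of 1 + N/100 and that "3× faster" means a multiplier of 3. The claims as quoted here do not define the metric, so this is an interpretation rather than Nvidia's methodology, and `time_fraction` is a helper named for illustration.

```python
def time_fraction(speedup: float) -> float:
    """Fraction of the baseline run time that remains after a speedup."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return 1.0 / speedup


# Assumed throughput multipliers for two of the claims listed above.
claims = {
    "agent sandbox execution (up to 50% faster)": 1.5,
    "enterprise data queries (up to 3x faster)": 3.0,
}

for name, speedup in claims.items():
    saved = 1.0 - time_fraction(speedup)
    print(f"{name}: at best {saved:.0%} less run time")
```

Under that reading, "50% faster" cuts run time by about a third rather than by half, and the "up to" wording means these are best cases.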

How Vera Fits into the Rubin NVL72 Platform

The Vera CPU is designed primarily to work alongside Rubin GPUs: it forms the CPU half of Nvidia’s Vera Rubin AI platform.

The flagship system is the Vera Rubin NVL72 rack, which integrates:

72 Rubin GPUs
36 Vera CPUs
NVLink 6 switching fabric
ConnectX‑9 SuperNIC networking
BlueField‑4 data processing units (DPUs)
Quantum‑X800 InfiniBand or Spectrum‑X Ethernet networking

Together, these components form a rack‑scale AI supercomputer designed to function as a single system optimized for training, inference, and agentic workloads.

Nvidia also described Vera CPU‑only racks containing up to 256 liquid‑cooled CPUs designed for orchestration and reinforcement‑learning workloads within large AI clusters.
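The rack configurations above imply some simple totals. The Python sketch below multiplies the per‑CPU figures from the architecture section (88 cores, 176 threads, roughly 1.2 TB/s of memory bandwidth) by the CPU counts quoted for each rack. The aggregate bandwidth is a naive sum of approximate peaks and is not a published rack‑level specification. `rack_totals` is a helper named for illustration.

```python
# Per-CPU figures quoted earlier in this article (approximate).
CORES_PER_CPU = 88
THREADS_PER_CPU = 176
MEM_BW_PER_CPU_TBPS = 1.2


def rack_totals(cpus: int, gpus: int = 0) -> dict:
    """Aggregate per-CPU figures across a rack with the given CPU count."""
    if cpus <= 0:
        raise ValueError("cpus must be positive")
    return {
        "cpus": cpus,
        "gpus": gpus,
        "cores": cpus * CORES_PER_CPU,
        "threads": cpus * THREADS_PER_CPU,
        "gpus_per_cpu": gpus / cpus,
        # Naive sum of per-CPU peaks; not a published rack-level figure.
        "sum_mem_bw_tbps": round(cpus * MEM_BW_PER_CPU_TBPS, 1),
    }


nvl72 = rack_totals(cpus=36, gpus=72)  # Vera Rubin NVL72
cpu_rack = rack_totals(cpus=256)       # CPU-only rack, upper bound quoted

print(nvl72)
print(cpu_rack)
```

On these figures an NVL72 rack pairs each Vera CPU with two Rubin GPUs and holds 3,168 CPU cores, while a fully populated CPU‑only rack would hold 22,528.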

Why the Vera CPU Matters for Nvidia

For years Nvidia dominated AI infrastructure through GPUs alone. But large‑scale AI systems increasingly depend on a full stack of compute, networking, storage, and orchestration hardware.

With Vera, Nvidia is expanding its role from GPU supplier to complete AI infrastructure provider.

Instead of relying on host CPUs such as Intel Xeon or AMD EPYC, Nvidia can now provide the CPU that coordinates its GPU clusters. This lets the company tightly integrate:

CPU scheduling
GPU memory access
networking fabric
software frameworks like CUDA

The strategy reflects a broader industry shift toward rack‑scale AI systems designed as unified platforms, rather than clusters assembled from separate components.

The Competitive Context: AMD and Intel

The launch also positions Nvidia more directly in the server CPU market.

That market is currently dominated by x86 processors from Intel and AMD, though competition has intensified. For example, AMD’s EPYC chips captured more than 33% of server CPU unit share and about 46% of server x86 CPU revenue share in early 2026, according to Mercury Research data reported by Tom’s Hardware.

By introducing its own CPU architecture for AI systems, Nvidia aims to capture more of the data‑center bill of materials and reduce dependence on external CPU vendors.

The Bigger Picture: AI Infrastructure Becoming Vertical

The Vera CPU launch highlights a larger trend in AI computing: infrastructure is becoming vertically integrated.

Rather than selling standalone chips, Nvidia now packages CPUs, GPUs, networking, DPUs, and software as a complete AI factory platform. The Rubin generation is built around this philosophy, combining multiple custom chips into unified rack‑scale systems designed to power the next wave of AI models and autonomous agents.

If the architecture delivers the performance gains Nvidia claims, Vera could play a key role in shaping how future AI data centers are built—where CPUs, GPUs, networking, and software are designed together rather than assembled from independent components.
