NVIDIA positions Vera as the “CPU for the age of agents.” Unlike general-purpose server processors, Vera is optimized for autonomous AI systems that must make rapid, sequential decisions, in workloads such as reinforcement learning, database transactions, and real-time data processing.
The chip is built around NVIDIA’s first fully custom data center CPU core, known as “Olympus.” It uses a 10-wide instruction fetch and decode frontend with a neural branch predictor, effectively using AI to accelerate AI workloads. It implements the Armv9.2 instruction set and exposes 176 threads (two per core) through physical resource partitioning rather than conventional simultaneous multithreading.
| Specification | Detail |
|---|---|
| Cores | 88 custom Olympus cores (Armv9.2 compatible) |
| Memory bandwidth | Up to 1.2 TB/s via LPDDR5X |
| NVLink-C2C interconnect | 1.8 TB/s bandwidth to Rubin GPUs |
| Production status | Full production as of May 2026 |
NVIDIA’s own claims are bold. The company says Vera delivers 1.8x faster task completion than x86 CPUs, 50% faster single-threaded performance, and twice the efficiency of traditional rack-scale processors.
Independent testing paints a more nuanced but still impressive picture. On May 26, 2026, the benchmarking site Phoronix published the first third-party numbers. In a geometric mean across diverse workloads, including code compilation, Python, Java, and database processing, the 88-core Vera was 1.55x faster than Intel’s flagship Xeon 6980P and roughly 10% ahead of AMD’s EPYC 9575F. It also beat NVIDIA’s own prior Grace CPU by 1.6x and sustained 90% of its peak memory bandwidth in STREAM TRIAD benchmarks.
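Benchmark suites typically report a geometric mean rather than an arithmetic one so that a single outlier workload cannot dominate the headline figure. The sketch below shows the calculation; the per-workload ratios are hypothetical placeholders chosen for easy arithmetic, not Phoronix’s measurements, and the function and variable names are illustrative.

```python
import math


def geometric_mean(ratios):
    """Return the geometric mean of a collection of positive speedup ratios."""
    ratios = list(ratios)
    if not ratios:
        raise ValueError("need at least one ratio")
    if any(r <= 0 for r in ratios):
        raise ValueError("ratios must be positive")
    # Averaging the logs and exponentiating avoids overflow on long lists.
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# Hypothetical per-workload speedups (illustrative only, NOT published data).
example_speedups = {
    "compile": 2.0,
    "python": 1.25,
    "java": 1.6,
    "database": 1.0,
}

overall = geometric_mean(example_speedups.values())
print(f"overall speedup: {overall:.2f}x")
```

With these placeholder values the product of the ratios is 4.0, so the geometric mean is the fourth root of 4, about 1.41x. A 2x win and a 0.5x loss cancel to exactly 1.0x under this measure, which is why it is preferred for summarizing mixed results.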
Phoronix recorded a 20-second Linux kernel compile on Vera, roughly twice as fast per core as a 128-core x86 chip.
This is the first server purpose-built for the Vera CPU. HPE unveiled it at COMPUTEX 2026, positioning it for agentic AI, reinforcement learning, and data processing at AI-factory scale. It will be available in fall 2026 as part of the NVIDIA AI Computing portfolio.
For the highest-density deployments, HPE also offers a liquid-cooled Cray Supercomputing GX240 compute blade that packs up to 16 Vera CPUs per blade and scales to 640 CPUs and 56,320 cores per rack.
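The rack figures follow directly from the per-CPU numbers quoted earlier. As a quick check, the sketch below recomputes them; the blade count and thread total are derived values that assume fully populated blades and are not stated by HPE, and the function name is illustrative.

```python
CORES_PER_CPU = 88      # from the specification table
THREADS_PER_CPU = 176   # thread count quoted for Vera
CPUS_PER_BLADE = 16     # "up to 16 Vera CPUs per blade"
CPUS_PER_RACK = 640     # quoted rack maximum


def rack_totals(cpus_per_rack=CPUS_PER_RACK, cpus_per_blade=CPUS_PER_BLADE):
    """Derive blade, core, and thread counts for one fully populated rack."""
    blades, leftover = divmod(cpus_per_rack, cpus_per_blade)
    if leftover:
        raise ValueError("rack CPU count is not a whole number of blades")
    return {
        "blades": blades,  # derived, assumes every blade is full
        "cores": cpus_per_rack * CORES_PER_CPU,
        "threads": cpus_per_rack * THREADS_PER_CPU,  # derived
    }


totals = rack_totals()
print(totals)
```

This gives 40 blades and 56,320 cores per rack, matching the quoted core count, and implies 112,640 hardware threads.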
Redpanda is the streaming data layer in the NYSE collaboration. The platform is compatible with Apache Kafka workloads, and Redpanda’s founder and CEO Alex Gallego says the company’s own testing shows Vera delivers “up to 5.5x lower latency” compared to other systems they’ve benchmarked. For a stock exchange handling north of a trillion messages a day, that kind of latency reduction is not academic: it directly affects trade execution quality and system resilience.
The NYSE is the highest-profile financial customer exploring Vera, but the early adopter list reads like a who’s-who of AI and cloud computing.
Oracle is the first cloud provider expected to deploy Vera at hyperscale, with plans to roll out hundreds of thousands of CPUs beginning in 2026.
Vera is not a standalone story. It’s the CPU half of NVIDIA’s larger Vera Rubin platform, paired with the next-generation Rubin GPU and designed to power AI factories and supercomputers. The rack-scale Vera Rubin NVL144 system is rated at 3.6 exaflops of FP4 inference and 1.2 exaflops of FP8 training, roughly 3.3x the performance of the current GB300 NVL72.
For financial markets, the implication is straightforward: exchanges and trading firms have been locked into x86 architecture for decades. Vera represents a credible path to Arm-based, AI-optimized infrastructure that combines extreme memory bandwidth, massive core density, and native integration with real-time streaming platforms. The NYSE exploration—while still early—signals that capital markets infrastructure is converging with high-performance computing and AI, not just in software but at the silicon level.