This modular approach allows each rack to be optimized for a specific function—training, inference, networking, or storage—while operating as a single logical supercomputer.
The production ramp is global in scale. Nvidia confirmed that hundreds of supply chain ecosystem partners are manufacturing Vera Rubin systems, with over 150 of those partners located in Taiwan alone. Production spans more than 350 factories across 30 countries, a clear signal that Nvidia is preparing for massive volume to meet demand from AI labs, cloud providers, and hyperscalers. Top system builders in full-scale production include Dell Technologies, HPE, Lenovo, and Supermicro.
Within a day of the GTC Taipei keynote, CoreWeave announced it had completed the industry-first bring-up and validation of a Vera Rubin NVL72 system on CoreWeave Cloud. The announcement confirmed the rack delivered up to 10× better inference per watt compared to previous generations, along with a reduction in the number of GPUs required for large-scale workloads. CoreWeave’s speed in standing up a fully operational system underscores its deep engineering partnership with Nvidia and positions it as the leading early access provider for the Rubin generation.
The Vera CPU is a key differentiator for the platform. Described as Nvidia’s first standalone data center CPU, it has entered mass production, with shipments expected to begin in the second half of 2026. Nvidia has designed the chip specifically for the demands of autonomous AI agents, which require high-throughput, low-latency processing across massive memory pools. Early customers confirmed for the Vera CPU include OpenAI, Anthropic, and SpaceX.
The shift to full production for Vera Rubin signals a broader industry transition toward purpose-built infrastructure for agentic AI: systems that not only generate responses but can reason, plan, and execute multi-step actions. By integrating Groq’s low-latency inference technology directly into the POD architecture, Nvidia is targeting a new class of workloads where inference speed and efficiency are paramount.
The platform is expected to be generally available to cloud providers and enterprises in the second half of 2026, with AWS, Google Cloud, Microsoft Azure, and Oracle Cloud Infrastructure all named as expected early deployers. Because Vera Rubin was already announced as entering production at CES in January and again at GTC in March, the GTC Taipei update serves to confirm that the ramp has sustained momentum and is now backed by a fully scaled global supply chain.