On this rigorous new test, the GB300 NVL72 system took a decisive lead. The raw throughput figures from Artificial Analysis's social media post paint a stark picture at the least demanding SLO tier (20 tokens per second, 10 seconds time to first token, or TTFT).
The 20x efficiency gain over the prior-generation Hopper platform held at multiple SLO tiers, including the more stringent 60 tokens/s requirement, demonstrating that the performance advantage is not limited to a single test point.
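To make the SLO tiers concrete, here is a minimal sketch of how a request might be checked against a tier. It is illustrative only and is not AgentPerf's actual methodology: the names `SloTier`, `RequestSample`, `meets_slo`, and `slo_attainment` are hypothetical, the sample measurements are made up for the example, and the 10-second TTFT ceiling on the 60 tokens/s tier is an assumption, since the figures quoted here give a TTFT only for the 20 tokens/s tier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SloTier:
    """A service-level objective: a per-user speed floor and a TTFT ceiling."""
    min_tokens_per_s: float
    max_ttft_s: float


@dataclass(frozen=True)
class RequestSample:
    """Hypothetical per-request measurements."""
    ttft_s: float          # seconds until the first output token
    output_tokens: int     # tokens generated after the first one arrives
    decode_time_s: float   # seconds spent generating those tokens


def meets_slo(sample: RequestSample, tier: SloTier) -> bool:
    """Return True if the request satisfies both halves of the tier."""
    if sample.decode_time_s <= 0:
        return False
    rate = sample.output_tokens / sample.decode_time_s
    return sample.ttft_s <= tier.max_ttft_s and rate >= tier.min_tokens_per_s


def slo_attainment(samples: list[RequestSample], tier: SloTier) -> float:
    """Fraction of requests that meet the tier (0.0 for an empty list)."""
    if not samples:
        return 0.0
    return sum(meets_slo(s, tier) for s in samples) / len(samples)


# Tier from the article: 20 tokens/s with a 10 s TTFT limit.
RELAXED = SloTier(min_tokens_per_s=20.0, max_ttft_s=10.0)
# Assumption: the 60 tokens/s tier reuses the same 10 s TTFT limit.
STRICT = SloTier(min_tokens_per_s=60.0, max_ttft_s=10.0)

# Made-up samples, purely for illustration.
samples = [
    RequestSample(ttft_s=2.0, output_tokens=500, decode_time_s=10.0),   # 50 tok/s
    RequestSample(ttft_s=12.0, output_tokens=800, decode_time_s=10.0),  # TTFT too slow
    RequestSample(ttft_s=1.0, output_tokens=700, decode_time_s=10.0),   # 70 tok/s
    RequestSample(ttft_s=5.0, output_tokens=150, decode_time_s=10.0),   # 15 tok/s
]

relaxed_rate = slo_attainment(samples, RELAXED)
strict_rate = slo_attainment(samples, STRICT)
```

The point of the sketch is that a stricter tier shrinks the set of requests that count, so a throughput advantage that survives at 60 tokens/s is harder to earn than one measured only at 20 tokens/s.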
These numbers do not exist in a vacuum. Nvidia is using the AgentPerf results to cement a narrative around full-stack optimization that goes far beyond raw hardware specs. The company attributes the 20x efficiency gain to a combination of tightly integrated technologies: the NVLink scale-up fabric that links 72 GPUs into a single coherent system, custom CUDA kernels that overlap communication and computation specifically for MoE architectures, and TensorRT LLM optimizations like WideEP, DeepEP, DeepGEMM, and fused MoE kernels that maintain high utilization as the number of concurrent agent sessions scales.
The AgentPerf win also completes a clean sweep of the major AI infrastructure benchmarks for Blackwell Ultra. In MLPerf Inference v5.1, the same GB300 NVL72 system set records on DeepSeek-R1, delivering 1.4x the throughput of the previous Blackwell-based GB200. In MLPerf Training v5.1, Blackwell Ultra achieved the fastest time-to-train on all seven benchmarks, including pretraining Llama 3.1 405B in just 10 minutes using 5,120 GPUs.
Crucially, Nvidia is not just publishing records. The company is already pointing to production deployments as evidence that Blackwell Ultra is ready for real agentic workloads. Together AI is using the platform to power agentic coding for Cursor, and DeepInfra is running the AI workforce for Pam.ai on Blackwell. The blog post promoting the results also explicitly fast-forwards to the next architecture, noting that the Vera Rubin platform is now in production, aiming to provide even more capacity for the coming wave of agentic AI.
For an industry that is increasingly betting its future on autonomous AI agents that can reason, code, and act, Nvidia is making a clear statement: the infrastructure is ready, and it is starting from a position of overwhelming performance leadership.