Nvidia's Blackwell Ultra Dominates the First AgentPerf Benchmark for Agentic AI

Nvidia's Blackwell Ultra GB300 NVL72 system topped the first AgentPerf benchmark, running up to 61,340 concurrent AI coding agents and achieving 20x more agents per megawatt than the previous-generation H200 Hopper pl... AgentPerf is the industry's first open benchmark for agentic AI, measuring how many multi-step c...

Nvidia's Blackwell Ultra architecture is purpose-built for the demanding multi-step reasoning of agentic AI workloads. Image: AI-generated.

AI benchmarks are rapidly evolving beyond measuring simple question-and-answer speed. The new frontier is agentic AI—autonomous systems that use tools, write code, and chain together multiple reasoning steps to complete complex tasks. The industry now has its first dedicated benchmark for this demanding workload, and Nvidia's newest hardware has posted dominant initial scores.

On June 12, 2026, Artificial Analysis published the first round of results for its AA-AgentPerf benchmark, designed specifically for agentic AI inference. Nvidia's Blackwell Ultra-based GB300 NVL72 rack-scale system not only achieved the highest overall performance but did so with a dramatic leap in efficiency over the prior generation, running up to 20 times more agents per megawatt than an Nvidia HGX H200 system.
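The agents-per-megawatt comparison is a simple normalization, and a minimal sketch may make the arithmetic concrete. The function name, the kilowatt input, and every number below are illustrative assumptions, not figures from the benchmark; the source does not state the power draw of either system.

```python
def agents_per_megawatt(concurrent_agents: int, system_power_kw: float) -> float:
    """Normalize a concurrent-agent count by system power, expressed per megawatt."""
    if system_power_kw <= 0:
        raise ValueError("system_power_kw must be positive")
    # 1 MW = 1000 kW, so scale the count up before dividing by kilowatts.
    return concurrent_agents * 1000.0 / system_power_kw


# Placeholder inputs chosen only to show the calculation (NOT measured results).
newer_system = agents_per_megawatt(concurrent_agents=2_000, system_power_kw=100.0)
older_system = agents_per_megawatt(concurrent_agents=100, system_power_kw=100.0)

# Ratio of the two normalized scores; with these placeholders it is 20.
efficiency_ratio = newer_system / older_system
print(f"{efficiency_ratio:.0f}x more agents per megawatt")
```

Normalizing by power rather than by GPU count lets a rack-scale system and a single server be compared on the same footing.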

What AgentPerf actually measures

Traditional LLM benchmarks often focus on synthetic queries and single-turn completions. AgentPerf is fundamentally different. It is the industry's first open, multi-vendor hardware benchmark built to stress-test real-world agentic AI workloads.

Instead of generating a single response, AgentPerf replays authentic coding agent trajectories sourced from public repositories across more than 12 programming languages. These trajectories chain together up to 20 sequential LLM calls, interspersed with tool-use simulations that incorporate realistic CPU delays, all while managing growing context windows. The result is a far more demanding test that mimics how a modern AI coding assistant actually behaves when fixing a bug or building a feature.
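The replay loop described above can be sketched roughly as follows. This is not AgentPerf's actual harness: the `Step` fields, the `call_llm` callback, and the token counts are hypothetical, and only the 20-call cap, the simulated tool delay, and the growing context come from the description.

```python
import time
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    prompt_tokens_added: int  # new tokens (e.g. tool output) appended before this call
    output_tokens: int        # tokens the model generates in this call
    tool_delay_s: float       # simulated CPU time for the tool call that follows


def replay_trajectory(
    steps: List[Step],
    call_llm: Callable[[int, int], None],
    sleep: Callable[[float], None] = time.sleep,
    max_calls: int = 20,
) -> List[int]:
    """Replay one agent trajectory and return the context size after each call."""
    context_tokens = 0
    context_sizes = []
    for step in steps[:max_calls]:
        context_tokens += step.prompt_tokens_added
        call_llm(context_tokens, step.output_tokens)  # one sequential LLM call
        context_tokens += step.output_tokens          # the reply joins the context
        context_sizes.append(context_tokens)
        sleep(step.tool_delay_s)                      # stand-in for tool execution
    return context_sizes


# Usage with a fake model and a no-op sleep, so the sketch runs instantly.
calls = []
sizes = replay_trajectory(
    [Step(100, 20, 0.0), Step(50, 10, 0.0)],
    call_llm=lambda context, out: calls.append((context, out)),
    sleep=lambda seconds: None,
)
print(sizes)
```

Because each call carries the full accumulated context, later calls in a trajectory are more expensive than earlier ones, which is part of what separates this workload from single-turn tests.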

The core metric is the number of concurrent agents a system can support while meeting a strict service-level objective (SLO) for output token speed and time-to-first-token (TTFT). The initial results were run on DeepSeek V4 Pro, a large mixture-of-experts (MoE) model chosen as a representative example of the frontier models powering advanced agents.
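As a rough illustration of how such a metric could be derived, the sketch below picks the highest tested concurrency level that still satisfies both SLO conditions. The `Measurement` fields, the thresholds, and the sample values are assumptions made for illustration; the source does not give the actual SLO limits or the selection procedure.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Measurement:
    concurrent_agents: int      # concurrency level that was tested
    ttft_s: float               # observed time-to-first-token, in seconds
    output_tokens_per_s: float  # observed per-agent output token speed


def max_agents_within_slo(
    measurements: Iterable[Measurement],
    max_ttft_s: float,
    min_output_tokens_per_s: float,
) -> int:
    """Return the largest tested concurrency that meets both SLO conditions (0 if none)."""
    passing = [
        m.concurrent_agents
        for m in measurements
        if m.ttft_s <= max_ttft_s and m.output_tokens_per_s >= min_output_tokens_per_s
    ]
    return max(passing, default=0)


# Placeholder sweep (NOT benchmark data): latency rises and speed falls with load.
sweep = [
    Measurement(concurrent_agents=100, ttft_s=0.5, output_tokens_per_s=60.0),
    Measurement(concurrent_agents=200, ttft_s=1.0, output_tokens_per_s=40.0),
    Measurement(concurrent_agents=400, ttft_s=3.0, output_tokens_per_s=15.0),
]
supported = max_agents_within_slo(sweep, max_ttft_s=2.0, min_output_tokens_per_s=30.0)
print(supported)
```

Tying the score to an SLO means a system cannot inflate its agent count by letting per-agent responsiveness degrade.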

Nvidia's Blackwell Ultra results by the numbers

How Nvidia is engineering the agentic AI era