ReportsPublished2 months agoLast edited last month24 sources

Anthropic Launches Claude Opus 4.8: New Benchmarks, Honesty Improvements, and Agentic Workflow Features

Claude Opus 4.8 scored 69.2% on the SWE Bench Pro agentic coding benchmark, outperforming GPT 5.5 at 58.6% and Gemini 3.1 Pro at 54.2%, but GPT 5.5 still leads on terminal based coding tasks. Standard pricing remains unchanged at $5 per million input tokens and $25 per million output tokens, while a new Fast mode co...

Search & fact-check with Studio Global AI Browse more Trending pages

Conceptual illustration of Claude Opus 4.8's launch, symbolizing AI performance benchmarks and agentic coding capabilities. — What were the key details of Anthropic's Claude Opus 4.8 launch on May 28, 2026, including its benchmark performance against OpenAI's GPT 5Anthropic launched Claude Opus 4.8 with significant improvements in agentic coding and model reliability. Image: AI-generated.
AI Prompt
Create a landscape editorial hero image for this Studio Global article: What were the key details of Anthropic's Claude Opus 4.8 launch on May 28, 2026, including its benchmark performance against OpenAI's GPT 5.. Article summary: **Launch positioning:** Claude Opus 4.8 was described as outperforming Opus 4.7 across most major benchmarks and beating GPT-5.5 and Gemini 3.1 Pro in several categories.. Topic tags: deepresearch, general web, user generated, documentation. Reference image context from search candidates: Reference image 1: visual subject "Anthropic released Claude Opus 4.8 today, and it outperforms its predecessor across most major benchmarks while beating OpenAI’s GPT-5.5 and Google’s Gemini 3.1 Pro in several key" source context "Anthropic Just Dropped Claude Opus 4.8. - The VC Corner" Reference image 2: visual subject "Anthropic released Claude Opus 4.8 today, and it outper
openai.com

Anthropic publicly released Claude Opus 4.8 on May 28, 2026, positioning it as its most capable generally available model . The update directly succeeds Opus 4.7 and targets coding, long-running agentic tasks, and enterprise reliability. Standard pricing carries over unchanged, while a significantly cheaper Fast mode and new workflow tools accompany the release .

Benchmarking the Performance Against Rivals

The most widely cited performance comparison is on the SWE-Bench Pro agentic coding benchmark. According to Anthropic's evaluation data, Opus 4.8 scored a leading 69.2%, compared to 64.3% for its predecessor, 58.6% for OpenAI's GPT-5.5, and 54.2% for Google's Gemini 3.1 Pro .

Studio Global AI

Search, cite, and publish your own answer

Use this topic as a starting point for a fresh source-backed answer, then compare citations before you share it.

Search & fact-check with Studio Global AI

Anthropic Launches Claude Opus 4.8: New Benchmarks, Honesty Improvements, and Agentic Workflow Features

Benchmarking the Performance Against Rivals

Search, cite, and publish your own answer

People also ask

What is the short answer to "Anthropic Launches Claude Opus 4.8: New Benchmarks, Honesty Improvements, and Agentic Workflow Features"?

What are the key points to validate first?

What should I do next in practice?

Sources

Honesty and Reliability as Core Technical Upgrades

Introducing Dynamic Workflows and Effort Control

Breakdown of the API Pricing Structure

What This Launch Signals for the AI Landscape