
An independent evaluation by NIST (CAISI) suggests that DeepSeek's self-reported benchmarks may overstate the model's real-world capability compared with Qwen and Kimi.


Answers · Published 6 days ago · Last edited 2 days ago · 22 sources

Qwen3.7 Max, DeepSeek V4, and Kimi K2.6: The Ultimate Comparison of Benchmarks and Pricing in 2026

The three models are nearly tied on SWE-Bench Verified (80.2–80.6), but each has its own strength: Qwen in terminal tasks, DeepSeek in pure coding, and Kimi in agent-based tasks with tool use. On price, DeepSeek V4 Pro is in a class of its own with a permanent 75% price cut, while Qw...


[Figure: Comparison chart of Qwen3.7-Max, DeepSeek V4, and Kimi K2.6 benchmarks and pricing data. A data-driven comparison of benchmarks and pricing for the three leading Chinese AI models in mid-2026.]

We analyzed the publicly available benchmarks and price lists for Qwen3.7 Max, DeepSeek V4 Pro, and Kimi K2.6 that were published between April and June 2026. Below is a consolidated overview so you can easily see which of the three leading AI models best fits your needs and budget.

Benchmark comparison

The tables below show how the models perform across software development, agent tasks, and reasoning. A hyphen (-) indicates that a score was not available in the reviewed material.

Software development & agent-based coding

Benchmark	Qwen3.7-Max	DeepSeek V4 Pro Max	Kimi K2.6 Thinking
SWE-Bench Verified	80.4	80.6


Reasoning & knowledge

Benchmark	Qwen3.7-Max	DeepSeek V4 Pro Max	Kimi K2.6 Thinking
AA Intelligence Index v4.0	56.6 (#5)	52.0	-
GPQA Diamond	92.4	-	-
HLE (Humanity's Last Exam)	41.4	37.7	54.0 (with tools)
HMMT 2026 (Math)	97.1%	95.2%	92.7%
IMOAnswerBench	90.0	89.8	-
SimpleQA Verified	-	57.9%	-
Chinese SimpleQA	-	84.4	75.9
DeepSearchQA (F1)	-	-	92.5

Pricing comparison (API, per 1M tokens, USD)

Pricing factor	Qwen3.7-Max	DeepSeek V4 Pro (standard price)	Kimi K2.6
Input (cache miss)	$2.50	$1.74	$0.95
Output	$7.50	$3.48	$4.00
Input (cache hit)	$0.25 (90% discount)	$0.0145 (99% discount)	$0.16 (83% discount)
Context window	1M tokens	1M tokens	256K tokens
Max output tokens	65,536	384,000	-
Open weights	No (API only)	Yes (Hugging Face)	Yes
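
To make the list prices easier to compare for a concrete workload, here is a minimal Python sketch that turns the per-1M-token rates from the pricing table into an estimated bill. The prices are copied from the table; the `Pricing` class, the `estimate_cost` and `cache_discount` helpers, and the sample workload (10M input tokens, 2M output tokens, 50% cache hits) are illustrative assumptions, not part of any vendor SDK. The DeepSeek figures are the listed standard prices, so the 75% price cut mentioned earlier is not applied here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """List prices in USD per 1M tokens, copied from the pricing table."""
    input_miss: float  # input tokens not served from cache
    input_hit: float   # input tokens served from cache
    output: float      # generated tokens


PRICES = {
    "Qwen3.7-Max": Pricing(input_miss=2.50, input_hit=0.25, output=7.50),
    "DeepSeek V4 Pro": Pricing(input_miss=1.74, input_hit=0.0145, output=3.48),
    "Kimi K2.6": Pricing(input_miss=0.95, input_hit=0.16, output=4.00),
}


def cache_discount(p: Pricing) -> float:
    """Fraction saved on an input token when it is a cache hit."""
    return 1.0 - p.input_hit / p.input_miss


def estimate_cost(p: Pricing, input_tokens: int, output_tokens: int,
                  cache_hit_ratio: float = 0.0) -> float:
    """Estimated cost in USD for a workload (hypothetical helper).

    cache_hit_ratio is the assumed share of input tokens billed at the
    cache-hit rate; the real ratio depends on your traffic.
    """
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    hit_tokens = input_tokens * cache_hit_ratio
    miss_tokens = input_tokens - hit_tokens
    total = (miss_tokens * p.input_miss
             + hit_tokens * p.input_hit
             + output_tokens * p.output)
    return total / 1_000_000  # prices are quoted per 1M tokens


if __name__ == "__main__":
    # Assumed sample workload: 10M input tokens, 2M output tokens, 50% cache hits.
    for name, pricing in PRICES.items():
        cost = estimate_cost(pricing, 10_000_000, 2_000_000, cache_hit_ratio=0.5)
        print(f"{name}: ${cost:.2f} (cache discount {cache_discount(pricing):.0%})")
```

Because output tokens are priced well above input tokens for all three models, the ranking can shift with the input/output mix and the cache-hit ratio, so it is worth rerunning the estimate with numbers from your own traffic.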



Key takeaways and which model to choose