
Answer published last week · Last edited last week · 39 sources

GLM-5.2: Chinese Open-Source AI Model Beats GPT-5.5 on Coding Tasks at One-Sixth the Cost

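Zhipu AI's GLM-5.2 scored 51 on the Artificial Analysis Intelligence Index v4.1, making it the highest-scoring open-weight model and placing it fourth overall, behind only Claude Fable 5, Claude Opus 4.8, and GPT-5.5. Key benchmark results: FrontierSWE 74.4%, Terminal-Bench 2.1 (81.0), SWE-bench Pro (62.1), GPQA Diamond (80.3%), AIME 2025 (86.67%), and first place on WebDev Arena.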


Figure: Z.AI's GLM-5.2 architecture visualization, a futuristic rendering of the 753-billion-parameter Mixture-of-Experts system as a glowing network of nodes and connections.

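On June 13, 2026, Beijing-based Zhipu AI (Z.AI, formerly Zhipu Huazhang) released GLM-5.2, an open-weight model with 753 billion parameters. It immediately became the highest-scoring open-weight model ever recorded on the Artificial Analysis Intelligence Index. It beat OpenAI's GPT-5.5 on several long-horizon coding benchmarks at roughly one-sixth the cost per token. The release marks a significant milestone for open-source AI and indicates that Chinese AI labs have sharply narrowed the gap with Western proprietary frontier models.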

Benchmark Performance

Artificial Analysis Intelligence Index v4.1

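GLM-5.2 scored 51 on the Artificial Analysis Intelligence Index v4.1, the highest score ever achieved by an open-weight model. That places it fourth on the full leaderboard, behind only Claude Fable 5 (60), Claude Opus 4.8 (56), and GPT-5.5 (estimated at roughly 53-55). It beat the other Chinese open-weight competitors: MiniMax-M3 (44), DeepSeek V4 Pro Max (44), and Kimi K2.6 (43). The jump from GLM-5.1 (40) to GLM-5.2 (51) is +11 points, a larger gain than most minor-version releases deliver.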
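The point gaps quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the scores are copied from the paragraph, the variable and function names are made up for this example, and GPT-5.5 is left out because the text gives only an estimated range for it.

```python
# Scores as quoted in the paragraph above (Artificial Analysis Intelligence Index v4.1).
# GPT-5.5 is omitted because the text gives only an estimated range (about 53-55).
scores = {
    "Claude Fable 5": 60,
    "Claude Opus 4.8": 56,
    "GLM-5.2": 51,
    "MiniMax-M3": 44,
    "DeepSeek V4 Pro Max": 44,
    "Kimi K2.6": 43,
    "GLM-5.1": 40,
}


def gap(model: str, baseline: str) -> int:
    """Point difference between two models (positive means `model` is ahead)."""
    return scores[model] - scores[baseline]


# Gain from the previous GLM release to this one.
generation_gain = gap("GLM-5.2", "GLM-5.1")

# Lead over the other Chinese open-weight models named in the text.
lead_over_open_rivals = {
    name: gap("GLM-5.2", name)
    for name in ("MiniMax-M3", "DeepSeek V4 Pro Max", "Kimi K2.6")
}

# Distance still separating GLM-5.2 from the top of the leaderboard.
deficit_to_leader = gap("Claude Fable 5", "GLM-5.2")

print(generation_gain)        # 11
print(lead_over_open_rivals)  # {'MiniMax-M3': 7, 'DeepSeek V4 Pro Max': 7, 'Kimi K2.6': 8}
print(deficit_to_leader)      # 9
```

On these figures, GLM-5.2 leads its nearest open-weight rivals by 7 points and trails the top-ranked model by 9.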



Tech leaders including Vercel CEO Guillermo Rauch and Box CEO Aaron Levie have publicly praised the model.


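Benchmark	Score	Notes
FrontierSWE (long-horizon coding)	74.4%	Leads GPT-5.5 (72.6%) by 1.8 percentage points; trails Claude Opus 4.8 by about 1 percentage point
Terminal-Bench 2.1	81.0	New open-weight record
SWE-bench Pro	62.1	Highest open-weight score to date
SWE-bench Verified	76.4%	On par with frontier models
GPQA Diamond (graduate-level scientific reasoning)	80.3%	Strong performance on hard-science reasoning
AIME 2025 (mathematical reasoning)	86.67%	Top-tier math performance
MMLU-Pro	80.63%	Broad academic knowledge
MMLU	91.72%	General-knowledge benchmark
Humanity's Last Exam (with tools)	54.7%	Up 12 percentage points over the previous generation
ProofBench	>30%	First open-weight model to exceed 30%, 11 percentage points ahead of the second-place model
WebDev Arena	#1	Ranks above Claude Fable 5 and Opus 4.8 on the human-voted front-end leaderboard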
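Margins between benchmark scores can be stated either in percentage points or as a relative difference, and the two are easy to confuse. The sketch below applies both to the FrontierSWE figures from the table (74.4% and 72.6%); the helper names are hypothetical and exist only for this illustration.

```python
def point_gap(a: float, b: float) -> float:
    """Absolute difference between two percentage scores, in percentage points."""
    return round(a - b, 2)


def relative_gap(a: float, b: float) -> float:
    """Difference expressed as a percentage of the baseline score b."""
    return round((a - b) / b * 100, 2)


# FrontierSWE scores as listed in the table above.
glm_5_2, gpt_5_5 = 74.4, 72.6

print(point_gap(glm_5_2, gpt_5_5))     # 1.8
print(relative_gap(glm_5_2, gpt_5_5))  # 2.48
```

By either measure, the FrontierSWE lead over GPT-5.5 is narrow: 1.8 percentage points, or about 2.5% relative to GPT-5.5's score.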