Answer published 2 weeks ago · Last edited 2 weeks ago · 18 sources

MLPerf Training v6.0 Results Are In: NVIDIA Sweeps Every Benchmark as the Blackwell Platform Makes a Major Performance Leap

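NVIDIA swept MLPerf Training v6.0, winning all seven benchmarks on both fastest time to train and highest per-accelerator performance, and it was the only platform to submit results for every test [3]. This round, MLCommons introduced two new mixture-of-experts (MoE) pretraining benchmarks: DeepSeek V3 (671 billion total parameters, 37 billion activated per token) and the smaller GPT OSS 20B [3][10].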

Image: NVIDIA Blackwell Ultra GPUs powering record-breaking MLPerf Training v6.0 results for massive AI models. NVIDIA's Blackwell platform set new performance records across all MLPerf Training v6.0 benchmarks, driven by the GB300 NVL72 system.

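NVIDIA dominated this round of MLPerf Training v6.0. It not only won every benchmark but was also the only platform to submit results for all seven tests, taking first place in both fastest time to train and per-accelerator performance.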

New Mixture-of-Experts (MoE) Challenges Debut

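A major highlight of this round is that MLCommons added two new mixture-of-experts (MoE) pretraining benchmarks. One is DeepSeek-V3, with 671 billion (671B) total parameters, of which 37 billion (37B) are activated per token. The other is the smaller GPT-OSS-20B.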
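To put the two DeepSeek-V3 parameter counts in proportion, the short calculation below works out what share of the model is active for any single token. It is only arithmetic on the figures quoted above; the constant and function names are illustrative and do not come from MLCommons or NVIDIA.

```python
# Figures quoted in the article for the DeepSeek-V3 benchmark.
TOTAL_PARAMS = 671e9             # total parameters
ACTIVE_PARAMS_PER_TOKEN = 37e9   # parameters activated per token


def active_fraction(total: float, active: float) -> float:
    """Return the share of parameters that are active for one token."""
    return active / total


frac = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS_PER_TOKEN)
print(f"{frac:.1%} of total parameters are active per token")
# prints: 5.5% of total parameters are active per token
```

In other words, each token touches only a small slice of the weights, which is why routing tokens to the right experts efficiently matters so much for MoE training.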

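NVIDIA was the only platform willing (and able) to submit results on both of these new benchmarks. Its secret weapon was the GB300 NVL72 system, paired with a customized NVIDIA software stack, CUDA graphs, and advanced MoE routing techniques, which together made it possible to train these highly demanding MoE architectures so efficiently.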

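Handling a model as complex as DeepSeek-V3 is technically demanding in its own right. The model uses multi-head latent attention (MLA), fine-grained expert segmentation with 256 routed experts per MoE layer, multi-token prediction, and auxiliary-loss-free load balancing, all of which call for very tight hardware and software co-design.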
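The idea behind auxiliary-loss-free load balancing can be illustrated with a toy simulation. The sketch below assumes the commonly described scheme: a per-expert bias is added to the routing scores only when choosing the top-k experts, and after each step the bias is nudged down for overloaded experts and up for underloaded ones. Everything here (function names, the tiny expert count, the random scores, the step size) is a hypothetical stand-in chosen for readability, not the code used by NVIDIA, DeepSeek, or the MLPerf reference implementation.

```python
import random


def route_top_k(scores, bias, k):
    """Select k experts by biased score; gate weights use the unbiased scores."""
    ranked = sorted(range(len(scores)), key=lambda e: scores[e] + bias[e], reverse=True)
    chosen = ranked[:k]
    total = sum(scores[e] for e in chosen)
    return {e: scores[e] / total for e in chosen}


def update_bias(bias, load, step_size):
    """Lower the bias of overloaded experts and raise it for underloaded ones."""
    mean_load = sum(load) / len(load)
    new_bias = []
    for b, n in zip(bias, load):
        if n > mean_load:
            new_bias.append(b - step_size)
        elif n < mean_load:
            new_bias.append(b + step_size)
        else:
            new_bias.append(b)
    return new_bias


def simulate(num_experts=8, k=2, tokens=512, steps=200, step_size=0.01, seed=0):
    """Run a toy router whose raw scores deliberately favor low-index experts."""
    rng = random.Random(seed)
    skew = [0.5 * (num_experts - e) / num_experts for e in range(num_experts)]
    bias = [0.0] * num_experts
    history = []
    for _ in range(steps):
        load = [0] * num_experts
        for _ in range(tokens):
            scores = [rng.random() + skew[e] for e in range(num_experts)]
            for e in route_top_k(scores, bias, k):
                load[e] += 1
        history.append(load)
        bias = update_bias(bias, load, step_size)
    return history, bias


if __name__ == "__main__":
    history, bias = simulate()
    print("tokens per expert, first step:", history[0])
    print("tokens per expert, last step: ", history[-1])
```

Because the bias affects only which experts are selected and not the gate weights, the balancing pressure does not add a term to the training loss, which is the property the technique's name refers to.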

CoreWeave Sets a Record: DeepSeek-V3 Trained in Two Minutes

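The most eye-catching result of the round came from cloud provider CoreWeave. It used 8,192 NVIDIA GB300 NVL72 GPUs, the largest GB300 cluster in this round of testing, running on CoreWeave Cloud production infrastructure that customers can already use today.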
