
Xiaomi MiMo Reaches 1,000 Tokens per Second: A Trillion-Parameter Model Breaks the Speed Barrier on Standard GPUs

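In June 2026, Xiaomi and TileRT released MiMo-V2.5-Pro-UltraSpeed, described as the world's first result to push inference on a trillion-parameter model past 1,000 tokens per second on general-purpose GPUs [9][12]. The speedup relies on three techniques working together: FP4 mixed-precision quantization (applied only to the MoE expert layers), DFlash block-level masked parallel speculative decoding, and TileRT's persistent-kernel engine with warp-level heterogeneous pipelining [2][37].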
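To make the speculative decoding idea concrete, here is a minimal sketch of the generic draft-and-verify scheme in its greedy form. It is not DFlash and not TileRT code: the two toy models `target_next` and `draft_next`, the window size `k`, and every function name are hypothetical stand-ins chosen for illustration. The point it shows is that the large model checks several proposed tokens per pass while the output stays identical to ordinary greedy decoding.

```python
from typing import Callable, List, Tuple

NextToken = Callable[[List[int]], int]


def target_next(ctx: List[int]) -> int:
    """Toy stand-in for the large model (hypothetical): deterministic next token."""
    return (sum(ctx) * 31 + len(ctx)) % 50


def draft_next(ctx: List[int]) -> int:
    """Toy stand-in for a cheap draft model (hypothetical).

    It agrees with the target except when the context length is a multiple of 5.
    """
    tok = target_next(ctx)
    return (tok + 1) % 50 if len(ctx) % 5 == 0 else tok


def greedy_decode(model: NextToken, prompt: List[int], n_new: int) -> List[int]:
    """Baseline: one model call per generated token."""
    out = list(prompt)
    for _ in range(n_new):
        out.append(model(out))
    return out


def speculative_decode(
    target: NextToken, draft: NextToken, prompt: List[int], n_new: int, k: int = 4
) -> Tuple[List[int], int]:
    """Greedy draft-and-verify decoding; returns (tokens, number of verify passes)."""
    out = list(prompt)
    goal = len(prompt) + n_new
    verify_passes = 0
    while len(out) < goal:
        # 1) The draft model proposes k tokens one after another.
        proposal: List[int] = []
        for _ in range(k):
            proposal.append(draft(out + proposal))
        # 2) The target scores all k + 1 positions. A real engine would do this
        #    in a single batched forward pass; the loop here only simulates it.
        verified = [target(out + proposal[:i]) for i in range(k + 1)]
        verify_passes += 1
        # 3) Keep the longest prefix on which draft and target agree, then add
        #    the target's own token at the first position after that prefix.
        n_ok = 0
        while n_ok < k and proposal[n_ok] == verified[n_ok]:
            n_ok += 1
        out.extend(proposal[:n_ok])
        out.append(verified[n_ok])
    return out[:goal], verify_passes


if __name__ == "__main__":
    tokens, passes = speculative_decode(target_next, draft_next, [1, 2, 3], 20)
    print(tokens == greedy_decode(target_next, [1, 2, 3], 20), passes)
```

Because every accepted token is one the target model would have produced itself, the scheme trades extra draft work for fewer sequential passes through the large model. How much that helps in practice depends on how often the draft agrees with the target, which the source does not quantify.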


Figure: Conceptual visualization of Xiaomi MiMo-V2.5-Pro-UltraSpeed achieving over 1,000 tokens per second on a trillion-parameter model using standard GPUs.

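On June 8, 2026, Xiaomi's MiMo team, together with its inference partner TileRT, officially launched MiMo-V2.5-Pro-UltraSpeed, a high-speed inference mode. According to the official announcement, a trillion-parameter model achieved a decoding speed above 1,000 tokens per second on a single standard 8-GPU server for the first time, which Xiaomi describes as an industry first.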

Speed milestone: 1,200 tokens per second in practice

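In the official demo, the peak throughput of MiMo-V2.5-Pro-UltraSpeed even approached 1,200 tokens per second, which means the model needs less than two seconds to generate a 1,000-character article.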
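The "under two seconds" figure can be checked with simple arithmetic. The sketch below assumes a tokens-per-character ratio, which the source does not state; the values 1.0 and 1.5 are illustrative guesses, and the function name `generation_seconds` is hypothetical.

```python
def generation_seconds(n_chars: int, tokens_per_char: float, tokens_per_second: float) -> float:
    """Estimated decode time for a text of n_chars characters.

    tokens_per_char is an assumed tokenizer ratio, not a figure from the source.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return n_chars * tokens_per_char / tokens_per_second


if __name__ == "__main__":
    for rate in (1000.0, 1200.0):      # throughput figures quoted in the article
        for ratio in (1.0, 1.5):       # assumed tokens per character
            secs = generation_seconds(1000, ratio, rate)
            print(f"{rate:.0f} tok/s, {ratio} tok/char: {secs:.2f} s")
```

Under these assumptions a 1,000-character article takes between roughly 0.8 and 1.5 seconds of pure decoding, so the claim holds at 1,200 tokens per second as long as the text needs fewer than 2.4 tokens per character. The estimate ignores prompt processing and network latency.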

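Notably, this result does not depend on expensive custom-built accelerators; it runs on commercially available GPUs that are easy to procure. Lei Jun, founder of Xiaomi Group, also posted on Weibo, stressing that this is the first time the industry has crossed the "1000 tokens/s" threshold on a trillion-parameter model.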
