Answer published last week. Last edited 5 days ago. 21 sources.

XPeng's $500 Million-a-Year Bet: Why It Calls Language "Poison" and Wants to Upend Autonomous Driving

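XPeng has confirmed that it spends about $500 million a year on AI model training alone. That figure covers only compute costs; its overall AI R&D budget for 2026 is expected to approach $1 billion. At CVPR 2026, XPeng unveiled its new VLA 2.0 architecture, which removes the intermediate language step from the driving pipeline entirely. Its head of AI put it bluntly: "Language is poison."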


Figure: A conceptual visualization of XPeng's VLA 2.0 AI architecture transforming raw visual data directly into driving actions, bypassing language tokens. XPeng's VLA 2.0 model adopts a "Vision-Implicit Token-Action" path, eliminating language as an intermediate step to improve real-time driving performance.

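Amid an escalating battle over both technology and public opinion, Chinese automaker XPeng has laid out its blueprint for challenging Tesla's dominance in autonomous driving. At the Conference on Computer Vision and Pattern Recognition (CVPR) in June 2026, XPeng detailed a radical AI architecture that discards a core component of most robotics systems: language. Combined with the striking annual training spend it has disclosed, XPeng is making the case that its next-generation approach is not merely different but fundamentally faster and more reliable.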

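Dr. Xianming Liu, XPeng's head of AI and director of its General Intelligence Center, told the US electric-vehicle outlet Electrek that the company spends roughly 300 million RMB a month on AI model training, or about $500 million on an annualized basis. This figure covers only the raw compute cost of training the autonomous driving models and does not include the company's overall AI R&D budget. XPeng's total AI-related R&D spending was about $652 million in 2025 and is expected to climb to about $970 million in 2026 as its ambitions grow.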
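The figures above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch in Python using only the numbers quoted in the article; the exchange rate it prints is merely what those two reported figures imply, not a rate stated by XPeng or Electrek, and the variable names are illustrative.

```python
# Figures as quoted in the article (all approximate).
MONTHLY_TRAINING_RMB = 300e6   # about 300 million RMB per month on training compute
ANNUAL_TRAINING_USD = 500e6    # about $500 million per year, as reported
AI_RND_2025_USD = 652e6        # total AI-related R&D, 2025
AI_RND_2026_USD = 970e6        # total AI-related R&D, 2026 (projected)

# Annualize the monthly training spend.
annual_training_rmb = MONTHLY_TRAINING_RMB * 12

# Exchange rate implied by the two reported figures.
# This is an inference from the article's rounded numbers, not a quoted rate.
implied_rmb_per_usd = annual_training_rmb / ANNUAL_TRAINING_USD

# Year-over-year growth in the overall AI R&D budget.
rnd_growth = AI_RND_2026_USD / AI_RND_2025_USD - 1

print(f"Annualized training spend: {annual_training_rmb / 1e9:.1f} billion RMB")
print(f"Implied exchange rate: {implied_rmb_per_usd:.2f} RMB per USD")
print(f"AI R&D growth, 2025 to 2026: {rnd_growth:.1%}")
```

In other words, 300 million RMB a month comes to 3.6 billion RMB a year, which matches the $500 million figure at roughly 7.2 RMB to the dollar, and the projected AI R&D budget grows by a little under 50 percent.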

VLA 2.0 at CVPR 2026: Removing the "Language" Bottleneck

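The centerpiece of XPeng's CVPR 2026 presentation was the formal introduction of VLA 2.0, its second-generation Vision-Language-Action model. The architecture departs fundamentally from the way many AI systems, including XPeng's own first-generation model, handle the driving task.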

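In a traditional VLA pipeline, the system follows a sequential process: the car sees the road, converts its visual perception into language-like chunks of data (tokens), reasons over those language tokens, and finally generates a driving action. Dr. Liu described this intermediate step as the critical weakness, stating bluntly that "language is poison." His argument is that in driving, where reactions must happen within milliseconds, language tokens introduce inherent latency and inject irrelevant semantic noise.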
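The difference between the two data paths can be illustrated with a toy sketch. The Python below is not XPeng's implementation: every function name is hypothetical, and each stage is a placeholder that only records what it consumed. It simply contrasts the sequential vision, language tokens, reasoning, action path described above with a "Vision-Implicit Token-Action" path that has no language stage.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, str]
Stage = Tuple[str, Callable[[State], State]]

# Placeholder stages (hypothetical names); each one just wraps its input in a label.
def encode_vision(s: State) -> State:
    return {**s, "features": f"features({s['frame']})"}

def to_language_tokens(s: State) -> State:
    return {**s, "tokens": f"text_tokens({s['features']})"}

def reason_over_tokens(s: State) -> State:
    return {**s, "plan": f"plan({s['tokens']})"}

def action_from_plan(s: State) -> State:
    return {**s, "action": f"action({s['plan']})"}

def to_implicit_tokens(s: State) -> State:
    return {**s, "latent": f"latent({s['features']})"}

def action_from_latent(s: State) -> State:
    return {**s, "action": f"action({s['latent']})"}

# Traditional VLA path: language tokens sit between perception and action.
TRADITIONAL_VLA: List[Stage] = [
    ("vision", encode_vision),
    ("language_tokens", to_language_tokens),
    ("reasoning", reason_over_tokens),
    ("action", action_from_plan),
]

# Language-free path: vision -> implicit tokens -> action.
LANGUAGE_FREE: List[Stage] = [
    ("vision", encode_vision),
    ("implicit_tokens", to_implicit_tokens),
    ("action", action_from_latent),
]

def run(pipeline: List[Stage], frame: str) -> Tuple[str, List[str]]:
    """Run the stages in order; return the final action and the stages visited."""
    state: State = {"frame": frame}
    trace: List[str] = []
    for name, stage in pipeline:
        state = stage(state)
        trace.append(name)
    return state["action"], trace

traditional_action, traditional_trace = run(TRADITIONAL_VLA, "frame_0")
direct_action, direct_trace = run(LANGUAGE_FREE, "frame_0")
print(traditional_trace, traditional_action)
print(direct_trace, direct_action)
```

Because the stages run one after another, every stage on the path adds to the time between seeing and acting. The sketch makes no claim about actual latencies; it only shows that the language-free path has one fewer sequential stage and never materializes language tokens.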
