ERNIE 5.1 is best understood as an economics story. In its release, Baidu says the model inherits ERNIE 5.0’s pre-training foundation, compresses total parameters to approximately one-third and active parameters to approximately one-half, and still achieves leading foundational performance at its model scale while using only about 6% of the pre-training cost of comparable models [7]. That is why the release matters: it presents a route to strong model performance that depends less on a fresh giant training run and more on reusing, slimming, and post-training an existing foundation.
The real significance: cost-performance, not raw scale
Baidu’s announcement is not framed primarily around making ERNIE larger. It is framed around what the company says it can preserve after compression: strong foundational performance at its scale, with far less pre-training spend [7]. The ERNIE blog also says ERNIE 5.1 ranks first in China on Arena Search and improves agent, reasoning, and creative capabilities through disaggregated fully-asynchronous reinforcement learning and scaled agentic post-training [12].
For the global AI race, the strategic point is straightforward: if a lab can get close to leading performance with less pre-training compute, the advantage shifts toward training design, reuse, and post-training efficiency rather than parameter count alone. ERNIE 5.1 is important because Baidu is making that argument explicitly.
What Baidu means by the 6% figure
Read the number narrowly. Baidu’s claim is about pre-training cost compared with comparable models [7]. It is not, from the cited materials alone, an audited statement about total development cost, deployment cost, inference pricing, or hardware efficiency.
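To make the claimed ratios concrete, here is a minimal back-of-the-envelope sketch in Python. The baseline figures (the parent model’s total parameters, active parameters, and pre-training cost) are placeholder assumptions, not disclosed numbers; only the ratios themselves come from Baidu’s statements [7].

```python
# Back-of-the-envelope sketch of Baidu's stated ratios.
# Baseline values are hypothetical placeholders, not disclosed figures;
# only the ratios (~1/3 total params, ~1/2 active params, ~6% pre-training
# cost) come from the release [7].

BASELINE_TOTAL_PARAMS = 1.0e12   # hypothetical total parameters of the parent model
BASELINE_ACTIVE_PARAMS = 1.0e11  # hypothetical active (per-token) parameters
BASELINE_PRETRAIN_COST = 100.0   # hypothetical pre-training cost, arbitrary units

TOTAL_COMPRESSION = 1 / 3        # stated compression of total parameters
ACTIVE_COMPRESSION = 1 / 2       # stated compression of active parameters
PRETRAIN_COST_FRACTION = 0.06    # stated fraction of comparable pre-training cost

compressed_total = BASELINE_TOTAL_PARAMS * TOTAL_COMPRESSION
compressed_active = BASELINE_ACTIVE_PARAMS * ACTIVE_COMPRESSION
claimed_pretrain_cost = BASELINE_PRETRAIN_COST * PRETRAIN_COST_FRACTION

print(f"Total params:   {BASELINE_TOTAL_PARAMS:.2e} -> {compressed_total:.2e}")
print(f"Active params:  {BASELINE_ACTIVE_PARAMS:.2e} -> {compressed_active:.2e}")
print(f"Pre-train cost: {BASELINE_PRETRAIN_COST:.0f} -> {claimed_pretrain_cost:.0f} (same units)")

# Note: the 6% ratio scales only the pre-training line. Post-training
# (e.g. the reinforcement learning and agentic stages mentioned in the
# release), deployment, and inference costs sit outside this figure.
```

The point of the sketch is simply that the 6% figure rescales one line of the budget, pre-training, and says nothing by itself about the rest.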