One quantified analysis posted to Reddit by u/tadanada explicitly called out the cost inflation, comparing a $1,552 benchmark run for Gemini 3.5 Flash against $278 for Gemini 3 Flash—a 5.6x difference that explained why paid plans were collapsing so quickly .
Google's response came in two waves:
Even the 9x quota increase didn't fully solve the problem. Some developers reported hitting their weekly Flash lockout within 30 minutes of resuming work after the quota reset .
Gemini 3.5 Flash Low represents a more surgical fix: instead of just giving developers more raw quota (a supply-side bandage), it gave them a way to use fewer tokens per task (a demand-side control).
Google's official documentation describes the Low variant as having been "significantly improved for code and agentic tasks that require fewer steps, offering strong quality at lower latency and cost" . The company states the Low variant generates roughly 45% fewer output tokens than the now-renamed Medium variant
.
For developers, this means they can now set thinking_level: "low".
This effectively gives developers a four-tier dial for reasoning effort—minimal, low, medium, high—instead of a binary choice between "thinking on" and "thinking off" .
One of the biggest API traps in the Gemini 3.5 Flash launch was the unannounced change of the default thinking_level from high to medium. Developers who ported directly from gemini-3-flash-preview without explicitly setting a thinking level were silently getting different reasoning behavior . This meant that even after the Low variant shipped, many developers were still using more tokens than necessary for simple tasks because they hadn't noticed the default had shifted.
The Low variant essentially completes the fix: it gives developers an explicit, documented, and purpose-built level for the kind of cost-sensitive work that the Flash family was originally designed for.
The rollout of Gemini 3.5 Flash Low, combined with the 9x quota increases and default thinking level adjustment, has stabilized the Antigravity developer experience. Developers can now:
thinking_level: "low"The Low variant isn't a replacement for Google's quota increases—it's a complement. Developers who use both the new thinking level and the 9x expanded quotas can now work through meaningful coding sessions without hitting limits or burning through their monthly Antigravity budgets in an afternoon.
Comments
0 comments