The new calculation considers several factors:
In practice, this means one prompt may count far more than another depending on how demanding the request is. A simple text question might use minimal quota, while tasks such as research, coding, or media generation may consume significantly more.
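To make the idea of unequal prompt costs concrete, here is a minimal sketch of a weighted cost function. Everything in it is an assumption for illustration: the task categories, the `TASK_WEIGHTS` values, the `Request` type, and the `quota_cost` formula are hypothetical and do not reflect figures or formulas published by Google.

```python
from dataclasses import dataclass

# Hypothetical relative weights per task type. These are illustrative
# values only, not numbers published by Google.
TASK_WEIGHTS = {
    "text": 1.0,
    "coding": 4.0,
    "research": 8.0,
    "media": 20.0,
}


@dataclass
class Request:
    task: str
    input_tokens: int
    output_tokens: int


def quota_cost(req: Request, tokens_per_unit: int = 1000) -> float:
    """Return an illustrative cost: token volume scaled by a task weight."""
    units = (req.input_tokens + req.output_tokens) / tokens_per_unit
    return units * TASK_WEIGHTS[req.task]


simple = Request("text", input_tokens=200, output_tokens=300)
heavy = Request("research", input_tokens=4000, output_tokens=6000)

print(quota_cost(simple))  # small share of the quota
print(quota_cost(heavy))   # many times larger for a single prompt
```

Under these assumed weights, both requests count as one prompt each, yet the research task draws far more from the quota than the simple question.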
The new system also introduces a different rhythm for resets.
Instead of a single daily reset, users now operate within rolling windows of availability, which can allow bursts of activity but still enforce an overall weekly cap.
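One way to picture a rolling window combined with a weekly cap is the sketch below. It is not Google's implementation: the `RollingQuota` class, the five-hour window, and all limit values are hypothetical, chosen only to show how a burst can be allowed while a longer-term ceiling still applies.

```python
from collections import deque

WEEK_SECONDS = 7 * 24 * 3600


class RollingQuota:
    """Illustrative limiter: a short rolling window plus a weekly cap."""

    def __init__(self, window_seconds, window_limit, weekly_limit):
        self.window_seconds = window_seconds
        self.window_limit = window_limit
        self.weekly_limit = weekly_limit
        self.events = deque()  # (timestamp, cost) pairs from the past week

    def _used(self, now, span):
        # Sum the cost of events that happened less than `span` seconds ago.
        return sum(cost for ts, cost in self.events if now - ts < span)

    def try_spend(self, cost, now):
        # Forget events that are a week old or older.
        while self.events and now - self.events[0][0] >= WEEK_SECONDS:
            self.events.popleft()
        if self._used(now, self.window_seconds) + cost > self.window_limit:
            return False  # short-window budget exhausted
        if self._used(now, WEEK_SECONDS) + cost > self.weekly_limit:
            return False  # weekly cap reached
        self.events.append((now, cost))
        return True


HOUR = 3600
# Hypothetical limits: 100 units per 5-hour window, 250 units per week.
quota = RollingQuota(5 * HOUR, window_limit=100, weekly_limit=250)

burst = quota.try_spend(100, now=0)               # burst fits the window
too_soon = quota.try_spend(1, now=1 * HOUR)       # window still full
next_window = quota.try_spend(100, now=5 * HOUR)  # window has rolled over
over_week = quota.try_spend(100, now=10 * HOUR)   # would exceed weekly cap
```

In this sketch the window frees up gradually as old usage ages out, rather than resetting at a fixed time of day, and the weekly cap can block a request even when the short window is empty.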
According to Google’s support documentation, these changes apply to users aged 18 and older, while younger users were not initially affected by the update.
Google introduced and reshaped several consumer AI subscription tiers around the same time.
Key tiers reported around the rollout include:
Reports also noted that Google adjusted the pricing structure of its highest tier, introducing the $100 AI Ultra plan while lowering the cost of the previous top plan.
While exact quotas vary by plan and task type, Google states that paid tiers receive higher compute limits and priority access to models and features.
When users exhaust their quota, Gemini may shift interactions to smaller, faster models rather than stopping completely, allowing basic functionality to continue.
Google also supports purchasing additional AI credits for some services or upgrading to a higher subscription tier to increase available usage.
However, details about exact downgrade behavior or the precise mechanics of credit purchases vary by feature and plan.
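The fallback behavior described above can be sketched as a simple routing decision. The model names are placeholders, the `route` function is hypothetical, and the assumption that the fallback model costs no quota is made only to keep the example short; the actual downgrade rules are not specified here.

```python
def route(remaining: float, cost: float) -> tuple[str, float]:
    """Pick a model tier and return it with the quota left afterwards.

    Assumption: the fallback model does not draw from the quota.
    """
    if cost <= remaining:
        return "full-model", remaining - cost
    # Quota cannot cover the request: degrade instead of refusing.
    return "fallback-model", remaining


print(route(10.0, 4.0))  # enough quota: full model, quota reduced
print(route(3.0, 4.0))   # not enough quota: fallback, quota untouched
```

The design choice illustrated is graceful degradation: the user keeps getting answers, at lower capability, instead of hitting a hard stop.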
The rollout generated backlash from some Gemini users, particularly paid subscribers.
Several complaints center on how the new quota system affects real‑world usage:
Critics described the change as a "bait‑and‑switch" because the pricing remained similar while the mechanics of usage became stricter.
Google’s move reflects a broader trend in AI services toward compute‑ or token‑based pricing instead of simple message counts.
Under this model, platforms measure the actual computational work performed rather than the number of requests.
This approach is increasingly common across advanced AI systems because:
By tying usage limits to compute, providers can allocate resources more efficiently as AI systems become more powerful and expensive to run.
The May 2026 update marks one of the biggest structural changes to Gemini since launch. Instead of counting prompts, the platform now measures how much work each interaction requires.
For casual users, the experience may feel similar. But for heavy users, especially those running long conversations, complex prompts, or advanced tools, the new compute‑based quotas can cause limits to arrive much sooner than under the previous daily prompt caps.