Spend limits are enforced at two independent levels. An account-level spend limit caps total spend across all gateways for organizations using Unified Billing. Per-gateway rules provide finer-grained control. Whichever limit is hit first blocks subsequent requests. The limits apply both to Unified Billing requests, where Cloudflare loads credits with a 5% transaction fee, and to bring-your-own-key (BYOK) setups, provided the model's pricing is known.
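The "whichever limit is hit first" behavior can be sketched in a few lines of Python. This is an illustrative model only, not Cloudflare's implementation: the `Budget` class, the `admit_request` function, and the use of integer cents are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    """A spend cap with a running total, in integer cents (hypothetical model)."""
    limit_cents: int
    spent_cents: int = 0

    @property
    def exhausted(self) -> bool:
        # The cap counts as "hit" once recorded spend reaches the limit.
        return self.spent_cents >= self.limit_cents


def admit_request(account: Budget, gateway: Budget, cost_cents: int) -> int:
    """Return 429 if either cap is already hit, else record the cost and return 200."""
    if account.exhausted or gateway.exhausted:
        return 429
    account.spent_cents += cost_cents
    gateway.spent_cents += cost_cents
    return 200


# Assumed numbers: a $100 account cap and a tighter $10 cap on one gateway.
account = Budget(limit_cents=10_000)
gateway = Budget(limit_cents=1_000)
statuses = [admit_request(account, gateway, 400) for _ in range(4)]
```

In this sketch the gateway cap is reached after the third request, so the fourth is rejected even though the account-level cap still has plenty of headroom.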
A 429 rejection is a blunt instrument. Cloudflare's Dynamic Routing (in beta) offers a smarter alternative: falling back to cheaper models when a budget is exhausted. Routing flows can include Budget Limit nodes that enforce cost quotas and, rather than dropping the request, automatically switch to an alternative model. The same system supports Rate Limit nodes, percentage-based A/B traffic splits, and conditional branching on request metadata such as user plan or team, all without touching application code.
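To make the fallback idea concrete, here is a minimal sketch of what a Budget Limit node does conceptually. It does not use Cloudflare's actual routing configuration format; the `BudgetLimitNode` class and the model names `large-model` and `small-model` are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass
class BudgetLimitNode:
    """Hypothetical stand-in for a Budget Limit node in a routing flow."""
    budget_cents: int
    primary_model: str
    fallback_model: str
    spent_cents: int = 0

    def route(self) -> str:
        """Pick the primary model while budget remains, otherwise the fallback."""
        if self.spent_cents < self.budget_cents:
            return self.primary_model
        return self.fallback_model

    def record(self, cost_cents: int) -> None:
        """Add the cost of a completed request to the running total."""
        self.spent_cents += cost_cents


node = BudgetLimitNode(
    budget_cents=1_000,
    primary_model="large-model",   # placeholder name
    fallback_model="small-model",  # placeholder name
)
first_choice = node.route()
node.record(1_000)  # the budget is now fully spent
second_choice = node.route()
```

The point of the design is that the caller still gets an answer once the budget runs out; only the model behind the route changes.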
Perhaps the most significant announcement is a closed beta for identity-driven budgets that integrates with Cloudflare Access and an organization's existing identity provider (IdP). This addresses the persistent problem of shared API keys, where, as Cloudflare's blog puts it, "nobody knows who spent what." Per-person attribution and enforcement tied directly to corporate identity give CIOs and finance teams the same unit economics for AI that they already have for every other business line item.
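Since the feature is in closed beta, its interface is not public; the following is only a rough sketch of the underlying idea, keying spend to a user identity instead of a shared API key. The `IdentityLedger` class, the flat per-user limit, and the example email addresses are all assumptions.

```python
from collections import defaultdict


class IdentityLedger:
    """Tracks spend per identity (hypothetical; not Cloudflare's API)."""

    def __init__(self, per_user_limit_cents: int) -> None:
        self.per_user_limit_cents = per_user_limit_cents
        self.spent_cents: dict[str, int] = defaultdict(int)

    def record(self, user: str, cost_cents: int) -> bool:
        """Attribute a request to `user`; return False if that user's cap is already hit."""
        if self.spent_cents[user] >= self.per_user_limit_cents:
            return False
        self.spent_cents[user] += cost_cents
        return True


ledger = IdentityLedger(per_user_limit_cents=500)
results = [
    ledger.record("alice@example.com", 500),  # allowed, and uses up Alice's budget
    ledger.record("alice@example.com", 100),  # blocked: Alice's cap is hit
    ledger.record("bob@example.com", 100),    # allowed: Bob has his own budget
]
```

Because every charge is attributed to a named user, one person exhausting their budget does not block anyone else, and the ledger answers the "who spent what" question directly.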
This feature set is a direct response to specific market failures Cloudflare observed among its customers.
By tying spend limits to real dollars and real people, Cloudflare is betting that AI cost management will become as routine as tracking any other cloud infrastructure expense.