Spend Limits
Every API key can carry an optional monthly spend limit in US dollars. Once the key’s billed usage in the current calendar month reaches the limit, further requests with that key are refused until you raise the limit, switch to another key, or the month rolls over.
Spend limits are the blast-radius control for API keys: a leaked key, a runaway script, or an experimental integration is cut off at the budget you gave it, apart from the brief overshoot that billing lag allows.
How it works
- The limit covers everything the key does - speech synthesis, voice agents, and outbound calls - under a single dollar figure.
- Spend is measured by the same billing engine that produces your invoice, at your plan’s prices. The number the limit counts is the number you are billed.
- The window is the calendar month (UTC). Limits reset automatically at 00:00 UTC on the 1st.
- Keys without a limit are unaffected. Limits are per key, independent of your workspace balance.
Enforcement runs against billed usage, which trails live traffic by a couple of minutes. A key crossing its limit mid-burst can briefly overshoot before it is cut off.
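The calendar-month window described above can be sketched as a small helper, for example to show "resets in N days" in your own dashboard. This is only an illustration: the function name `month_window` is hypothetical and nothing here is part of the API.

```python
from datetime import datetime, timezone


def month_window(now: datetime) -> tuple[datetime, datetime]:
    """Return (window_start, next_reset) for the UTC calendar month containing `now`.

    `now` must be timezone-aware; it is converted to UTC first, because the
    limit window is defined on UTC calendar months.
    """
    now = now.astimezone(timezone.utc)
    start = now.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    # Move to the 1st of the following month, wrapping December into January.
    if start.month == 12:
        next_reset = start.replace(year=start.year + 1, month=1)
    else:
        next_reset = start.replace(month=start.month + 1)
    return start, next_reset
```

Spend recorded at or after `next_reset` counts against the new month's window.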
Setting a limit
Manage limits in the console under API Keys. Set a limit when creating a key, or open Edit on an existing key to add, raise, lower, or remove one at any time. The change takes effect immediately and the key secret never changes. Each capped key shows its month-to-date spend against the limit in the key list.
Workspace-wide monthly budget
Per-key limits bound one credential; the workspace budget bounds everything. Owners can set
a monthly USD budget in workspace settings (Settings → Workspace). Once the workspace’s billed
month-to-date spend - across every API key plus console usage - reaches the budget, new requests
and call dispatches are refused with a 402 whose error code is spend_budget_exceeded,
until the budget is raised or the month resets (1st, UTC). The two controls compose: a key stops
at its own limit even when the workspace budget has room, and the budget stops everything even
for uncapped keys.
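How the two controls compose can be modeled in a few lines. This is a toy model for reasoning about the rules, not the service's implementation: the function name `refusal_code` is hypothetical, and the order in which the two checks are reported when both are exhausted is an assumption.

```python
from typing import Optional


def refusal_code(
    key_spend: float,
    key_limit: Optional[float],
    workspace_spend: float,
    workspace_budget: Optional[float],
) -> Optional[str]:
    """Return the 402 error code that would block a new request, or None if it may proceed.

    A limit or budget of None means "not set". Amounts are month-to-date USD.
    """
    # Assumption: if both controls are exhausted, the workspace budget is reported first.
    if workspace_budget is not None and workspace_spend >= workspace_budget:
        return "spend_budget_exceeded"
    # A key stops at its own limit even when the workspace budget has room.
    if key_limit is not None and key_spend >= key_limit:
        return "spend_cap_exceeded"
    return None
```

For instance, an uncapped key (`key_limit=None`) is still refused once workspace spend reaches the budget, and a capped key is refused at its own limit while the workspace is far below its budget.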
The budget has the same webhook alerts as per-key limits: subscribe an endpoint to
workspace.spend_budget.warning (80%) and workspace.spend_budget.reached (100%). Each fires
at most once per month for a given budget value (changing the budget re-arms them); data.object
is a workspace snapshot with monthly_budget and monthly_spend, and the crossing details ride
as data.spend_budget_alert.
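A webhook consumer might turn these events into a one-line notification. The sketch below assumes the event name arrives in a top-level `type` field and that `monthly_budget` and `monthly_spend` are numeric dollar amounts; both are assumptions to check against a real delivery. The function name `summarize_budget_event` is hypothetical.

```python
from typing import Optional

BUDGET_EVENTS = {
    "workspace.spend_budget.warning": "warning",
    "workspace.spend_budget.reached": "reached",
}


def summarize_budget_event(event: dict) -> Optional[str]:
    """Return a short message for a workspace budget event, or None for any other event."""
    # Assumption: the event name is carried in a top-level "type" field.
    level = BUDGET_EVENTS.get(event.get("type", ""))
    if level is None:
        return None
    workspace = event["data"]["object"]  # workspace snapshot
    # Assumption: both fields are plain dollar amounts.
    budget = float(workspace["monthly_budget"])
    spend = float(workspace["monthly_spend"])
    pct = 100 * spend / budget if budget else 0.0
    return f"workspace budget {level}: ${spend:.2f} of ${budget:.2f} ({pct:.0f}%)"
```

Because each event fires at most once per month for a given budget value, a consumer like this can forward every match to a chat channel without its own de-duplication.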
Get warned before a key hits its limit
Workspace webhook endpoints can subscribe to two spend-limit events, a warning at 80% of the limit and a notice at 100%, so you hear about a key approaching its budget instead of discovering it through failing requests.
Each event fires at most once per key per calendar month for a given limit value; changing
the limit re-arms both thresholds against the new value. data.object is the API key
exactly as a GET returns it (including spend_cap and spend_cap_remaining), and the
crossing details ride alongside it as data.spend_cap_alert.
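The "at most once per key per month for a given limit value" rule, and the way a changed limit re-arms the alerts, can be illustrated with a toy model. This is not the service's implementation: the class name `AlertTracker` is hypothetical, and the 80% and 100% thresholds are assumed to mirror the workspace budget alerts.

```python
class AlertTracker:
    """Toy model: each threshold fires once per (key, month, limit value)."""

    # Assumption: per-key thresholds mirror the workspace budget's 80% and 100%.
    THRESHOLDS = (0.8, 1.0)

    def __init__(self) -> None:
        self._fired: set = set()

    def check(self, key_id: str, month: str, limit: float, spend: float) -> list:
        """Return the thresholds that newly fire for this observation."""
        due = []
        for threshold in self.THRESHOLDS:
            # The limit value is part of the marker, so changing the limit
            # produces fresh markers and re-arms both thresholds.
            marker = (key_id, month, limit, threshold)
            if spend >= threshold * limit and marker not in self._fired:
                self._fired.add(marker)
                due.append(threshold)
        return due
```

In this model, raising a key's limit from 100 to 200 while its spend is 170 makes the warning threshold fire again, because 170 is above 80% of the new value.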
When a key hits its limit
Requests and new call dispatches with the key fail with HTTP 402 and the error code
spend_cap_exceeded.
Handle it distinctly from payment_required: payment_required means the workspace balance
needs a top-up; spend_cap_exceeded means this specific key hit the budget you set for it, and
raising the key’s limit in the console unblocks it immediately.
In-flight calls are never interrupted. The limit gates new requests and new call dispatches only.
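Client code can branch on the error code so that each refusal reaches the right remedy. The sketch below works on an error code that has already been extracted from the response, since the response body layout is not assumed here; the function name `remediation` and the returned action strings are hypothetical.

```python
# Map each billing-related error code from this page to the action that clears it.
REMEDIES = {
    # This specific key hit the limit set for it.
    "spend_cap_exceeded": "raise this key's limit in the console, or switch to another key",
    # The workspace-wide monthly budget is exhausted.
    "spend_budget_exceeded": "raise the workspace budget in Settings, or wait for the month to reset",
    # The workspace balance needs a top-up.
    "payment_required": "top up the workspace balance",
}


def remediation(error_code: str) -> str:
    """Return a human-readable next step for a refused request's error code."""
    return REMEDIES.get(error_code, "unrecognized error code; do not retry blindly")
```

None of these refusals clears on its own before the month resets, so retrying in a tight loop only adds load. Surface the remedy to an operator instead.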