Spend Limits

Cap what any API key can spend each month

Every API key can carry an optional monthly spend limit in US dollars. Once the key’s billed usage in the current calendar month reaches the limit, further requests with that key are refused until you raise the limit, switch to another key, or the month rolls over.

Spend limits are the blast-radius control for API keys: a leaked key, a runaway script, or an experimental integration is held to the budget you gave it, apart from a brief overshoot while enforcement catches up with billed usage.

How it works

  • The limit covers everything the key does (speech synthesis, voice agents, and outbound calls) as a single dollar figure.
  • Spend is measured by the same billing engine that produces your invoice, at your plan’s prices. The number the limit counts is the number you are billed.
  • The window is the calendar month (UTC). Limits reset automatically at 00:00 UTC on the 1st.
  • Keys without a limit are unaffected. Limits are per key, independent of your workspace balance.

Enforcement runs against billed usage, which trails live traffic by a couple of minutes. A key crossing its limit mid-burst can briefly overshoot before it is cut off.
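The calendar-month window and the overshoot behaviour described above can be sketched in a few lines. This is only an illustration: `next_reset` and `remaining` are hypothetical helper names, not part of any SDK, and the clamp at zero is an assumption about how a client might choose to display an overshoot.

```python
from datetime import datetime, timezone


def next_reset(now: datetime) -> datetime:
    """Return the first 00:00 UTC on the 1st of a month strictly after `now`.

    `now` must be timezone-aware; it is converted to UTC before the
    month arithmetic so local offsets cannot shift the boundary.
    """
    now = now.astimezone(timezone.utc)
    if now.month == 12:
        year, month = now.year + 1, 1
    else:
        year, month = now.year, now.month + 1
    return datetime(year, month, 1, tzinfo=timezone.utc)


def remaining(limit_usd: float, billed_usd: float) -> float:
    """Budget left this month, clamped at 0 (assumption: billed usage can
    briefly exceed the limit because enforcement trails live traffic)."""
    return max(0.0, limit_usd - billed_usd)
```

For example, a timestamp of 20 July 2026 15:30 UTC maps to a reset at 1 August 2026 00:00 UTC, which matches the `resets_at` format used in the webhook payloads.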

Setting a limit

Manage limits in the console under API Keys: set a limit when creating a key, or open Edit on an existing key to add, raise, lower, or remove one at any time. The change takes effect immediately and the key secret never changes. Each capped key shows its month-to-date spend against the limit in the key list.

Workspace-wide monthly budget

Per-key limits bound one credential; the workspace budget bounds everything. Owners can set a monthly USD budget in workspace settings (Settings → Workspace). Once the workspace’s billed month-to-date spend, summed across every API key plus console usage, reaches the budget, new requests and call dispatches are refused with HTTP 402 and the error code spend_budget_exceeded until the budget is raised or the month resets (00:00 UTC on the 1st). The two controls compose: a key stops at its own limit even when the workspace budget has room, and the budget stops everything, including uncapped keys.
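How the two controls compose can be modelled as two independent checks. The sketch below is a mental model rather than the service's implementation: `blocking_codes` is a hypothetical name, `None` stands for "no limit set", and because the text does not say which code is returned when both limits are exhausted, the function simply lists every code that applies.

```python
from typing import List, Optional


def blocking_codes(
    key_spend: float,
    key_limit: Optional[float],
    workspace_spend: float,
    workspace_budget: Optional[float],
) -> List[str]:
    """Return the error codes that would block a new request.

    An empty list means the request is allowed. None means the
    corresponding control is not configured.
    """
    codes = []
    # Per-key limit: applies even when the workspace budget has room.
    if key_limit is not None and key_spend >= key_limit:
        codes.append("spend_cap_exceeded")
    # Workspace budget: applies to every key, capped or not.
    if workspace_budget is not None and workspace_spend >= workspace_budget:
        codes.append("spend_budget_exceeded")
    return codes
```

A key at $50 of a $50 limit is blocked even if the workspace has spent $120 of $500, and an uncapped key is blocked as soon as the workspace reaches its budget.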

The budget has the same webhook alerts as per-key limits: subscribe an endpoint to workspace.spend_budget.warning (80%) and workspace.spend_budget.reached (100%). Each fires at most once per month for a given budget value (changing the budget re-arms them); data.object is a workspace snapshot with monthly_budget and monthly_spend, and the crossing details ride as data.spend_budget_alert.
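The "at most once per month for a given budget value" rule can be pictured as a latch keyed on the month, the budget value, and the threshold. The class below is a hypothetical model of that documented behaviour, useful for reasoning about which notifications to expect; it is not a description of how the service stores its state.

```python
from typing import List, Set, Tuple

THRESHOLDS = (80, 100)  # the warning and reached percentages named in the text


class AlertLatch:
    """Remembers which (month, budget value, threshold) alerts already fired."""

    def __init__(self) -> None:
        self._fired: Set[Tuple[str, float, int]] = set()

    def crossings(self, month: str, budget: float, spend: float) -> List[int]:
        """Return the thresholds newly crossed for this month and budget value."""
        new = []
        for pct in THRESHOLDS:
            key = (month, budget, pct)
            if spend >= budget * pct / 100 and key not in self._fired:
                self._fired.add(key)  # latch: never fire this combination again
                new.append(pct)
        return new
```

Because the budget value is part of the latch key, raising the budget from 100 to 200 starts with both thresholds armed again, and a new month does the same.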

Get warned before a key hits its limit

Workspace webhook endpoints can subscribe to two spend-limit events, so you hear about a key approaching its limit instead of discovering it through failing requests:

Event                        Fires when
api_key.spend_cap.warning    The key’s billed spend this month crosses 80% of its limit
api_key.spend_cap.reached    The key’s billed spend this month reaches 100% of its limit

Each event fires at most once per key per calendar month for a given limit value; changing the limit re-arms both thresholds against the new value. data.object is the API key exactly as a GET returns it (including spend_cap and spend_cap_remaining), and the crossing details ride alongside it as data.spend_cap_alert:

{
  "type": "api_key.spend_cap.warning",
  "data": {
    "object": {
      "id": "key_...",
      "name": "prod-mobile",
      "api_key": "sk_...cdef",
      "scopes": ["audio:all"],
      "spend_cap": 50,
      "spend_cap_remaining": 7.5,
      "created_at": "2026-05-14T09:12:00Z"
    },
    "spend_cap_alert": {
      "threshold_percent": 80,
      "spend": 42.5,
      "resets_at": "2026-08-01T00:00:00Z"
    }
  }
}
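A receiving endpoint might turn this payload into a one-line notification along the following lines. This is a minimal sketch that reads only the fields shown in the sample above; `summarize_spend_cap_event` is a hypothetical name, and signature verification and HTTP plumbing are deliberately left out because the text does not describe them.

```python
import json
from typing import Optional

SPEND_CAP_EVENTS = ("api_key.spend_cap.warning", "api_key.spend_cap.reached")


def summarize_spend_cap_event(raw_body: str) -> Optional[str]:
    """Build a human-readable summary for a spend-cap event, or None for other events."""
    event = json.loads(raw_body)
    if event.get("type") not in SPEND_CAP_EVENTS:
        return None
    key = event["data"]["object"]             # the API key, as a GET returns it
    alert = event["data"]["spend_cap_alert"]  # details of the threshold crossing
    return (
        f"{key['name']}: ${alert['spend']:.2f} of ${key['spend_cap']:.2f} spent "
        f"({alert['threshold_percent']}% threshold), "
        f"${key['spend_cap_remaining']:.2f} left, resets {alert['resets_at']}"
    )
```

Returning None for unrelated event types lets the same endpoint stay subscribed to other events without special handling.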

When a key hits its limit

Requests and new call dispatches with the key fail with HTTP 402 and the error code spend_cap_exceeded:

{
  "error": {
    "code": "spend_cap_exceeded",
    "message": "This API key has reached its monthly spend cap. Raise the cap or use a different key; the cap resets on Aug 1 (UTC)."
  },
  "request_id": "..."
}

Handle it distinctly from payment_required: payment_required means the workspace balance needs a top-up; spend_cap_exceeded means this specific key hit the limit you set for it, and raising the key’s limit in the console unblocks it immediately.
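In client code, that distinction usually comes down to branching on the error code rather than on the status alone. The sketch below assumes that payment_required and spend_budget_exceeded arrive in the same `error.code` envelope as the spend_cap_exceeded sample above, which the text does not confirm; `next_step` and the returned action labels are hypothetical.

```python
from typing import Any, Mapping


def next_step(body: Mapping[str, Any]) -> str:
    """Map a parsed error response body to the action that unblocks the caller."""
    error = body.get("error") or {}
    code = error.get("code")
    if code == "spend_cap_exceeded":
        # This key hit its own limit: raise it in the console or use another key.
        return "raise_key_limit_or_switch_key"
    if code == "spend_budget_exceeded":
        # The workspace-wide monthly budget is exhausted.
        return "raise_workspace_budget"
    if code == "payment_required":
        # The workspace balance needs a top-up.
        return "top_up_workspace_balance"
    return "unhandled_error"
```

None of these three conditions clears on an immediate retry, so a client would typically surface the action to an operator instead of backing off and retrying.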

In-flight calls are never interrupted. The limit gates new requests and new call dispatches only.