API Limits

Character limits, rate limits, and concurrency limits

Character limits

| Endpoint | Limit | Use case |
| --- | --- | --- |
| /v1/audio/speech | 2,000 characters | Short-form text (sentences, paragraphs) |
| /v1/audio/stream | 20,000 characters | Long-form text (articles, chapters) |

Character counts include SSML tags. For text longer than the limit, split it into multiple requests.
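As a rough illustration of the limits above, a client could check the text length before sending and route the request accordingly. The helper name `pick_endpoint` is hypothetical, and the sketch assumes that one Python character (code point) counts as one character toward the limit, which the documentation does not spell out.

```python
# Limits taken from the character-limit table; the helper itself is hypothetical.
SPEECH_LIMIT = 2_000    # /v1/audio/speech
STREAM_LIMIT = 20_000   # /v1/audio/stream


def pick_endpoint(text: str) -> str:
    """Return the endpoint whose character limit fits `text`.

    Assumption: len(text) approximates the server-side count. SSML tags are
    part of `text`, so they are included in the count.
    """
    n = len(text)
    if n <= SPEECH_LIMIT:
        return "/v1/audio/speech"
    if n <= STREAM_LIMIT:
        return "/v1/audio/stream"
    raise ValueError(
        f"{n} characters exceeds the {STREAM_LIMIT}-character limit; split the text"
    )
```

Text that raises `ValueError` here needs to be split into multiple requests before it is sent.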

Rate limits

Rate limits differ by product because the workloads differ. Build audio is cost-per-call. Agents is chatty interactive traffic.

Build audio

Applies to /v1/audio/speech and /v1/audio/stream.

| Plan | Sustained requests per second |
| --- | --- |
| Free | 1 |
| Paid | 20 |

Agents

Applies to every Agents endpoint under /v1/agents/*: agents, conversations, knowledge bases, tools, tests, memories, audio assets, batch calls, IVR menus, and telephony.

| Plan | Sustained requests per second | Burst |
| --- | --- | --- |
| Free | 5 | 30 |
| Paid | 20 | 60 |

Burst is the peak bucket capacity. A fresh bucket absorbs the burst in a single second, then refills at the sustained rate. This lets a console page load or batch operation fire many parallel requests without hitting 429, while still capping long-running abuse at the sustained rate.
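The burst behavior described above matches a token bucket. The following is a minimal client-side model of that idea, not the server's actual implementation; the class name `TokenBucket` and its parameters are illustrative, and the injectable `clock` exists only to make the behavior easy to simulate.

```python
import time


class TokenBucket:
    """Client-side model of a token bucket (illustrative only)."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate            # tokens added per second (sustained rate)
        self.capacity = capacity    # maximum tokens held (burst)
        self.clock = clock
        self.tokens = capacity      # a fresh bucket starts full
        self.updated = clock()

    def try_acquire(self) -> bool:
        """Take one token if available; False is where the API would return 429."""
        now = self.clock()
        # Refill for the elapsed time, never beyond capacity.
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

With `rate=5` and `capacity=30` (the Free Agents values), a fresh bucket admits 30 back-to-back requests, rejects further ones, and admits 5 more after one second has passed. Gating outgoing calls on such a bucket is one way to stay under the limit instead of reacting to 429 responses.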

Concurrency limits

Concurrency limits cap the number of simultaneous in-flight requests per account.

Build audio

Applies to /v1/audio/speech and /v1/audio/stream.

| Plan | Simultaneous requests |
| --- | --- |
| Free | 1 |
| Paid | 15 |

Agents

Applies to the authenticated Agents endpoints listed above. The primary target is POST /v1/agents/{id}/conversations, which allocates a live-call session.

| Plan | Simultaneous requests |
| --- | --- |
| Free | 10 |
| Paid | 30 |

All limits apply per account, not per API key.
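Because the cap is per account, a client that fans out work in parallel may want a single shared limiter regardless of which API key each call uses. The sketch below uses a semaphore for that; `call_with_limit` and `run_all` are hypothetical helper names, and the value 15 is the Paid Build audio limit. It only bounds requests made from one process, so several processes or machines sharing an account would need their own coordination.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Paid-plan Build audio limit from the concurrency table; adjust for your plan.
MAX_IN_FLIGHT = 15

# One semaphore per account, shared by every API key this process uses.
_in_flight = threading.BoundedSemaphore(MAX_IN_FLIGHT)


def call_with_limit(fn, *args, **kwargs):
    """Run `fn` while holding a slot, so at most MAX_IN_FLIGHT calls overlap."""
    with _in_flight:
        return fn(*args, **kwargs)


def run_all(fn, items, workers=32):
    """Apply `fn` to each item in parallel, bounded by the shared semaphore."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda item: call_with_limit(fn, item), items))
```

Here `fn` stands for whatever function issues the request. Results come back in input order even though the calls overlap.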

Handling 429 responses

When you exceed rate or concurrency limits, the API returns 429 Too Many Requests with a Retry-After header.

import time
from speechify import Speechify

client = Speechify()

def generate_with_retry(text, max_retries=3):
    """Generate speech, backing off exponentially on 429 responses."""
    for attempt in range(max_retries):
        try:
            return client.audio.speech(
                input=text,
                voice_id="george",
                audio_format="mp3",
            )
        except Exception as e:
            # Retry only on 429, and only while attempts remain.
            if "429" in str(e) and attempt < max_retries - 1:
                time.sleep(2 ** attempt)  # wait 1 s, then 2 s
            else:
                raise
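The retry example above waits a fixed exponential delay. Since 429 responses carry a Retry-After header, a client could prefer that value when it is available. The helper below is a sketch of that idea: `retry_delay` is a hypothetical name, and how the header is obtained from the SDK's exception is not covered here, so the caller is assumed to pass the raw header value (or `None`). HTTP allows Retry-After to be either a number of seconds or an HTTP-date; which form this API sends is not stated, so both are handled.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_delay(retry_after, attempt, now=None):
    """Return the number of seconds to wait before the next attempt.

    Uses the Retry-After value when it parses; otherwise falls back to
    exponential backoff (2 ** attempt seconds).
    """
    fallback = float(2 ** attempt)
    if not retry_after:
        return fallback
    value = retry_after.strip()
    if value.isdigit():
        # Delta-seconds form, e.g. "7".
        return float(value)
    try:
        # HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return fallback
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())
```

Inside a retry loop, `time.sleep(retry_delay(header_value, attempt))` would then replace the fixed `time.sleep(2 ** attempt)`.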

Processing long texts

For texts exceeding 20,000 characters, split into chunks and process sequentially:

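import re

def split_text(text, max_chars=19000):
    """Split text at sentence boundaries within the character limit."""
    chunks = []
    current = ""
    # Split after each ". " and keep the delimiter attached to its sentence,
    # so no extra period is appended to the final sentence.
    for sentence in re.split(r"(?<=\. )", text):
        # Start a new chunk only when the current one is non-empty; this
        # avoids emitting an empty chunk when the first sentence is long.
        if current and len(current) + len(sentence) > max_chars:
            chunks.append(current.strip())
            current = sentence
        else:
            current += sentence
    if current.strip():
        chunks.append(current.strip())
    # Note: a single sentence longer than max_chars is still returned whole.
    return chunks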

FAQ

If a request exceeds the character limit, it is rejected with an error response. Split your text into smaller chunks within the allowed limits.

For higher limits, upgrade to a paid plan: 20 req/sec on Build audio (with 15 concurrent requests) and 20 req/sec with a burst of 60 on Agents endpoints. Enterprise customers can request custom limits by contacting sales.

To monitor your usage, use the Speechify Console dashboard.