Speechify Text to Speech Quickstart — Your First API Call

Get your API key

Sign up at console.speechify.ai
Go to API Keys
Copy your default API key (or create a new one)

Set it as an environment variable so the SDKs pick it up automatically:

$ export SPEECHIFY_API_KEY="your-api-key-here"

API keys are sensitive. Never expose them in client-side code or public repositories. See the Authentication guide for security best practices.
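If you call the API from your own script rather than through an SDK, one way to keep the key out of source code is to read it from the same environment variable. The helper below is a minimal sketch; `load_api_key` is a hypothetical name, not part of any Speechify SDK.

```python
import os


def load_api_key(env_var: str = "SPEECHIFY_API_KEY") -> str:
    """Return the API key from the environment, or fail with a clear message."""
    # Hypothetical helper: the variable name matches the export command above.
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"{env_var} is not set; export it before running this script."
        )
    return key
```

Failing early with a clear error tends to be easier to debug than an authentication failure later in the request.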

Install the SDK

Python

$ pip install speechify-api

Prefer raw HTTP? No install is needed; use the cURL examples below.

Generate speech

Send text to POST /v1/audio/speech. These examples are generated from our Fern SDKs and the API spec, so they switch languages and stay in sync with the live endpoint:

POST /v1/audio/speech

curl -X POST https://api.speechify.ai/v1/audio/speech \
     -H "Authorization: Bearer <token>" \
     -H "Content-Type: application/json" \
     -d '{
  "input": "Hello! This is the Speechify text-to-speech API.",
  "voice_id": "george",
  "audio_format": "mp3",
  "model": "simba-english"
}'
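The same request can be sketched in Python using only the standard library. This mirrors the cURL call above and does not use the Speechify SDK; the function names `build_speech_request` and `synthesize` are hypothetical, and the defaults are copied from the cURL example.

```python
import json
import urllib.request

API_URL = "https://api.speechify.ai/v1/audio/speech"


def build_speech_request(
    api_key: str,
    text: str,
    voice_id: str = "george",
    audio_format: str = "mp3",
    model: str = "simba-english",
) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the cURL example."""
    body = {
        "input": text,
        "voice_id": voice_id,
        "audio_format": audio_format,
        "model": model,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize(api_key: str, text: str) -> dict:
    """Send the request and return the parsed JSON response body."""
    request = build_speech_request(api_key, text)
    # urlopen raises urllib.error.HTTPError on non-2xx responses.
    with urllib.request.urlopen(request, timeout=30) as resp:
        return json.load(resp)
```

Separating request construction from sending makes it possible to inspect the payload without spending characters against your quota.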


A successful call returns the audio payload:

Response

{
  "audio_data": "string",
  "audio_format": "wav",
  "billable_characters_count": 1,
  "speech_marks": {
    "chunks": [
      {
        "end": 1,
        "end_time": 1.1,
        "start": 1,
        "start_time": 1.1,
        "type": "string",
        "value": "string"
      }
    ],
    "end": 1,
    "end_time": 1.1,
    "start": 1,
    "start_time": 1.1,
    "type": "string",
    "value": "string"
  }
}

The Python and TypeScript SDKs return decoded audio bytes. The raw HTTP response base64-encodes the audio in the audio_data field, so decode it before saving.
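For the raw HTTP case, the decoding step might look like the sketch below. It assumes `response_json` is the parsed JSON body shown above; `save_audio` is a hypothetical helper name.

```python
import base64


def save_audio(response_json: dict, path: str = "output.mp3") -> int:
    """Decode the base64 audio_data field, write it to path, return the byte count."""
    # audio_data is base64 text in the raw HTTP response, so decode it first.
    audio_bytes = base64.b64decode(response_json["audio_data"])
    with open(path, "wb") as f:
        f.write(audio_bytes)
    return len(audio_bytes)
```

Pick a file extension that matches the audio_format you requested, since the function writes the bytes unchanged.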

Save and play

Assign the call above to response, then write the audio to output.mp3:

Python

with open("output.mp3", "wb") as f:
    f.write(response.audio_data)

Then play it from the terminal (afplay ships with macOS; on other systems, use any audio player):

$ afplay output.mp3

Choose a voice

List the built-in voices to find one that fits, then pass its id as the voice_id:

GET /v1/voices

curl https://api.speechify.ai/v1/voices \
     -H "Authorization: Bearer <token>"
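A standard-library Python equivalent of the request above could look like this. The response schema is not shown in this quickstart, so the sketch returns the parsed JSON as-is instead of assuming field names; `build_voices_request` and `list_voices` are hypothetical names.

```python
import json
import urllib.request

VOICES_URL = "https://api.speechify.ai/v1/voices"


def build_voices_request(api_key: str) -> urllib.request.Request:
    """Build (but do not send) the GET request shown in the cURL example."""
    return urllib.request.Request(
        VOICES_URL,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def list_voices(api_key: str):
    """Send the request and return the parsed JSON body without interpreting it."""
    with urllib.request.urlopen(build_voices_request(api_key), timeout=30) as resp:
        return json.load(resp)
```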


Popular built-in voices: george, henry, carly, sabrina. You can also clone any voice from a short audio sample.

Add emotion

Use SSML to control how the voice sounds — pass it as the input parameter and the API detects it automatically:

<speak>
  <speechify:style emotion="cheerful">
    Great news! Your order has been shipped!
  </speechify:style>
</speak>

SSML also controls pitch, rate, pauses, and emphasis. See SSML and Emotion Control for the full reference.
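When the text comes from user input, it is usually worth escaping XML special characters before wrapping it in SSML. The sketch below builds the same structure as the example above; `style_ssml` is a hypothetical helper, and it assumes the API expects standard XML escaping inside SSML.

```python
from xml.sax.saxutils import escape, quoteattr


def style_ssml(text: str, emotion: str = "cheerful") -> str:
    """Wrap text in <speak> and <speechify:style>, escaping XML special characters."""
    return (
        "<speak>"
        f"<speechify:style emotion={quoteattr(emotion)}>"
        f"{escape(text)}"  # turns &, <, > into entities
        "</speechify:style>"
        "</speak>"
    )
```

The returned string is what you would pass as the input parameter.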

Next steps

Stream audio

Process up to 20,000 characters with real-time audio streaming

Clone a voice

Create a custom voice from a 10-30 second audio sample

Control speech

Use SSML for pitch, rate, pauses, and emphasis

API Reference

Full endpoint documentation with request/response schemas