SpeechifyAI Build: TTS, Voices, Streaming, and SSML

Your first request

POST /v1/audio/speech

curl -X POST https://api.speechify.ai/v1/audio/speech \
     -H "Authorization: Bearer <token>" \
     -H "Content-Type: application/json" \
     -d '{
  "input": "Hello! This is the Speechify text-to-speech API.",
  "voice_id": "geffen_32",
  "audio_format": "mp3",
  "model": "simba-3.2"
}'

Ready to run it end to end? The Quickstart walks you through your first call: get a key, install the SDK, generate speech, and play it.

Grab an API key at platform.speechify.ai/api-keys and set SPEECHIFY_API_KEY so the SDKs authenticate automatically.

Set up

Install an SDK

Run pip install speechify-api for Python or npm install @speechify/api for TypeScript. Both SDKs read SPEECHIFY_API_KEY from the environment automatically.

Authenticate

A single API key, sent as Authorization: Bearer <token>, works for every endpoint. Manage and rotate keys in the console.
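
For readers who prefer not to install an SDK, the curl request from the top of this page can be reproduced with the Python standard library. The sketch below only builds the request. The endpoint, headers, and body fields are copied from that curl example, while the function name build_speech_request is a hypothetical helper and not part of any Speechify SDK. The response format is not described on this page, so the sketch does not parse it.

```python
import json
import os
import urllib.request

API_URL = "https://api.speechify.ai/v1/audio/speech"  # endpoint from the curl example


def build_speech_request(text, voice_id="geffen_32", model="simba-3.2", audio_format="mp3"):
    """Build (but do not send) a text-to-speech request with Bearer auth.

    Hypothetical helper for illustration; defaults mirror the curl example.
    """
    api_key = os.environ["SPEECHIFY_API_KEY"]  # raises KeyError if the key is not set
    body = {
        "input": text,
        "voice_id": voice_id,
        "audio_format": audio_format,
        "model": model,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen sends the call. See the API Reference for the response shape before decoding the result.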

Build With Speech

Streaming

Start playback before the full audio is generated. Up to 20,000 characters per request.
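
Text longer than the 20,000-character limit has to be split across several requests. The helper below is a minimal sketch of one way to do that, breaking at spaces where possible. It assumes the limit counts characters the way Python's len does, which this page does not confirm, and split_for_streaming is a hypothetical name.

```python
MAX_CHARS = 20_000  # per-request limit quoted above


def split_for_streaming(text, limit=MAX_CHARS):
    """Split text into chunks of at most `limit` characters, preferring breaks at spaces."""
    chunks = []
    remaining = text.strip()
    while len(remaining) > limit:
        # Find the last space that keeps the chunk within the limit;
        # fall back to a hard cut when the window has no space at all.
        cut = remaining.rfind(" ", 0, limit + 1)
        if cut <= 0:
            cut = limit
        chunks.append(remaining[:cut].rstrip())
        remaining = remaining[cut:].lstrip()
    if remaining:
        chunks.append(remaining)
    return chunks
```

Each chunk can then be sent as the input of its own request and played back in order.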

Voice cloning

Clone any voice from a 10-30 second sample. Cloned voices work across every supported language.
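
It can help to check a sample's length locally before uploading it. The sketch below uses Python's wave module to test whether a recording falls within the 10-30 second window. It assumes the sample is a WAV file purely for illustration, since this page does not list the accepted audio formats, and sample_duration_ok is a hypothetical helper.

```python
import wave


def sample_duration_ok(wav_file, min_seconds=10.0, max_seconds=30.0):
    """Return True if a WAV file's duration is within the 10-30 second window.

    `wav_file` may be a path or a binary file-like object.
    """
    with wave.open(wav_file, "rb") as wav:
        duration = wav.getnframes() / wav.getframerate()
    return min_seconds <= duration <= max_seconds
```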

SSML and emotion

Fine-grained control over pitch, rate, pauses, emphasis, and 13 emotion presets.
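
As an illustration of what pitch, rate, pause, and emphasis control can look like, the sketch below assembles a small SSML string from standard W3C SSML elements (speak, prosody, break, emphasis). Which elements and attribute values Speechify accepts is an assumption here, and the emotion presets are left out because this page does not show their syntax. All function names are hypothetical.

```python
from xml.sax.saxutils import escape, quoteattr


def prosody(text, rate=None, pitch=None):
    """Wrap text in a <prosody> element with optional rate and pitch attributes."""
    attrs = "".join(
        f" {name}={quoteattr(value)}"
        for name, value in (("rate", rate), ("pitch", pitch))
        if value is not None
    )
    return f"<prosody{attrs}>{escape(text)}</prosody>"


def pause(milliseconds):
    """Return a <break> element for a pause of the given length."""
    return f'<break time="{int(milliseconds)}ms"/>'


def emphasis(text, level="moderate"):
    """Wrap text in an <emphasis> element."""
    return f"<emphasis level={quoteattr(level)}>{escape(text)}</emphasis>"


def speak(*parts):
    """Join already-built SSML fragments into a <speak> document."""
    return "<speak>" + "".join(parts) + "</speak>"
```

For example, speak(prosody("Tom & Jerry", rate="slow"), pause(300), emphasis("now")) escapes the ampersand and returns a single string that could be sent as the request input, assuming the endpoint accepts SSML there.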

Speech marks

Word-level timestamps for highlighting, captions, and audio-text sync.
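
To show how word-level timestamps drive highlighting, the sketch below looks up the word being spoken at a given playback position. The mark shape used here (a list of dicts with value, start_time, and end_time in milliseconds, sorted by start_time) is a hypothetical stand-in, so check the API Reference for the real field names before adapting it.

```python
from bisect import bisect_right


def word_at(marks, position_ms):
    """Return the mark being spoken at position_ms, or None between words.

    Assumes `marks` is sorted by "start_time" (hypothetical field names).
    """
    starts = [m["start_time"] for m in marks]
    i = bisect_right(starts, position_ms) - 1  # last mark starting at or before the position
    if i >= 0 and position_ms < marks[i]["end_time"]:
        return marks[i]
    return None
```

A player would call word_at on each time update and highlight the returned word, clearing the highlight when the result is None.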

Integrations

Speechify voices drop into every major voice-agent platform. Native plugins where they exist, an open-source tts-shims proxy where they don’t.

LiveKit

Add speechify.TTS(...) to a LiveKit AgentSession via the official Python plugin.

Vapi

Serve Speechify PCM audio to Vapi’s custom voice via the tts-shims Vapi provider.

Deepgram

Point Deepgram Voice Agent’s open_ai speak provider at the tts-shims shim.

See the Integrations overview for the platform picker.

Models and languages

Model	Best for	Languages	Highlights
`simba-3.2`	Recommended for new English integrations	English	Lowest TTFB, richest expressivity; the recommended Simba 3 model
`simba-3.0`	Streaming-native beyond English	English + 6 European	German, Spanish, French, Italian and Brazilian Portuguese; set `language` to pick one
`simba-multilingual`	Multilingual and mixed-language input	30+	Same voice IDs across every language, no separate cloning required
`simba-english`	Default English (retained for cloned voices)	English	Current API default when `model` is omitted; the only English model that supports cloned/personal voices

See Models and Language Support for the full matrix.
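
The table above can be read as a simple decision rule. The sketch below encodes one interpretation of it. The lowercase language names are illustrative keys, not the values the language parameter expects, and choose_model is a hypothetical helper. The table states cloned-voice support only for English, so the cloned_voice flag is ignored for other languages.

```python
# Languages the table lists for simba-3.0, keyed by illustrative lowercase names.
SIMBA_3_0_LANGUAGES = {"german", "spanish", "french", "italian", "brazilian portuguese"}


def choose_model(language="english", cloned_voice=False):
    """Pick a model name from the table; a reading of the table, not an official rule."""
    language = language.strip().lower()
    if language == "english":
        # simba-english is the only English model listed as supporting cloned voices.
        return "simba-english" if cloned_voice else "simba-3.2"
    if language in SIMBA_3_0_LANGUAGES:
        return "simba-3.0"
    # Everything else falls through to the 30+ language model.
    return "simba-multilingual"
```

Passing the model explicitly in every request avoids relying on the API default, which the table lists as simba-english.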

Resources

API Reference

Endpoint schemas, parameters, and response shapes.

Examples

End-to-end demo projects on GitHub.

Console

Manage API keys, voices, and billing.