Text to Speech API

Lifelike speech in 50+ languages from a single API call. Stream long-form audio, clone any voice from a 10-30 second sample, and control delivery with SSML.

Your first request

POST /v1/audio/speech

curl -X POST https://api.speechify.ai/v1/audio/speech \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "input": "Hello! This is the Speechify text-to-speech API.",
    "voice_id": "george",
    "audio_format": "mp3",
    "model": "simba-english"
  }'

Ready to run it end to end? The Quickstart walks you through your first call — get a key, install the SDK, generate speech, and play it — in about five minutes.

Grab an API key at console.speechify.ai/api-keys and set SPEECHIFY_API_KEY so the SDKs authenticate automatically.
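
For readers who are not using an SDK, the same request can be sketched with Python's standard library alone. This is a minimal illustration, not the official client: the helper names `build_speech_request` and `send` are hypothetical, and the only details taken from this page are the endpoint, the headers, the four body fields, and the SPEECHIFY_API_KEY variable. The response schema is not described here, so `send` returns the raw bytes without interpreting them.

```python
import json
import os
import urllib.request

API_URL = "https://api.speechify.ai/v1/audio/speech"


def build_speech_request(text, voice_id="george", audio_format="mp3",
                         model="simba-english", api_key=None):
    """Build (but do not send) the POST request from the curl example.

    Hypothetical helper for illustration; not part of any Speechify SDK.
    """
    # Fall back to the SPEECHIFY_API_KEY environment variable.
    api_key = api_key or os.environ.get("SPEECHIFY_API_KEY")
    if not api_key:
        raise RuntimeError("Set SPEECHIFY_API_KEY or pass api_key explicitly")

    # Same four fields as the curl request body.
    payload = {
        "input": text,
        "voice_id": voice_id,
        "audio_format": audio_format,
        "model": model,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and return the raw response body as bytes."""
    # The response format is an open question in this sketch, so the
    # body is returned uninterpreted for the caller to handle.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

Calling `send(build_speech_request("Hello!"))` performs the network call. Keeping request construction separate from sending makes the payload easy to inspect or unit-test without an API key that works.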

Models and languages

| Model | Best for | Languages | Highlights |
|---|---|---|---|
| simba-english | Flagship English quality | English | Highest quality, lowest streaming latency, full SSML + emotion control |
| simba-multilingual | Multilingual and mixed-language input | 50+ | Same voice IDs across every language, no separate cloning required |

See Models and Language Support for the full matrix.
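
One way to apply the model table in client code is a small selection rule: use simba-english when the input is English only, and simba-multilingual otherwise. The sketch below is an assumption-laden illustration. The function `pick_model` is hypothetical, and it assumes your application already knows the input languages as BCP 47 style codes such as "en-US"; nothing on this page says the API itself accepts language codes.

```python
def pick_model(language_codes):
    """Return a model name for the languages present in the input text.

    Hypothetical helper: the rule mirrors the "Best for" column of the
    table and is not an official recommendation.
    """
    # Reduce codes such as "en-US" or "EN" to the base language "en".
    bases = {code.split("-")[0].lower() for code in language_codes}
    if not bases:
        raise ValueError("At least one language code is required")

    # English-only input goes to the flagship English model; anything
    # else, including mixed-language input, goes to the multilingual one.
    if bases == {"en"}:
        return "simba-english"
    return "simba-multilingual"
```

For example, `pick_model(["en-US"])` selects simba-english, while `pick_model(["en", "es"])` selects simba-multilingual because the input mixes languages.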
