Models
Choose the right text-to-speech model for your use case
Available models
Pass the model ID as the model_id parameter in your API calls. If omitted, the API defaults to simba-english.
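As a minimal sketch of how the parameter might be set, the following Python builds a request body with and without model_id. The helper name build_tts_payload and the text and voice field names are illustrative assumptions, not the documented request schema. Only model_id and the simba-english default come from the text above.

```python
from typing import Optional


def build_tts_payload(text: str, voice: str, model_id: Optional[str] = None) -> dict:
    """Build a request body for a text-to-speech call (hypothetical schema)."""
    # "text" and "voice" are assumed field names, used for illustration only.
    payload = {"text": text, "voice": voice}
    # Include model_id only when the caller sets it. When it is absent,
    # the API is documented to fall back to simba-english.
    if model_id is not None:
        payload["model_id"] = model_id
    return payload


explicit = build_tts_payload("Hello, world.", "example-voice", model_id="simba-english")
implicit = build_tts_payload("Hello, world.", "example-voice")
print(explicit)
print(implicit)
```

Setting model_id explicitly, even to the default, keeps behavior stable if the default ever changes.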
Simba English
Optimized for English text-to-speech with the highest quality output.
- Clear, natural-sounding speech
- Consistent quality across outputs
- Full support for SSML and emotion control
- Zero-shot voice cloning from short audio samples
- Fine-tuned voice cloning from hours of speaker audio (contact sales)
Simba Multilingual
Supports multiple languages, including mixing languages within a single sentence.
- 6 fully supported languages, 17 in beta, 26 coming soon
- Automatic language detection when the language parameter is omitted
- Zero-shot voice cloning works across all supported languages
- Fine-tuned voice cloning available (contact sales)
See Language Support for the full list.
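The language detection behavior can be illustrated with a short sketch. Only the language parameter name comes from the text above. The helper name, the text field, the simba-multilingual model ID, and the "fr-FR" code format are assumptions to check against the API reference.

```python
from typing import Optional

# Assumed model ID for Simba Multilingual; confirm it against the API reference.
MULTILINGUAL_MODEL_ID = "simba-multilingual"


def build_multilingual_payload(text: str, language: Optional[str] = None) -> dict:
    """Build a request body for the multilingual model (hypothetical schema)."""
    payload = {"text": text, "model_id": MULTILINGUAL_MODEL_ID}
    # Leaving "language" out asks the API to detect the language itself,
    # which suits mixed-language input.
    if language is not None:
        payload["language"] = language
    return payload


auto_detected = build_multilingual_payload("Bonjour, how are you today?")
pinned = build_multilingual_payload("Bonjour tout le monde.", language="fr-FR")
print(auto_detected)
print(pinned)
```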
Voice cloning
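Both models support two tiers of voice cloning:
- Zero-shot: clones a voice from a short audio sample
- Fine-tuned: trained on hours of speaker audio (contact sales)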
See Voice Cloning for implementation details.
FAQ
Which model should I use?
Use Simba English for English-only content; it produces the highest-quality output. Use Simba Multilingual for non-English or mixed-language content.
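That rule of thumb could be encoded as a small helper. This is a sketch rather than an official utility: choose_model_id is a hypothetical name, and the simba-multilingual ID is assumed by analogy with simba-english.

```python
SIMBA_ENGLISH = "simba-english"
SIMBA_MULTILINGUAL = "simba-multilingual"  # assumed ID; verify before use


def choose_model_id(english_only: bool) -> str:
    """Pick a model ID from whether the content is English-only."""
    # English-only content goes to the English model. Anything else
    # (non-English or mixed-language) goes to the multilingual model.
    return SIMBA_ENGLISH if english_only else SIMBA_MULTILINGUAL


print(choose_model_id(english_only=True))
print(choose_model_id(english_only=False))
```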
Can I switch models without changing my code?
Yes. Just change the model_id parameter. All other parameters (voice, format, SSML) work the same across models.
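A sketch of what switching might look like in practice: the shared parameters stay fixed and only model_id varies. The field names text, voice, and format, the "mp3" value, and the simba-multilingual ID are illustrative assumptions.

```python
# Shared request parameters (hypothetical field names and values).
base_params = {"text": "Hello, world.", "voice": "example-voice", "format": "mp3"}


def with_model(params: dict, model_id: str) -> dict:
    """Return a copy of params with model_id set, leaving the input untouched."""
    return {**params, "model_id": model_id}


english_request = with_model(base_params, "simba-english")
# "simba-multilingual" is an assumed ID; verify it against the API reference.
multilingual_request = with_model(base_params, "simba-multilingual")

# Every key other than model_id is identical across the two requests.
shared_keys = [k for k in english_request if k != "model_id"]
print(all(english_request[k] == multilingual_request[k] for k in shared_keys))
```

Because built-in system voices may differ between models, the voice value is the one parameter worth re-checking after a switch.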
Do both models support the same voices?
Built-in system voices may differ between models. Cloned voices work with both models.