Models
Choose the right text-to-speech model for your use case
Available models
Pass the model ID as the model parameter in your API calls. If omitted, the API defaults to simba-english.
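As a minimal sketch of how the model parameter and its default might be handled on the client side, the snippet below builds a request body with and without an explicit model. Only the `model` field and the `simba-english` default come from this page; the `input` and `voice` field names, the voice ID, and both helper functions are hypothetical placeholders, so check the API reference for the real request schema.

```python
import json
from typing import Optional

# Default stated on this page: used by the API when `model` is omitted.
DEFAULT_MODEL = "simba-english"


def build_tts_payload(text: str, voice: str, model: Optional[str] = None) -> dict:
    """Build a JSON-serializable request body.

    `input` and `voice` are assumed field names for illustration only.
    """
    payload = {"input": text, "voice": voice}
    if model is not None:
        # Leave the key out entirely to let the API apply its default.
        payload["model"] = model
    return payload


def effective_model(payload: dict) -> str:
    """Return the model the API would use for this payload."""
    return payload.get("model", DEFAULT_MODEL)


if __name__ == "__main__":
    explicit = build_tts_payload("Hello, world.", "example-voice", model="simba-english")
    implicit = build_tts_payload("Hello, world.", "example-voice")
    print(json.dumps(explicit))
    print(effective_model(implicit))
```

Setting the model explicitly, even when it matches the default, keeps behavior stable if the default ever changes.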
Simba English
Optimized for English text-to-speech with the highest quality output.
- Clear, natural-sounding speech
- Consistent quality across outputs
- Full support for SSML and emotion control
- Zero-shot voice cloning from short audio samples
- Fine-tuned voice cloning from hours of speaker audio (contact sales)
Simba Multilingual
Supports multiple languages, including mixing languages within a single sentence.
- 6 fully supported languages, 17 in beta, 26 coming soon
- Automatic language detection when the language parameter is omitted
- Zero-shot voice cloning works across all supported languages
- Fine-tuned voice cloning available (contact sales)
See Language Support for the full list.
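The optional language parameter described above could be handled as in the following sketch, which includes `language` only when the caller supplies it and otherwise leaves detection to the API. The model ID `simba-multilingual`, the `input` and `voice` field names, and the `fr-FR` code format are assumptions made for illustration; this page confirms only that a `language` parameter exists and may be omitted.

```python
from typing import Optional

# Assumed model ID; this page names the model but does not spell out its ID.
MULTILINGUAL_MODEL = "simba-multilingual"


def build_multilingual_payload(
    text: str, voice: str, language: Optional[str] = None
) -> dict:
    """Build a request body for the multilingual model.

    `input` and `voice` are hypothetical field names.
    """
    payload = {"input": text, "voice": voice, "model": MULTILINGUAL_MODEL}
    if language:
        # Pin the language explicitly (code format is an assumption).
        payload["language"] = language
    # With no `language` key, the API is expected to detect it automatically.
    return payload


if __name__ == "__main__":
    pinned = build_multilingual_payload("Bonjour tout le monde.", "example-voice", "fr-FR")
    detected = build_multilingual_payload("Hello, ¿cómo estás?", "example-voice")
    print(pinned)
    print(detected)
```

Omitting the parameter is the natural choice for mixed-language sentences, since a single pinned language cannot describe them.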
Voice cloning
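Both models support two tiers of voice cloning:
- Zero-shot cloning from short audio samples
- Fine-tuned cloning from hours of speaker audio (contact sales)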
See Voice Cloning for implementation details.
FAQ
Which model should I use?
Use Simba English if your content is English-only; it produces the highest-quality output. Use Simba Multilingual if you need non-English languages or mixed-language content.
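That rule of thumb can be written as a small helper. This is only a sketch: the `choose_model` function, the `simba-multilingual` ID, and the convention that language codes start with `en` for English are all assumptions, not part of the documented API.

```python
from typing import Iterable

ENGLISH_MODEL = "simba-english"
MULTILINGUAL_MODEL = "simba-multilingual"  # assumed ID


def choose_model(languages: Iterable[str]) -> str:
    """Pick a model from the languages present in the content.

    Codes are assumed to look like "en", "en-US", "es", and so on.
    """
    codes = [code.lower() for code in languages]
    # English-only content gets the English model; anything else,
    # including mixed-language content, gets the multilingual model.
    if all(code.startswith("en") for code in codes):
        return ENGLISH_MODEL
    return MULTILINGUAL_MODEL


if __name__ == "__main__":
    print(choose_model(["en-US"]))
    print(choose_model(["en", "es"]))
```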
Can I switch models without changing my code?
Yes. Just change the model parameter. All other parameters (voice, format, SSML) work the same across models.
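To illustrate, the sketch below derives a multilingual request from an English one by changing the `model` field alone. The `input`, `voice`, and `format` field names, the voice ID, and the `simba-multilingual` model ID are hypothetical placeholders chosen for the example.

```python
def with_model(payload: dict, model: str) -> dict:
    """Return a copy of `payload` with only the `model` field changed."""
    return {**payload, "model": model}


english_request = {
    "input": "<speak>Hello there.</speak>",  # SSML input
    "voice": "example-voice",
    "format": "mp3",
    "model": "simba-english",
}
multilingual_request = with_model(english_request, "simba-multilingual")

# Collect the keys whose values differ between the two requests.
changed = sorted(
    key for key in english_request
    if english_request[key] != multilingual_request[key]
)
print(changed)  # ['model']
```

Because built-in voices may differ between models, a voice ID that is valid for one model is worth re-checking after the switch.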
Do both models support the same voices?
Built-in system voices may differ between models. Cloned voices work with both models.