Emotion Control

Add emotional expression to synthesized speech

Overview

Use the <speechify:style> SSML tag to control the emotion of synthesized speech:

<speak>
  <speechify:style emotion="cheerful">
    Great news! Your order has been shipped!
  </speechify:style>
</speak>

Pass SSML as the input parameter — the API detects it automatically.

# `client` is assumed to be an already-initialized Speechify API client.
response = client.tts.audio.speech(
    input='<speak><speechify:style emotion="cheerful">Great news!</speechify:style></speak>',
    voice_id="george",
    audio_format="mp3",
)
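When the text comes from user input or a database, it may contain characters such as & or < that are not valid inside SSML, since SSML is XML. The following is a minimal sketch of a helper that escapes the text before wrapping it in the style tag. The name style_ssml is hypothetical and not part of the Speechify SDK.

```python
from xml.sax.saxutils import escape, quoteattr


def style_ssml(text: str, emotion: str) -> str:
    """Wrap plain text in <speak> and <speechify:style> markup.

    Hypothetical helper for illustration; not part of any SDK.
    """
    # escape() replaces &, < and > so the text cannot break the markup.
    body = escape(text)
    # quoteattr() returns the attribute value already wrapped in quotes.
    return (
        f"<speak><speechify:style emotion={quoteattr(emotion)}>"
        f"{body}"
        "</speechify:style></speak>"
    )


ssml = style_ssml("Great news! Your order has been shipped!", "cheerful")
```

The resulting string can then be passed as the input parameter in the call shown above.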

Supported emotions

  • angry: Forceful, intense
  • cheerful: Upbeat, positive
  • sad: Downcast, melancholic
  • terrified: Extreme fear
  • relaxed: Calm, at-ease
  • fearful: Anxious, worried
  • surprised: Astonished, unexpected
  • calm: Tranquil, peaceful
  • assertive: Confident, authoritative
  • energetic: Dynamic, lively
  • warm: Friendly, inviting
  • direct: Straightforward, clear
  • bright: Optimistic, cheerful
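
This page does not say how the API responds to an emotion value outside this list, so it may be worth checking the value on the client before building the SSML. The sketch below is one way to do that. SUPPORTED_EMOTIONS and check_emotion are hypothetical names, and the lowercase normalization assumes the names must be written exactly as listed.

```python
# Emotion names copied from the list above.
SUPPORTED_EMOTIONS = frozenset({
    "angry", "cheerful", "sad", "terrified", "relaxed", "fearful",
    "surprised", "calm", "assertive", "energetic", "warm", "direct",
    "bright",
})


def check_emotion(emotion: str) -> str:
    """Return the normalized emotion name or raise ValueError.

    Hypothetical client-side check; the API's own behavior for
    unknown values is not documented here.
    """
    name = emotion.strip().lower()
    if name not in SUPPORTED_EMOTIONS:
        raise ValueError(
            f"unsupported emotion {emotion!r}; "
            f"expected one of {sorted(SUPPORTED_EMOTIONS)}"
        )
    return name
```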

Tips for best results

  • Match text to emotion — “I told you not to do that!” works with angry, not with cheerful
  • Keep sentences short — shorter sentences produce stronger emotional expression
  • Use punctuation: ! for anger/excitement, ? for uncertainty, ... for hesitation/sadness
  • Combine with SSML — pair emotions with <prosody> and <break> for finer control. See SSML docs.
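
The last tip can be sketched as follows: a small function that places a <prosody> element and a <break> inside the style tag. The function name, the nesting order, and the default values rate="slow" and time="500ms" are assumptions for illustration, based on standard SSML attributes. Check the SSML docs for the values the API actually accepts.

```python
from xml.sax.saxutils import escape


def styled_with_pause(
    first: str,
    second: str,
    emotion: str,
    pause: str = "500ms",
    rate: str = "slow",
) -> str:
    """Build SSML with an emotion, a slowed first phrase, and a pause.

    Hypothetical helper; nesting <prosody> and <break> inside
    <speechify:style> is an assumption, not confirmed by this page.
    """
    return (
        "<speak>"
        f'<speechify:style emotion="{emotion}">'
        # Slow down the first phrase only.
        f'<prosody rate="{rate}">{escape(first)}</prosody>'
        # Insert a pause before the second phrase.
        f'<break time="{pause}"/>'
        f"{escape(second)}"
        "</speechify:style>"
        "</speak>"
    )


ssml = styled_with_pause("I'm not sure...", "Maybe later.", "sad")
```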

Examples

<speak>
  <speechify:style emotion="angry">Stop it! Right now!</speechify:style>
</speak>