Voice Agents quickstart | Speechify API

Get your API key

1. Sign up at console.speechify.ai
2. Go to API Keys
3. Copy your default API key (or create a new one)

$ export SPEECHIFY_API_KEY="your-api-key-here"
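Since the SDKs read SPEECHIFY_API_KEY from the environment, it can help to fail early when the variable is missing or blank. A minimal sketch of such a check follows; the helper name require_api_key is illustrative and not part of the SDK.

```python
import os


def require_api_key(env=os.environ, name="SPEECHIFY_API_KEY"):
    """Return the API key from `env`, or raise if it is missing or blank.

    Hypothetical helper for illustration; the SDK does not provide it.
    """
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before creating a client")
    return key
```

Calling require_api_key() once at startup, before constructing a client, turns a misconfigured shell into an immediate and readable error.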

Install the SDK

The official Python and TypeScript SDKs are generated from the same OpenAPI spec, so every method below is type-checked and version-pinned. Both read SPEECHIFY_API_KEY from the environment automatically.

Python

$ pip install speechify-api

Create an agent

An agent bundles a prompt, a voice, and a default LLM. Voice IDs come from the regular /v1/voices catalog: anything that works for TTS works for Voice Agents, including your cloned voices.

Python

from speechify import Speechify

# The client reads SPEECHIFY_API_KEY from the environment.
client = Speechify()

agent = client.tts.agents.create(
    name="Support Bot",
    prompt="You are a friendly support agent for a SaaS product. "
           "Greet callers, answer questions about billing and account "
           "settings, and transfer to a human if you cannot help.",
    first_message="Hi, this is Sabrina with support. How can I help today?",
    voice_id="sabrina",
    language="en",
    temperature=0.7,
)
print(agent.id)
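If you manage several agents, one option is to keep each definition in a small typed container and unpack it into the create call. The sketch below assumes only the keyword arguments used in the example above; AgentConfig and create_kwargs are hypothetical names, not SDK types, and the defaults simply copy the example values.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AgentConfig:
    """Hypothetical container mirroring the keyword arguments used above."""
    name: str
    prompt: str
    first_message: str
    voice_id: str
    language: str = "en"        # default copied from the example, not from the API
    temperature: float = 0.7    # default copied from the example, not from the API


def create_kwargs(config):
    """Turn a config into keyword arguments for the agent-creation call."""
    return asdict(config)


support_bot = AgentConfig(
    name="Support Bot",
    prompt="You are a friendly support agent for a SaaS product.",
    first_message="Hi, this is Sabrina with support. How can I help today?",
    voice_id="sabrina",
)

# With a client constructed as in the example above:
# agent = client.tts.agents.create(**create_kwargs(support_bot))
```

Keeping the definition separate from the call makes it easier to review prompt changes and to reuse the same configuration across environments.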

Start a conversation

POST /v1/agents/{id}/conversations provisions a realtime voice session, dispatches the agent, and returns a short-lived access token. The caller connects directly to the session with that token, so audio never flows through our server.

Python

session = client.tts.agents.create_conversation(id=agent.id)
print(session.url, session.token)  # pass these to your browser/SDK

The response shape:

{
  "conversation": { "id": "…", "agent_id": "…", "status": "pending", "transport": "web", "…": "…" },
  "room": "conv_<agent>_<user>_<ts>",
  "token": "eyJhbGc…",
  "url": "wss://…"
}
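If you forward this response to a client yourself, it may be worth validating it first. The following sketch parses the JSON shape shown above and keeps only the fields a client needs to connect. SessionInfo and parse_session are hypothetical names, and the wss:// check is an assumption based on the example URL rather than a documented guarantee.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionInfo:
    """Hypothetical holder for the fields a client needs to connect."""
    conversation_id: str
    room: str
    token: str
    url: str


def parse_session(raw):
    """Parse the JSON response body and check that the expected fields exist."""
    data = json.loads(raw)
    missing = [k for k in ("conversation", "room", "token", "url") if k not in data]
    if missing:
        raise ValueError(f"response is missing fields: {missing}")
    # Assumption: the session URL is always a secure WebSocket URL, as in the example.
    if not data["url"].startswith("wss://"):
        raise ValueError("expected a wss:// session URL")
    return SessionInfo(
        conversation_id=data["conversation"]["id"],
        room=data["room"],
        token=data["token"],
        url=data["url"],
    )
```

Because the token is short-lived, pass the parsed values to the client promptly instead of caching them.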

Connect from the browser

The upcoming @speechify/agents-js SDK will handle the session connection, microphone capture, and audio playout in a single call. We’ll link it from here as soon as it is published. Until then, the easiest path is the console Test Call button.

Test it from the console

The quickest path to hearing the agent without writing client code: open the agent in the console, click Test Call, and talk.

Inspecting conversations

Every turn is streamed to the control plane and persisted with timestamps.

Python

# List recent conversations for this account
convs = client.tts.conversations.list()

# conv_id is the "id" of a conversation, for example from the
# "conversation" object returned when the session was created.
# Fetch one, plus its transcript and post-call evaluation
conv = client.tts.conversations.get(conv_id)
messages = client.tts.conversations.list_messages(conv_id)
evals = client.tts.conversations.list_evaluations(conv_id)
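Once you have the messages, a readable transcript is often the first thing you want. The sketch below is written against plain dictionaries and assumes each message carries role, content, and created_at fields, with created_at as an ISO 8601 string in a single consistent format. Those field names are assumptions, not the documented schema, so adapt them to what list_messages actually returns.

```python
def format_transcript(messages):
    """Render messages as 'timestamp  role: text' lines, oldest first.

    Assumes dicts with hypothetical keys 'role', 'content', and 'created_at'.
    Sorting the timestamp strings only works when they all use the same
    ISO 8601 format and time zone.
    """
    ordered = sorted(messages, key=lambda m: m["created_at"])
    return "\n".join(
        f'{m["created_at"]}  {m["role"]}: {m["content"]}' for m in ordered
    )
```

If the SDK returns typed objects instead of dictionaries, convert them first or swap the key lookups for attribute access.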

Next steps

Attach tools

Give the agent access to your backend, the caller’s device, or built-in actions like end_call.

Listen for events

Receive conversation.started, conversation.ended, and message.created webhooks.

Clone your own voice

Use a custom voice with your agents.

API Reference

Full schemas for /v1/agents, /v1/tools, and /v1/conversations.