SpeechifyAI Agents: Real-Time Voice AI Overview

Agents is in beta. The product is stable enough for production traffic, but request and response shapes on the Agents endpoints may still change before general availability. Pin a dated API version and check the changelog before upgrading.

SpeechifyAI Agents lets you put a talking, listening AI in your product in minutes. An agent is a reusable definition: prompt, voice, tools, and evaluation criteria. Users talk to it over the web or a phone line.

What you get

Speechify voices: a curated catalog of natural voices, available via GET /v1/agents/voices. Cloned and personal voices stay Build-only and cannot be assigned to agents.
Low-latency realtime pipeline: speech in, agent response, speech out.
Tools: call your backend, run code on the caller’s device, connect to an MCP server, or invoke built-ins like end_call and transfer_to_number.
Transcripts: every turn persisted with timestamps and tool traces.
Post-call evaluation: LLM-graded criteria and structured data extraction after hang-up.
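
As a rough illustration of the voice catalog endpoint mentioned above, the following Python sketch builds and sends a GET /v1/agents/voices request using only the standard library. The base URL, the bearer-token Authorization header, and the function names are placeholders assumed for this example; only the path comes from this page. The response is returned as decoded JSON without assuming any particular shape, since Agents is in beta.

```python
import json
import urllib.request

# Assumption: placeholder host; substitute the real API base URL.
BASE_URL = "https://api.example.com"


def build_list_voices_request(api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for the agent voice catalog."""
    return urllib.request.Request(
        f"{BASE_URL}/v1/agents/voices",
        headers={
            # Assumption: bearer-token auth; check the API Reference.
            "Authorization": f"Bearer {api_key}",
            "Accept": "application/json",
        },
        method="GET",
    )


def list_voices(api_key: str) -> object:
    """Send the request and return the decoded JSON body as-is."""
    request = build_list_voices_request(api_key)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes the sketch easy to inspect or unit-test without network access.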

How it fits together

Your server calls POST /v1/agents/{id}/conversations.
We provision a realtime voice session, dispatch the agent, and return a short-lived token and URL.
Your browser or SDK connects to the session using that token.
Audio, transcripts, and tool calls flow over the session.
The API receives lifecycle events and persists the transcript and evaluation.
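
The server-side part of this flow might look like the Python sketch below: build the POST /v1/agents/{id}/conversations request, then pull the short-lived token and URL out of the response so they can be handed to the browser or SDK. The base URL, the bearer-token header, the empty JSON request body, and the response field names `token` and `url` are all assumptions made for illustration; consult the API Reference for the actual schema, which may change during the beta.

```python
import json
import urllib.parse
import urllib.request
from dataclasses import dataclass

# Assumption: placeholder host; substitute the real API base URL.
BASE_URL = "https://api.example.com"


@dataclass
class SessionCredentials:
    """Short-lived values the client needs to join the voice session."""
    token: str
    url: str


def build_start_conversation_request(api_key: str, agent_id: str) -> urllib.request.Request:
    """Build (but do not send) the request that starts a conversation."""
    safe_id = urllib.parse.quote(agent_id, safe="")
    return urllib.request.Request(
        f"{BASE_URL}/v1/agents/{safe_id}/conversations",
        data=b"{}",  # Assumption: an empty JSON body is accepted.
        headers={
            "Authorization": f"Bearer {api_key}",  # Assumption: bearer auth.
            "Content-Type": "application/json",
        },
        method="POST",
    )


def parse_session_credentials(body: bytes) -> SessionCredentials:
    """Extract the token and URL; the field names here are hypothetical."""
    payload = json.loads(body)
    return SessionCredentials(token=payload["token"], url=payload["url"])


def start_conversation(api_key: str, agent_id: str) -> SessionCredentials:
    """Call the endpoint from your server and return the session credentials."""
    request = build_start_conversation_request(api_key, agent_id)
    with urllib.request.urlopen(request, timeout=10) as response:
        return parse_session_credentials(response.read())
```

Running this on your server keeps the API key out of the browser; only the short-lived token and URL are passed to the client.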

When to reach for an agent

Inbound support and triage. Answer routine questions before a human has to pick up.
Outbound follow-ups. Confirm appointments, check in on customers, collect structured information at scale.
IVR replacement. Replace tone-tree menus with a conversation that routes the caller correctly the first time.

Build without code in the console

Everything described here is also available as a no-code workflow in the console: write the prompt, pick a voice, attach knowledge, connect a phone number, and preview the conversation in your browser. Start with the Quickstart or take the dashboard tour.

What to read next

Quickstart

Create an agent and place your first test call.

Tools

Give your agent access to your backend, the caller’s device, or built-in actions like end_call.

Webhooks

Receive a signed POST after each conversation completes.

API Reference

Full schemas for /v1/agents, /v1/agents/tool-definitions, and /v1/agents/conversations.