Create Agent

Create a voice agent.

Authentication

Authorization Bearer

Enter your API key with the Bearer prefix, e.g. ‘Bearer sk_…’.
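As an illustration of the header format described above, here is a minimal sketch of a helper that builds the request headers. The function name auth_headers and the JSON Content-Type are assumptions, not part of this reference, and "sk_example" in the usage line is a placeholder rather than a real key.

```python
def auth_headers(api_key: str) -> dict[str, str]:
    """Build request headers carrying the API key with the Bearer prefix."""
    key = api_key.strip()
    # Tolerate a key that was pasted with the prefix already attached.
    if key.lower().startswith("bearer "):
        key = key[len("bearer "):].strip()
    if not key:
        raise ValueError("API key must not be empty")
    return {
        "Authorization": f"Bearer {key}",
        # Assumption: the endpoint takes a JSON body.
        "Content-Type": "application/json",
    }


headers = auth_headers("sk_example")  # placeholder key
```

Stripping an existing prefix avoids sending "Bearer Bearer sk_…" by accident.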

Request

This endpoint expects an object.
name string Required 1-100 characters
voice_id string Required

Voice slug from the VMS catalog (see GET /v1/voices). Required — the server rejects writes with an unknown or empty slug.

slug string Optional <=64 characters

The server derives the slug from name, with a random suffix, when omitted. If you supply your own, a collision returns 400 ‘slug already taken’.

prompt string Optional
first_message string Optional
Greeting spoken verbatim at session start when included in the agent's flow graph.
language string Optional

ISO 639-1 code. Defaults to ‘en’ when omitted.

llm_provider enum Optional

LLM backend. Leave empty (or omit both llm_provider and llm_model) to use the platform default (today: Speechify Kimi K2.6, resolved server-side at dispatch). When set, it must be paired with a non-empty llm_model; a populated provider with an empty model is rejected with a 400. The custom provider additionally requires llm_base_url.

llm_model string Optional

Chat model slug. Leave empty to use the platform default. For openai / speechify the (provider, model) pair must be in the allowed table; for custom it is free-form.

llm_base_url string Optional

Custom OpenAI/vLLM-compatible endpoint base URL. Required when llm_provider is custom, rejected otherwise.

llm_api_key string Optional

Bearer key for the custom endpoint. Write-only - stored encrypted, never returned (GET exposes llm_api_key_set). Optional even for custom (keyless endpoints); rejected for any other provider.

llm_extra_body map from strings to any Optional

Optional JSON object forwarded verbatim to the custom endpoint as the chat.completions extra_body (reasoning / sampling knobs). Valid only when llm_provider is custom.
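The pairing rules for the llm_* fields can be checked on the client before sending. The sketch below covers only the rules stated above; the function name check_llm_config is hypothetical, the allowed (provider, model) table is not reproduced here, and the server remains the authority on what it accepts.

```python
from typing import Any, Optional


def check_llm_config(body: dict[str, Any]) -> Optional[str]:
    """Return a description of the first problem found, or None if consistent."""
    provider = body.get("llm_provider") or ""
    model = body.get("llm_model") or ""
    base_url = body.get("llm_base_url") or ""
    api_key = body.get("llm_api_key") or ""
    extra_body = body.get("llm_extra_body")

    # A populated provider must come with a non-empty model.
    if provider and not model:
        return "llm_provider requires a non-empty llm_model"

    if provider == "custom":
        # custom additionally requires a base URL; the API key stays optional.
        if not base_url:
            return "llm_provider=custom requires llm_base_url"
        return None

    # The remaining fields are documented as custom-only.
    if base_url:
        return "llm_base_url is rejected unless llm_provider is custom"
    if api_key:
        return "llm_api_key is rejected unless llm_provider is custom"
    if extra_body is not None:
        return "llm_extra_body is valid only when llm_provider is custom"
    return None
```

An empty body passes, since omitting both llm_provider and llm_model selects the platform default.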

temperature double Optional
Range 0.0..1.0. Defaults to 0.7 when omitted.
widget_config object Optional

Customer-editable appearance + behaviour payload for the embedded <speechify-agent> pill: button text, avatar style, orb colours, terms-and-conditions markdown, transcript display. Every field is optional - empty fields fall back to the widget’s compile-time defaults.

is_public boolean Optional
Defaults to false when omitted.
allowed_origins list of strings Optional
hostname_allowlist list of strings Optional

Optional per-agent hostname allowlist (see Agent schema).

memory_enabled boolean Optional
Defaults to false when omitted.
memory_retention_days integer Optional
Defaults to 90 when omitted.
webhook_url string Optional

Customer-facing post-call webhook URL.

webhook_secret string Optional

HMAC-SHA256 secret seed. Write-only — never echoed back on reads; clients see webhook_secret_set: true instead.

amd object Optional

AMD routing config. Optional on create; omitted means AMD off. See AMDConfig schema.

save_audio_recording boolean Optional

When set, opts the agent into per-conversation audio recording. Defaults to false when omitted.

navigator_mode boolean Optional

When set, opts the agent into IVR-tuned turn handling. Defaults to false when omitted.

ivr_memory_enabled boolean Optional

Defaults to true when omitted. Set to false to opt out of the IVR-memory cache lookup for this agent.

tts_speaking_rate double or null Optional
tts_playback_rate double or null Optional

Post-process pitch-preserving time-stretch on the synthesized audio. See the field on Agent for semantics.

response_delay_seconds double or null Optional

Per-agent override for the worker’s endpointing min_delay on the VAD path (seconds). See the field on Agent for semantics. Range 0.0..5.0; null means use the stack default.

inactivity_timeout_seconds integer Optional >=0

Per-agent silence-tolerance override in seconds. Send 0 to clear the override and fall back to the platform default. Negative values are rejected.

background_noise_preset string Optional

Pre-mixed ambient bed slug. Send empty string ("") to disable the bed, which also clears background_noise_volume.

background_noise_volume double Optional

Volume of the background-noise bed (0..1). Ignored when background_noise_preset is empty.

stt_override enum Optional

Optional non-default streaming-STT stack for this agent. Omit to use the worker’s default stack (today: whisper-v3). See the Agent schema for the full option semantics.
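The length and range constraints listed for the request fields can be pre-checked before the body is sent. This is a rough sketch under stated assumptions: check_ranges is a hypothetical helper, it covers only the bounds stated in this section, and "example-voice" in the usage lines is a placeholder rather than a real catalog slug.

```python
from typing import Any


def check_ranges(body: dict[str, Any]) -> list[str]:
    """Collect violations of the documented length and range constraints."""
    problems: list[str] = []

    name = body.get("name")
    if not isinstance(name, str) or not 1 <= len(name) <= 100:
        problems.append("name must be 1-100 characters")
    if not body.get("voice_id"):
        problems.append("voice_id is required and must be non-empty")

    slug = body.get("slug")
    if slug is not None and len(slug) > 64:
        problems.append("slug must be at most 64 characters")

    # (field, low, high) for the optional numeric fields with closed ranges.
    for field, low, high in (
        ("temperature", 0.0, 1.0),
        ("response_delay_seconds", 0.0, 5.0),
        ("background_noise_volume", 0.0, 1.0),
    ):
        value = body.get(field)
        if value is not None and not low <= value <= high:
            problems.append(f"{field} must be within {low}..{high}")

    timeout = body.get("inactivity_timeout_seconds")
    if timeout is not None and timeout < 0:
        problems.append("inactivity_timeout_seconds must be >= 0")
    return problems


minimal = {"name": "Support line", "voice_id": "example-voice"}
issues = check_ranges(minimal)
```

Returning every problem at once, rather than stopping at the first, mirrors how a form would report validation errors. Whether voice_id exists in the catalog can only be confirmed server-side.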

Response

The created agent.
id string

Prefixed wire identifier (agent_<26 char Crockford base32>). ADR 0015 Cluster 1 hard-break: this is the sole customer-facing identifier. URL paths accept only this prefixed form; legacy UUID path parameters are rejected with 404 as of Cluster 1.
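A client can sanity-check an identifier against the agent_<26 char Crockford base32> shape before putting it in a URL path. The sketch below assumes the standard Crockford alphabet (digits plus letters, excluding I, L, O and U). Case handling is not stated in this reference, so accepting either case is an assumption, and is_agent_id is a hypothetical name.

```python
import re

# Crockford base32 symbols: 0-9 and A-Z without I, L, O, U.
_AGENT_ID = re.compile(r"agent_[0-9A-HJKMNP-TV-Z]{26}", re.IGNORECASE)


def is_agent_id(value: str) -> bool:
    """True when value has the prefixed form agent_ plus 26 base32 symbols."""
    return _AGENT_ID.fullmatch(value) is not None
```

A legacy UUID fails this check, matching the note that UUID path parameters are no longer accepted.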

name string
slug string
prompt string
first_message string
Spoken verbatim at session start when present in the customer's flow graph.
language string

ISO 639-1 code, e.g. ‘en’.

llm_provider enum

LLM backend the worker constructs for this agent. Null means “use the platform default” (resolved server-side at dispatch; today: Speechify Kimi K2.6). openai and speechify pair with a model from the allowed (provider, model) table. custom points the worker at any OpenAI / vLLM-compatible endpoint - see llm_base_url, llm_api_key, llm_extra_body.

llm_model string or null

Chat model slug. Null means “use the platform default” (resolved server-side at dispatch; today: Speechify Kimi K2.6). For openai / speechify it must be a slug from the allowed table; for custom it is free-form (the customer’s endpoint owns the namespace).

voice_id string
Speechify voice slug.
temperature double
is_public boolean

When true, the <speechify-agent> web component can start a session against this agent without an API key, subject to the allowed_origins allowlist. When false (default), only authenticated callers can start sessions.

allowed_origins list of strings

Exact Origin header values (e.g. https://example.com) that are allowed to start public sessions. Empty array with is_public = true means any origin is accepted — intended for open demos. No subdomain wildcards.
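The interaction between is_public and allowed_origins can be read as a small decision function. This is only a sketch of the rules as described here, not the server's implementation, and public_origin_allowed is a hypothetical name.

```python
def public_origin_allowed(
    is_public: bool, allowed_origins: list[str], origin: str
) -> bool:
    """Decide whether an unauthenticated session may start from origin."""
    if not is_public:
        # Private agents accept authenticated callers only.
        return False
    if not allowed_origins:
        # Empty list on a public agent: any origin is accepted (open demo).
        return True
    # Exact string match on the Origin header value; no subdomain wildcards.
    return origin in allowed_origins
```

Because matching is exact, https://example.com does not cover https://www.example.com; each origin has to be listed separately.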

memory_enabled boolean

When true, the post-call extractor writes durable facts about each caller; at conversation-start the retriever injects the top matches into the system prompt via the {{memory}} template variable. Defaults to false.

memory_retention_days integer

Maximum age (in days) of memories kept and surfaced to the retriever. 0 disables the cap. Defaults to 90.
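The retention rule, including the special meaning of 0, can be written out as follows. This is a sketch: memory_in_window is a hypothetical name, and treating the boundary as inclusive (a memory exactly memory_retention_days old is still kept) is an assumption.

```python
def memory_in_window(age_days: float, memory_retention_days: int = 90) -> bool:
    """True when a memory of the given age is still eligible for retrieval."""
    if memory_retention_days < 0:
        raise ValueError("memory_retention_days must not be negative")
    if memory_retention_days == 0:
        # 0 disables the cap, so memories of any age qualify.
        return True
    # Assumption: the maximum age is inclusive.
    return age_days <= memory_retention_days
```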

amd object

Answering Machine Detection routing config for outbound voice agents. AMD classifies the called party’s first ~3-15 seconds of audio into one of LiveKit’s categories (human, uncertain, machine-vm, machine-ivr, machine-unavailable) and dispatches per category to the configured action. Stored on the agent row; flowed onto outbound dispatch metadata under the amd key. Rationale: see ADR 0008 (docs/adrs/0008-amd-as-session-routing-primitive.md).

save_audio_recording boolean

When true, every conversation produces a room-composite OGG egress uploaded to the recordings bucket. Defaults to false for new agents (privacy by default).

navigator_mode boolean

Tunes worker turn handling for autonomous outbound IVR navigation: longer endpointing and no barge-in. The goal itself lives in the agent’s prompt; this flag is the behaviour switch only. Defaults to false.

ivr_memory_enabled boolean

Per-agent kill switch for the IVR-memory cache lookup performed at AMD time. Defaults to true so existing navigator agents keep their always-on behaviour. Set to false to skip the cache and force every outbound dial on this agent to start cold (LLM-driven navigation only).

created_at datetime
updated_at datetime
llm_base_url string or null

Custom OpenAI/vLLM-compatible endpoint base URL. Non-null only when llm_provider is custom.

llm_api_key_set boolean

Whether a bearer key is stored for the custom endpoint. The key itself is write-only and never returned.

llm_extra_body map from strings to any or null

JSON object forwarded verbatim to the custom endpoint as the chat.completions extra_body (reasoning / sampling knobs). Non-null only when llm_provider is custom.

widget_config object

Customer-editable appearance + behaviour payload for the embedded <speechify-agent> pill: button text, avatar style, orb colours, terms-and-conditions markdown, transcript display. Every field is optional - empty fields fall back to the widget’s compile-time defaults.

hostname_allowlist list of strings or null

Optional per-agent hostname allowlist enforced at session-creation time. When set and non-empty, the Origin header’s hostname must be an exact member. Bare hostnames only — no scheme, port, or path. Up to 10 entries. Omit (null) or leave empty for no enforcement (public agents accept any hostname).
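The hostname check described above can be sketched with the standard library's URL parser. hostname_allowed is a hypothetical name, and the sketch assumes allowlist entries are stored in lowercase, since urlsplit lowercases the hostname it extracts.

```python
from typing import Optional
from urllib.parse import urlsplit


def hostname_allowed(allowlist: Optional[list[str]], origin: str) -> bool:
    """Check an Origin header value against a per-agent hostname allowlist."""
    if not allowlist:
        # Null or empty list: no enforcement.
        return True
    # .hostname drops the scheme and port, leaving the bare host.
    host = urlsplit(origin).hostname
    return host is not None and host in allowlist
```

Only the hostname takes part in the comparison, so an Origin of https://app.example.com:8443 matches the entry app.example.com.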

webhook_url string

Customer-facing post-call webhook target. When non-empty, the control plane POSTs a signed payload (transcript + evals + extractors + recording URL) once the conversation completes. Empty disables the fire path.

webhook_secret_set boolean

True when an HMAC-SHA256 webhook secret is configured. The secret itself is write-only: it is supplied on create or PATCH and never echoed back on reads.

tts_speaking_rate double or null

Per-agent override for the voice’s default speaking rate (0.5 = half speed, 2.0 = double, 1.0 = neutral). Null means “use the voice’s default rate”.

tts_playback_rate double or null

Per-agent post-process pitch-preserving time-stretch applied to the synthesized audio in the worker before publishing. Distinct from tts_speaking_rate: speaking_rate biases the model’s generation prosody (clipped syllables, pauses preserved); playback_rate uniformly stretches the rendered waveform (every sample, every pause, every breath). Range 0.5..3.0; null means no post-process.
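Because playback_rate stretches the whole waveform uniformly, its effect on clip length is simple arithmetic. The sketch below assumes that a rate above 1.0 shortens the audio, by analogy with how tts_speaking_rate is described; stretched_duration is a hypothetical helper.

```python
from typing import Optional


def stretched_duration(
    duration_seconds: float, tts_playback_rate: Optional[float]
) -> float:
    """Estimate clip length after the uniform time-stretch."""
    if tts_playback_rate is None:
        # Null means no post-process, so the length is unchanged.
        return duration_seconds
    if not 0.5 <= tts_playback_rate <= 3.0:
        raise ValueError("tts_playback_rate must be within 0.5..3.0")
    # Assumption: rate 2.0 plays twice as fast, halving the duration.
    return duration_seconds / tts_playback_rate
```

Under that assumption, a 6-second utterance at a rate of 1.5 lasts 4 seconds, pauses and breaths included.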

response_delay_seconds double or null

How long the agent waits after the caller stops talking before generating a reply (the worker’s endpointing min_delay on the VAD path). Range 0.0..5.0. Null means “use the stack default”: 0.5s on Deepgram VAD, or 0.75s when navigator_mode=true. Ignored on the Flux and Whisper STT stacks, which use semantic turn detection instead.
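The resolution of the effective delay can be sketched as below. The function name and the vad_path flag are hypothetical; the flag stands in for "this agent's STT stack uses VAD endpointing", which is decided by the worker and not by the client.

```python
from typing import Optional


def effective_min_delay(
    response_delay_seconds: Optional[float],
    navigator_mode: bool,
    vad_path: bool = True,
) -> Optional[float]:
    """Resolve the endpointing delay in seconds, or None when it is ignored."""
    if not vad_path:
        # Semantic turn detection stacks ignore this setting.
        return None
    if response_delay_seconds is not None:
        if not 0.0 <= response_delay_seconds <= 5.0:
            raise ValueError("response_delay_seconds must be within 0.0..5.0")
        return response_delay_seconds
    # Null falls back to the documented stack defaults.
    return 0.75 if navigator_mode else 0.5
```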

inactivity_timeout_seconds integer or null

Optional override for the per-agent silence-tolerance before the worker tears the call down. Null means use the platform default.

background_noise_preset string or null

Optional pre-mixed ambient bed (e.g. office, cafe). Null disables background noise.

background_noise_volume double or null

Volume of the background-noise bed. Null disables.

stt_override enum

Optional override for the streaming-STT stack this agent dispatches with. Null means use the worker’s default stack (today: whisper-v3, Baseten Whisper Large V3). Pick whisper-v3 to pin Whisper Large V3 explicitly, flux to opt into Deepgram Flux’s semantic end-of-turn detection, or gpt-realtime-whisper for OpenAI’s streaming Whisper-class STT.

Errors

400 Bad Request Error
401 Unauthorized Error