Add tools

Create system, webhook, client, and MCP tools and attach them to an agent

Tools let the LLM act mid-call. For the four kinds and the parameter schema, see Tools.

System built-in

Add a built-in directly to the agent with kind: "builtin". config.builtin must be one of the capabilities from GET /v1/agents/tool-capabilities; name is the identifier the LLM calls.

POST /v1/agents/:agent_id/tools
curl -X POST https://api.speechify.ai/v1/agents/agent_01jqr8x9zg5k2m3n4p5q6r7s8t/tools \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "kind": "builtin",
    "name": "Example name",
    "config": {
      "builtin": "example",
      "builtin_config": {},
      "params": [
        {
          "description": "Example description.",
          "name": "Example name",
          "required": true,
          "type": "string"
        }
      ]
    },
    "description": "Example description.",
    "enabled": true
  }'

With the end_call built-in, for example, the LLM calls end_call() and the room disconnects immediately.

transfer_to_number and play_keypad_touch_tone are SIP-dependent and return a clear error until phone-number support is enabled on your account.
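The request body above can also be assembled in code. The following is a minimal Python sketch, assuming the capability names have already been fetched from GET /v1/agents/tool-capabilities and reduced to a set of strings (that response shape is not shown on this page, so the fetch is left out). The helper name build_builtin_tool is hypothetical, and the capability names in the example are the ones mentioned in this section, assumed to match what the endpoint returns. The helper only builds the JSON body and rejects an unknown capability before any request is sent.

```python
import json


def build_builtin_tool(name: str, builtin: str, capabilities: set,
                       description: str = "") -> dict:
    """Assemble the body for POST /v1/agents/:agent_id/tools (kind "builtin").

    `capabilities` is assumed to be the set of capability names already
    obtained from GET /v1/agents/tool-capabilities.
    """
    if builtin not in capabilities:
        raise ValueError(f"unknown built-in capability: {builtin!r}")
    return {
        "kind": "builtin",
        "name": name,  # identifier the LLM calls
        "description": description,
        "enabled": True,
        # No parameters in this sketch; add entries to "params" as needed.
        "config": {"builtin": builtin, "builtin_config": {}, "params": []},
    }


# Names taken from this page; the authoritative list comes from the API.
caps = {"end_call", "transfer_to_number", "play_keypad_touch_tone"}
body = build_builtin_tool("end_call", "end_call", caps, "Hang up the call.")
print(json.dumps(body, indent=2))
```

Validating against the capability set locally is a design choice for earlier feedback; the API remains the authority on which values are accepted.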

Webhook tool

Create the tool. The worker signs a JSON envelope, POSTs it to your URL, and returns your response to the LLM.

POST /v1/agents/tool-definitions
curl -X POST https://api.speechify.ai/v1/agents/tool-definitions \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "lookup_order",
    "description": "Fetch order details by order ID.",
    "kind": "webhook",
    "config": {
      "headers": {
        "X-Org-ID": "acme"
      },
      "method": "POST",
      "params": [
        {
          "description": "Order ID",
          "name": "order_id",
          "required": true,
          "type": "string"
        }
      ],
      "timeout_ms": 5000,
      "url": "https://api.your-app.com/webhooks/lookup-order"
    }
  }'

The create response includes the HMAC signing secret once ("webhook_secret": "wh_sec_…"). Store it now - every later read returns a masked placeholder, and there is no retrieval endpoint.

Your endpoint receives a JSON envelope and replies 200 OK with JSON, which the agent uses as the tool’s return value:

// request body
{ "tool_call_id": "call_abc123", "tool_name": "lookup_order", "arguments": { "order_id": "ORD-42" }, "timestamp": 1713360000000 }

// your response
{ "status": "shipped", "tracking": "1ZW…" }
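The exchange above can be sketched as a plain function that takes the raw request body and returns the JSON to send back with 200 OK. This is a sketch under stated assumptions: handle_envelope and the in-memory ORDERS table are hypothetical, and the "error" replies are illustrative only, since this page does not define an error format for webhook tools.

```python
import json

# Hypothetical stand-in for your order database.
ORDERS = {"ORD-42": {"status": "shipped"}}


def handle_envelope(raw_body: bytes) -> dict:
    """Parse the webhook envelope and return the tool's return value."""
    envelope = json.loads(raw_body)
    if envelope["tool_name"] != "lookup_order":
        # Illustrative reply; no error format is documented on this page.
        return {"error": "unknown tool"}
    order_id = envelope["arguments"]["order_id"]
    order = ORDERS.get(order_id)
    if order is None:
        return {"error": f"order {order_id} not found"}
    return order


request_body = json.dumps({
    "tool_call_id": "call_abc123",
    "tool_name": "lookup_order",
    "arguments": {"order_id": "ORD-42"},
    "timestamp": 1713360000000,
}).encode()
print(handle_envelope(request_body))  # {'status': 'shipped'}
```

In a real endpoint, verify the signature against the raw bytes before parsing them.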

Each call carries a single combined signature header (Stripe/ElevenLabs format):

Header               Example
Speechify-Signature  t=1729425600,v0=<64-char hex>

t is the dispatch time in Unix seconds and v0 = HEX(HMAC_SHA256(secret, "<t>.<raw body>")) - sign over the t value, a literal ., then the raw body.

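import hashlib
import hmac
import time

def verify(headers, raw_body: bytes, secret: str) -> bool:
    """Check the Speechify-Signature header against the raw request body."""
    try:
        # Header format: t=<unix seconds>,v0=<hex digest>
        parts = dict(p.strip().split("=", 1) for p in headers["Speechify-Signature"].split(","))
        t, v0 = parts["t"], parts["v0"]
        age = abs(time.time() - int(t))
    except (KeyError, ValueError):
        return False  # header missing or malformed
    if age > 300:  # 5-minute replay window
        return False
    # Sign "<t>.<raw body>" with the webhook secret.
    expected = hmac.new(
        secret.encode(), f"{t}.".encode() + raw_body, hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(v0.encode(), expected.encode())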

Reject deliveries whose t is more than 5 minutes old to guard against replays. For method: "GET", arguments are sent as query parameters and the signature covers an envelope that isn’t on the wire - use POST for any endpoint you plan to verify.
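To exercise a verifier without waiting for a live call, you can build the header yourself from the signing formula described above. This is a minimal sketch: sign_envelope is a hypothetical test helper and not part of any Speechify SDK, and the secret shown is a made-up value for local testing.

```python
import hashlib
import hmac
import json
import time


def sign_envelope(secret: str, raw_body: bytes, t: int) -> str:
    """Build a header value: t=<t>,v0=HEX(HMAC_SHA256(secret, "<t>.<raw body>"))."""
    digest = hmac.new(secret.encode(), f"{t}.".encode() + raw_body, hashlib.sha256)
    return f"t={t},v0={digest.hexdigest()}"


secret = "wh_sec_example"  # made-up secret, for local testing only
raw_body = json.dumps({
    "tool_call_id": "call_abc123",
    "tool_name": "lookup_order",
    "arguments": {"order_id": "ORD-42"},
}).encode()
now = int(time.time())
header = sign_envelope(secret, raw_body, now)

# Changing even one byte of the body changes the signature for the same t.
tampered = sign_envelope(secret, raw_body + b" ", now)
print(header != tampered)  # True
```

Pass {"Speechify-Signature": header} together with raw_body to your verifier; it should accept the original body and reject a modified one.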

Client tool

Runs in the caller’s browser or SDK over the session’s tools data channel.

POST /v1/agents/tool-definitions
curl -X POST https://api.speechify.ai/v1/agents/tool-definitions \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "navigate_to",
    "description": "Scroll the page to a named section.",
    "kind": "client",
    "config": {
      "params": [
        {
          "description": "Section name",
          "enum": [
            "pricing",
            "docs",
            "contact"
          ],
          "name": "section",
          "required": true,
          "type": "string"
        }
      ],
      "timeout_ms": 4000
    }
  }'

Your client receives a tool_call and replies with a tool_response carrying the same tool_call_id on the same channel:

// agent → client
{ "type": "tool_call", "tool_call_id": "call_abc123", "tool_name": "navigate_to", "arguments": { "section": "pricing" } }

// client → agent
{ "type": "tool_response", "tool_call_id": "call_abc123", "result": { "ok": true } }
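Until an SDK helper is available, the message exchange above can be handled with a small dispatcher. The sketch below is transport-agnostic: it assumes you already receive the tool_call as a parsed dict and send the returned dict back on the same channel yourself. handle_tool_call and the handler table are hypothetical names, and reporting an unknown tool inside "result" is an assumed convention, since this page does not define an error shape.

```python
import json
from typing import Any, Callable, Dict

Handler = Callable[[dict], Any]


def handle_tool_call(message: dict, handlers: Dict[str, Handler]) -> dict:
    """Turn a tool_call message into a tool_response with the same tool_call_id."""
    if message.get("type") != "tool_call":
        raise ValueError("not a tool_call message")
    handler = handlers.get(message["tool_name"])
    if handler is None:
        # Assumed convention: report failures inside "result".
        result = {"ok": False, "error": f"no handler for {message['tool_name']}"}
    else:
        result = handler(message["arguments"])
    return {
        "type": "tool_response",
        "tool_call_id": message["tool_call_id"],  # must echo the call's ID
        "result": result,
    }


def navigate_to(arguments: dict) -> dict:
    # A real client would scroll to arguments["section"]; the sketch only acknowledges.
    return {"ok": True}


call = {"type": "tool_call", "tool_call_id": "call_abc123",
        "tool_name": "navigate_to", "arguments": {"section": "pricing"}}
response = handle_tool_call(call, {"navigate_to": navigate_to})
print(json.dumps(response))
```

Keep handlers fast: the tool definition above sets timeout_ms to 4000, so a handler that takes longer is likely to time out.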

The @speechify/agents-js SDK will wrap this with a single registerTool(name, handler) call - the reference will be linked here when it publishes.

MCP tool

A customer-hosted Model Context Protocol server. Create the tool with POST /v1/agents/tool-definitions using an MCPToolConfig body - pick a transport (http_streamable or sse) and an auth mode (none, bearer, or oauth2_client_credentials). The worker opens the configured transport at session start, discovers the remote server’s tool list, and proxies tool calls through.

See Tools for the configuration model and the API Reference for the full MCPToolConfig schema.
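As a rough illustration of the choices described above, the sketch below validates the transport and auth mode before building a request body. Everything about the body's shape is an assumption: the kind value "mcp" and the config keys url, transport, and auth are hypothetical placeholders, and build_mcp_tool is not a real SDK function. Check the MCPToolConfig schema in the API Reference for the actual field names before using anything like this.

```python
# Allowed values as listed on this page.
TRANSPORTS = {"http_streamable", "sse"}
AUTH_MODES = {"none", "bearer", "oauth2_client_credentials"}


def build_mcp_tool(name: str, url: str, transport: str, auth_mode: str) -> dict:
    """Validate the documented choices and return a hypothetical request body."""
    if transport not in TRANSPORTS:
        raise ValueError(f"transport must be one of {sorted(TRANSPORTS)}")
    if auth_mode not in AUTH_MODES:
        raise ValueError(f"auth mode must be one of {sorted(AUTH_MODES)}")
    return {
        "name": name,
        "kind": "mcp",  # assumed kind value
        "config": {
            # Hypothetical key names; the real ones are in MCPToolConfig.
            "url": url,
            "transport": transport,
            "auth": {"mode": auth_mode},
        },
    }


body = build_mcp_tool("crm_tools", "https://mcp.example.com", "sse", "none")
print(body["config"]["transport"])  # sse
```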

Attach to an agent

A webhook or client tool must be attached before the LLM can call it.

# Attach
curl -X PUT https://api.speechify.ai/v1/agents/$AGENT_ID/tools/$TOOL_ID \
  -H "Authorization: Bearer $SPEECHIFY_API_KEY"

# Detach
curl -X DELETE https://api.speechify.ai/v1/agents/$AGENT_ID/tools/$TOOL_ID \
  -H "Authorization: Bearer $SPEECHIFY_API_KEY"

# List attached
curl https://api.speechify.ai/v1/agents/$AGENT_ID/tools \
  -H "Authorization: Bearer $SPEECHIFY_API_KEY"
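The same three calls can be prepared from Python with the standard library. This is a minimal sketch: tool_request is a hypothetical helper, the IDs and key are placeholders, and the requests are only constructed here, not sent, so the response format is not assumed.

```python
import urllib.request
from typing import Optional

BASE_URL = "https://api.speechify.ai/v1"


def tool_request(method: str, api_key: str, agent_id: str,
                 tool_id: Optional[str] = None) -> urllib.request.Request:
    """Build an attach (PUT), detach (DELETE), or list (GET) request."""
    url = f"{BASE_URL}/agents/{agent_id}/tools"
    if tool_id is not None:
        url += f"/{tool_id}"
    return urllib.request.Request(
        url, method=method, headers={"Authorization": f"Bearer {api_key}"}
    )


# Placeholder IDs and key; substitute your own values.
attach = tool_request("PUT", "API_KEY", "AGENT_ID", "TOOL_ID")
detach = tool_request("DELETE", "API_KEY", "AGENT_ID", "TOOL_ID")
listing = tool_request("GET", "API_KEY", "AGENT_ID")

for req in (attach, detach, listing):
    print(req.get_method(), req.full_url)
# To send one of them: urllib.request.urlopen(attach)
```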