Create Message
Authentication
Enter your API key with the Bearer prefix, e.g. ‘Bearer sk_…’.
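As a minimal sketch of the Bearer convention described above, the helper below builds the request headers. The function name `auth_headers` is hypothetical, and the `Authorization` header name and JSON content type are assumptions, since this page only states that the key carries a Bearer prefix.

```python
def auth_headers(api_key: str) -> dict:
    """Build request headers carrying the API key with the Bearer prefix."""
    key = api_key.strip()
    # Drop an existing prefix so it is never doubled.
    if key.lower().startswith("bearer "):
        key = key[len("bearer "):].strip()
    if not key:
        raise ValueError("API key is empty")
    return {
        "Authorization": f"Bearer {key}",    # assumed header name
        "Content-Type": "application/json",  # assumed: JSON request body
    }
```

Accepting the key with or without the prefix avoids sending a malformed `Bearer Bearer sk_…` value when the key is pasted from a configuration file.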
Headers
Request
The route to run. waymark-fast favors latency, waymark-moa
balances quality and cost, and waymark-max runs the widest panel
for the highest quality. Access to the higher routes depends on your
plan.
A system prompt giving the model context and instructions: a plain string, or an array of Anthropic text blocks.
When true, the answer is streamed back as a text/event-stream of
Anthropic server-sent events instead of a single JSON response.
Defaults to false.
Amount of randomness injected into the response (0 to 1).
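The request parameters above could be assembled as in the following sketch. The JSON field names `model`, `system`, `stream`, and `temperature`, and the `{"type": "text", "text": ...}` block shape, are assumptions borrowed from the Anthropic Messages format; this page does not spell them out. `build_request` is a hypothetical helper, and it covers only the parameters described here, so any other fields the endpoint requires are omitted.

```python
# Route names as listed in the parameter description above.
ROUTES = ("waymark-fast", "waymark-moa", "waymark-max")

def build_request(route, system=None, stream=False, temperature=None):
    """Return a request body dict for the parameters documented above."""
    if route not in ROUTES:
        raise ValueError(f"unknown route: {route!r}")
    body = {"model": route, "stream": bool(stream)}  # assumed field names
    if system is not None:
        if isinstance(system, str):
            body["system"] = system  # plain-string form
        else:
            # Array form: one text block per string (assumed block shape).
            body["system"] = [{"type": "text", "text": t} for t in system]
    if temperature is not None:
        if not 0.0 <= temperature <= 1.0:
            raise ValueError("temperature must be between 0 and 1")
        body["temperature"] = temperature
    return body
```

Validating the temperature range on the client side is a design choice of this sketch; the service may apply its own checks.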
Response headers
The route that served a chat completion (e.g. waymark-moa), after any
in-gateway escalation. Mirrors the waymark.route field in the body.
Request-rate budget: the maximum number of requests in the current
window (the bucket capacity). This is the IETF-draft un-prefixed name; the
legacy alias X-RateLimit-Limit carries the same value. Sent on every
response.
Request-rate budget: requests left in the current window. Legacy
alias: X-RateLimit-Remaining.
Request-rate budget: integer delta-seconds until the window fully
refills (same unit as Retry-After). Legacy alias:
X-RateLimit-Reset.
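A client might read the three rate-limit headers above roughly as follows. The legacy `X-RateLimit-*` names come from this page; the un-prefixed names `RateLimit-Limit`, `RateLimit-Remaining`, and `RateLimit-Reset` are an assumption based on the IETF draft naming. The helper names are hypothetical.

```python
from typing import Mapping, Optional

def rate_limit_state(headers: Mapping[str, str]) -> dict:
    """Extract limit, remaining, and reset (delta-seconds) from response headers."""
    lowered = {k.lower(): v for k, v in headers.items()}  # header names are case-insensitive

    def pick(*names: str) -> Optional[int]:
        # Return the first header that is present and parses as an integer.
        for name in names:
            value = lowered.get(name.lower())
            if value is None:
                continue
            try:
                return int(value)
            except ValueError:
                continue
        return None

    return {
        "limit": pick("RateLimit-Limit", "X-RateLimit-Limit"),
        "remaining": pick("RateLimit-Remaining", "X-RateLimit-Remaining"),
        "reset_seconds": pick("RateLimit-Reset", "X-RateLimit-Reset"),
    }

def seconds_to_wait(state: dict) -> int:
    """Wait only when no requests remain; then wait for the full refill."""
    remaining = state["remaining"]
    if remaining is not None and remaining > 0:
        return 0
    return state["reset_seconds"] or 0
```

Waiting for the full reset is conservative: the value is the time until the window fully refills, so some capacity may return sooner.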
Response
The message. A single JSON object by default; when the request sets
stream: true, a text/event-stream of Anthropic server-sent
events whose message_delta frame carries the waymark usage
object.
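For the streamed form described above, a small parser can pull the usage data out of the message_delta frame. This is a sketch under assumptions: it expects the standard `event:` / `data:` line format of server-sent events with JSON payloads, and it assumes the usage object sits under a `usage` key in that frame, which this page does not confirm. The function names are hypothetical.

```python
import json

def iter_sse_events(lines):
    """Yield (event_name, payload) pairs from text/event-stream lines."""
    event, data = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event.
            if data:
                yield event, json.loads("\n".join(data))
            event, data = None, []
        elif line.startswith(":"):
            continue  # comment or keep-alive line
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].lstrip())
    if data:
        # Flush a final event that was not followed by a blank line.
        yield event, json.loads("\n".join(data))

def final_usage(lines):
    """Return the usage object from the last message_delta frame, if any."""
    usage = None
    for event, payload in iter_sse_events(lines):
        if event == "message_delta":
            usage = payload.get("usage", usage)  # assumed key name
    return usage
```

The parser takes any iterable of lines, so it can be fed a decoded HTTP response line by line or a list of strings in a test.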
The object type, always message.
The conversational role of the generated message, always assistant.
The reason generation stopped (e.g. end_turn, max_tokens,
stop_sequence); null while a streamed message is still in flight.
The custom stop sequence that was generated, if any; otherwise null.
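The response fields above can be checked with a short helper like the one below. The JSON key names `type`, `role`, `stop_reason`, and `stop_sequence` are assumptions taken from the Anthropic message format; the values come from the descriptions above. `summarize_stop` and its return labels are hypothetical.

```python
def summarize_stop(message: dict) -> str:
    """Classify why generation stopped, based on the fields described above."""
    if message.get("type") != "message" or message.get("role") != "assistant":
        raise ValueError("not an assistant message object")
    reason = message.get("stop_reason")  # assumed key name
    if reason is None:
        return "in_flight"  # streamed message still being generated
    if reason == "end_turn":
        return "complete"
    if reason == "max_tokens":
        return "truncated"
    if reason == "stop_sequence":
        return f"stopped on {message.get('stop_sequence')!r}"
    return f"other:{reason}"  # the list of reasons above is not exhaustive
```

Treating unknown reasons as a separate case, rather than an error, keeps the client working if further stop reasons exist.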
Anthropic token-usage totals for the request.
Per-request routing and token breakdown. Reports the route taken, whether it escalated, and the input/output token counts for each upstream model that ran. Token counts only; no pricing or cost is included.