Create Chat Completion
Authentication
Enter your API key with the Bearer prefix, e.g. ‘Bearer sk_…’.
Headers
Request
The route to run. waymark-fast favors latency, waymark-moa
balances quality and cost, and waymark-max runs the widest panel
for the highest quality. Access to the higher routes depends on your
plan.
The conversation so far, in OpenAI chat-message format.
When true, the answer is streamed back as a text/event-stream of
server-sent events instead of a single JSON response. Defaults to
false.
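A minimal sketch of assembling a request from the three parameters described above. The body field names `model`, `messages`, and `stream` are assumed from the OpenAI chat-completions convention (only `stream` is named on this page), and the `Authorization` header name is likewise an assumption. The endpoint URL is not given here, so the sketch only builds the headers and body; `build_chat_request` is a hypothetical helper.

```python
import json

def build_chat_request(api_key, route, messages, stream=False):
    """Return (headers, body_bytes) for a chat completion request.

    Assumption: the body uses the OpenAI field names "model", "messages",
    and "stream", and the key travels in an "Authorization" header.
    """
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    payload = {
        "model": route,        # e.g. "waymark-fast", "waymark-moa", "waymark-max"
        "messages": messages,  # conversation so far, OpenAI chat-message format
        "stream": stream,      # false unless the caller opts in to streaming
    }
    return headers, json.dumps(payload).encode("utf-8")
```

The returned pair can be handed to any HTTP client; passing `stream=True` only changes the body, so the caller still has to read the response as server-sent events.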
Response headers
The route that served a chat completion (e.g. waymark-moa), after any
in-gateway escalation. Mirrors the waymark.route field in the body.
Request-rate budget: the maximum number of requests in the current
window (the bucket capacity). This header uses the IETF-draft
un-prefixed name (RateLimit-Limit); the legacy alias X-RateLimit-Limit
carries the same value. Sent on every response.
Request-rate budget: requests left in the current window. Legacy
alias: X-RateLimit-Remaining.
Request-rate budget: integer delta-seconds until the window fully
refills (same unit as Retry-After). Legacy alias:
X-RateLimit-Reset.
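The three rate-limit headers can be read together on the client, as in the sketch below. It assumes the un-prefixed names are `RateLimit-Limit`, `RateLimit-Remaining`, and `RateLimit-Reset` (inferred from the legacy `X-RateLimit-*` aliases) and that each value is a plain integer; `read_rate_limit` is a hypothetical helper, not part of any SDK.

```python
def read_rate_limit(headers):
    """Extract the request-rate budget from a response-header mapping.

    Assumption: the un-prefixed names are RateLimit-Limit,
    RateLimit-Remaining, and RateLimit-Reset, each holding an integer.
    """
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}

    def pick(suffix):
        # Prefer the un-prefixed name, then fall back to the legacy alias.
        raw = lowered.get(f"ratelimit-{suffix}")
        if raw is None:
            raw = lowered.get(f"x-ratelimit-{suffix}")
        return int(raw) if raw is not None else None

    return {
        "limit": pick("limit"),
        "remaining": pick("remaining"),
        "reset_seconds": pick("reset"),
    }
```

A caller might, for example, wait `reset_seconds` before sending the next request once `remaining` reaches zero.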
Response
The chat completion. A single JSON object by default; when the
request sets stream: true, the response is a text/event-stream of
server-sent events whose final data chunk before [DONE] carries the
waymark usage object.
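For the streaming case, a minimal sketch of picking the waymark usage object out of the event stream follows. It assumes each event's payload sits on a single `data:` line and that the usage object lives under a top-level `waymark` key in the final chunk; the chunk shape in the usage example is illustrative only, and both function names are hypothetical.

```python
import json

def iter_sse_data(lines):
    """Yield the decoded JSON payload of each `data:` line until [DONE]."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return
        yield json.loads(data)

def last_waymark_usage(lines):
    """Return the `waymark` object from the final chunk before [DONE], if any.

    Assumption: the usage object is a top-level "waymark" key in that chunk.
    """
    last = None
    for chunk in iter_sse_data(lines):
        last = chunk
    return last.get("waymark") if last else None
```

Here `lines` can be any iterable of decoded text lines, such as the line iterator of a streaming HTTP response. With the illustrative input `['data: {"choices": []}', 'data: {"choices": [], "waymark": {"route": "waymark-moa"}}', 'data: [DONE]']`, the helper returns the dictionary from the second chunk.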
The object type, always chat.completion.
Unix timestamp (seconds) of when the completion was created.
Standard OpenAI token-usage totals for the request.
Per-request routing and token breakdown. Reports the route taken, whether it escalated, and the input/output token counts for each upstream model that ran. Only token counts are reported; pricing and cost are not included.