Run Agent Test

Enqueue a single run of the test. The returned run starts in queued status. Poll GET /v1/agents/tests/runs/{id} until the status reaches a terminal state (passed, failed, or error).
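
The polling loop described above could be sketched as follows. This is a minimal Python illustration, not official client code: poll_run, its parameters, and the injected fetch_run callable are hypothetical names, and the 2-second interval and 5-minute timeout are arbitrary defaults that this page does not specify. fetch_run is assumed to perform GET /v1/agents/tests/runs/{id} and return the decoded run object.

```python
import time
from typing import Any, Callable, Dict

# Terminal states as listed in this reference.
TERMINAL_STATES = {"passed", "failed", "error"}


def poll_run(
    fetch_run: Callable[[], Dict[str, Any]],
    interval_s: float = 2.0,      # assumed default, not from the API docs
    timeout_s: float = 300.0,     # assumed default, not from the API docs
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, Any]:
    """Call fetch_run repeatedly until the run reaches a terminal state."""
    deadline = clock() + timeout_s
    while True:
        run = fetch_run()
        if run["status"] in TERMINAL_STATES:
            return run
        if clock() >= deadline:
            raise TimeoutError("run did not reach a terminal state in time")
        sleep(interval_s)
```

Passing sleep and clock as parameters keeps the loop easy to exercise in tests without real waiting; in normal use the defaults are enough.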

Authentication

Authorization: Bearer

Enter your API key with the Bearer prefix, e.g. ‘Bearer sk_…’.
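
As a small illustration of the header format, the helper below builds the Authorization header from an API key. The function name auth_headers is hypothetical, and "sk_example" in the usage note is a placeholder, not a real key format guarantee.

```python
from typing import Dict


def auth_headers(api_key: str) -> Dict[str, str]:
    """Return the Authorization header with the Bearer prefix applied once."""
    key = api_key.strip()
    if not key:
        raise ValueError("API key must not be empty")
    # Avoid producing "Bearer Bearer ..." if the caller already added the prefix.
    value = key if key.startswith("Bearer ") else f"Bearer {key}"
    return {"Authorization": value}
```

For example, auth_headers("sk_example") and auth_headers("Bearer sk_example") both yield the same header value.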

Path parameters

id string Required

Request

This endpoint expects an object.
agent_id string Optional
Run the test against this agent instead of the test's default agent.
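
A sketch of how the request body might be built. The helper name run_request_body is hypothetical. It omits agent_id entirely when no override is wanted, on the assumption that an absent key selects the test's default agent; whether the API also accepts an explicit null is not stated here.

```python
import json
from typing import Optional


def run_request_body(agent_id: Optional[str] = None) -> bytes:
    """Serialize the JSON object this endpoint expects."""
    payload = {}
    if agent_id is not None:
        # Override the test's default agent for this run only.
        payload["agent_id"] = agent_id
    return json.dumps(payload).encode("utf-8")
```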

Response

The queued run.
id string

Prefixed wire identifier (run_<26 char Crockford base32>). URL paths accept only this prefixed form; legacy UUID path parameters are rejected with 404.

test_id string

Prefixed wire identifier (test_<26 char Crockford base32>) of the parent test.

agent_id string

Prefixed wire identifier (agent_<26 char Crockford base32>) of the agent this run executed against.
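
Because legacy UUID path parameters are rejected with 404, a client may want to check an identifier's shape before sending it. The following is a minimal sketch: is_wire_id is a hypothetical helper, and matching case-insensitively is an assumption (Crockford base32 excludes the letters I, L, O, and U; this page does not say whether the API emits a single case).

```python
import re

# Crockford base32 alphabet: digits plus letters, without I, L, O, U.
_CROCKFORD_26 = r"[0-9A-HJKMNP-TV-Z]{26}"


def is_wire_id(value: str, prefix: str) -> bool:
    """Check that value looks like <prefix>_<26 char Crockford base32>."""
    pattern = re.escape(prefix) + "_" + _CROCKFORD_26
    return re.fullmatch(pattern, value, flags=re.IGNORECASE) is not None
```

The prefix argument would be "run", "test", or "agent" for the three identifiers described above.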

status enum

Lifecycle of a test run: queued → running → terminal state.

Terminal states:

  • passed - the agent behaviour met the success criteria.
  • failed - the agent behaviour did not meet the success criteria.
  • error - the runner itself could not complete (LLM outage, network error, etc.). This is distinct from failed, which means the agent behaviour was judged and found lacking.
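
The difference between failed and error matters when deciding what to do with a finished run. The sketch below models the status values as an enum and maps each to a follow-up action. RunStatus, next_action, and the action labels are hypothetical, and retrying on error is one possible policy, not something this API prescribes.

```python
from enum import Enum


class RunStatus(str, Enum):
    QUEUED = "queued"
    RUNNING = "running"
    PASSED = "passed"
    FAILED = "failed"
    ERROR = "error"


TERMINAL = frozenset({RunStatus.PASSED, RunStatus.FAILED, RunStatus.ERROR})


def next_action(status: str) -> str:
    """Map a run status to a suggested client-side action."""
    parsed = RunStatus(status)  # raises ValueError on an unknown status
    if parsed not in TERMINAL:
        return "wait"               # still queued or running: keep polling
    if parsed is RunStatus.PASSED:
        return "accept"             # success criteria were met
    if parsed is RunStatus.FAILED:
        return "investigate-agent"  # the agent was judged and fell short
    return "retry-run"              # runner problem, not an agent verdict
```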
created_at datetime
started_at datetime or null
completed_at datetime or null
result object or null
Populated on terminal status only.
error string

Human-readable error message when status is error.

Errors

401 Unauthorized Error
404 Not Found Error
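
One way a client could surface these two responses is to map them to distinct exceptions. This is a sketch under assumptions: the exception classes and check_response are hypothetical names, and the messages reflect only what this page states (a missing or invalid API key for 401, an unknown or non-prefixed identifier for 404).

```python
class UnauthorizedError(Exception):
    """HTTP 401: the API key is missing or was not accepted."""


class NotFoundError(Exception):
    """HTTP 404: unknown identifier, or a legacy UUID was used in the path."""


def check_response(status_code: int) -> None:
    """Raise a specific exception for the documented error codes."""
    if status_code == 401:
        raise UnauthorizedError("check the Authorization: Bearer header")
    if status_code == 404:
        raise NotFoundError("check that the id uses the prefixed wire form")
    if status_code >= 400:
        # Codes not listed on this page fall through to a generic error.
        raise RuntimeError(f"unexpected HTTP status {status_code}")
```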