# Router One API Documentation

Generated from the public OpenAPI schema.

Router One offers an **OpenAI-compatible** unified model API. Smart routing, automatic fallback, and cost controls let teams run LLM workloads in production safely, predictably, and economically.

> Calling LLMs directly is a black box. Calling LLMs through Router One gives you a ledger, a trace, and guardrails.



## Quick Start

Get started with the Router One API in three steps:

### 1. Get an API Key

Sign in to the [Router One console](https://router.one), create a project, and generate an API Key (format `sk-xxx`).

### 2. Send your first request

```bash
curl https://api.router.one/v1/chat/completions \
  -H "Authorization: Bearer sk-your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```

Set `model` to `auto` and Router One picks the best model based on your routing weights. You can also pin a specific model such as `gpt-4o` or `claude-sonnet-4-20250514`.
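
For readers who prefer Python over curl, the same request can be sketched with the standard library alone. This is a minimal illustration rather than an official client: the helper names `build_chat_request` and `send` are hypothetical, and the 60-second timeout is an arbitrary choice.

```python
import json
import urllib.request

BASE_URL = "https://api.router.one"  # production base URL listed on this page


def build_chat_request(api_key: str, prompt: str, model: str = "auto") -> urllib.request.Request:
    """Build (but do not send) a POST /v1/chat/completions request."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `send(build_chat_request("sk-your-api-key", "Hello"))` performs the network call; keeping request construction separate makes the headers and body easy to inspect before anything is sent.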

### 3. Handle the response

```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "gpt-4o",
  "choices": [{
    "index": 0,
    "message": { "role": "assistant", "content": "Hello! How can I help you?" },
    "finish_reason": "stop"
  }],
  "usage": { "prompt_tokens": 9, "completion_tokens": 12, "total_tokens": 21 }
}
```



## Authentication

Every API request must include a bearer token in the `Authorization` header:

```
Authorization: Bearer sk-your-api-key
```

API Keys are created and managed in the project settings of the [Router One console](https://router.one). Each key can have its own budget and QPS limits.



## Base URL

| Environment | URL |
|------|------|
| **Production** | `https://api.router.one` |
| **Local development** | `http://localhost:8080` |



## Request Format

- All requests use **JSON** (`Content-Type: application/json`)
- The API is compatible with the **OpenAI Chat Completions** format, so existing code only needs to swap `base_url`
- Both streaming (SSE) and non-streaming response modes are supported
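
In streaming mode the client reads the body line by line instead of parsing one JSON object. The sketch below assumes the stream follows the usual OpenAI convention, which this page does not spell out: each event is a `data: <json>` line, text fragments sit in `choices[0].delta.content`, and the stream ends with `data: [DONE]`. The function name `iter_stream_text` is hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from an OpenAI-style SSE stream.

    Assumption: events are `data: <json>` lines, text lives in
    choices[0].delta.content, and `data: [DONE]` terminates the stream.
    """
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        # A usage-only final chunk may carry an empty choices list.
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


sample = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"index": 0, "delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
print("".join(iter_stream_text(sample)))  # Hello
```

In real use, `lines` would be the decoded lines of the HTTP response body rather than a hard-coded list.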



## Error Handling

The API returns standard HTTP status codes; error responses contain a structured error object:

```json
{
  "error": {
    "message": "Invalid API key",
    "type": "authentication_error",
    "code": "invalid_api_key"
  }
}
```

| Status | Meaning | What to do |
|--------|------|----------|
| `401` | Invalid or missing API key | Check the Authorization header |
| `402` | Insufficient balance | Top up in the console or adjust your budget |
| `429` | Rate limit exceeded | Reduce request rate or raise your QPS limit |
| `500` | Internal server error | Retry shortly; contact support if it persists |
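
One way to act on the table above is to retry only the statuses it marks as transient (`429` and `500`) and to surface the rest immediately. The sketch below is an assumption about a sensible client policy, not documented Router One behavior; the exponential backoff with jitter and all helper names are hypothetical.

```python
import random

# Statuses the table suggests waiting out; 401 and 402 need a configuration fix instead.
RETRYABLE_STATUSES = {429, 500}


def should_retry(status: int) -> bool:
    """Return True when retrying the same request may succeed later."""
    return status in RETRYABLE_STATUSES


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Exponential backoff with full jitter: a random delay up to min(cap, base * 2**attempt)."""
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))


def describe_error(status: int, body: dict) -> str:
    """Format the structured error object into a single log line."""
    error = body.get("error", {})
    kind = error.get("type", "unknown")
    code = error.get("code", "unknown")
    return f"HTTP {status} {kind}/{code}: {error.get('message', '')}"


sample_body = {
    "error": {
        "message": "Invalid API key",
        "type": "authentication_error",
        "code": "invalid_api_key",
    }
}
print(describe_error(401, sample_body))
```

A caller would typically loop a few times, sleeping for `backoff_delay(attempt)` between tries while `should_retry(status)` holds, then give up and report `describe_error(...)`.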

## Base URLs

- OpenAI-compatible API / Codex CLI base URL: `https://api.router.one/v1`
- Claude Code / Anthropic-compatible endpoint: `https://api.router.one`
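
The two client base URLs differ because of how each client family usually builds its request paths: OpenAI-style clients append `/chat/completions` to a base URL that already ends in `/v1`, while Anthropic-style clients append `/v1/messages` to a bare host. The sketch below illustrates that convention; the `resolve` helper is hypothetical.

```python
OPENAI_COMPATIBLE_BASE_URL = "https://api.router.one/v1"
ANTHROPIC_COMPATIBLE_BASE_URL = "https://api.router.one"


def resolve(base_url: str, path: str) -> str:
    """Join a base URL and a path without doubling or dropping slashes."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


# Both client styles end up on the endpoints documented in the reference below.
chat_url = resolve(OPENAI_COMPATIBLE_BASE_URL, "/chat/completions")
messages_url = resolve(ANTHROPIC_COMPATIBLE_BASE_URL, "/v1/messages")
print(chat_url)      # https://api.router.one/v1/chat/completions
print(messages_url)  # https://api.router.one/v1/messages
```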

## Endpoint reference

### POST `/v1/chat/completions`

**Create Chat Completion**

- Operation ID: `createChatCompletion`
- Tags: Chat

Create a chat completion. OpenAI Chat Completions API compatible; supports streaming and non-streaming responses.

When `model` is `auto`, Router One picks the best model based on the active routing strategy.

#### Request body

- Required: Yes
- Content types: `application/json`

##### Fields

| Field | Type | Required | Description |
|---|---|---|---|
| `model` | string | Yes | Model ID. Set to `auto` to let Router One's smart routing pick the best model, or specify a model such as `gpt-4o` or `claude-sonnet-4-20250514`. Example: `auto` |
| `messages` | array<ChatMessage> | Yes | Chat messages, in chronological order. |
| `stream` | boolean | No | Whether to enable streaming response. When enabled, returns an SSE event stream. Default: `false` |
| `temperature` | number | No | Sampling temperature, range 0-2. Higher values (e.g. 0.8) make output more random; lower values (e.g. 0.2) make it more deterministic. Default: `1` |
| `max_tokens` | integer | No | Maximum number of tokens to generate. |
| `top_p` | number | No | Nucleus sampling parameter. The model samples only from the smallest set of tokens whose cumulative probability reaches `top_p`. Default: `1` |
| `stream_options` | object | No | Streaming response options. Only valid when `stream: true`. |
| `stream_options.include_usage` | boolean | No | Whether to include usage info in the final chunk of the streaming response. Default: `false` |
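
The constraints in the table can be checked on the client before a request is sent. The following is a minimal sketch of such a check: `build_chat_payload` is a hypothetical helper, and rejecting an empty `messages` list is an assumption rather than a documented rule.

```python
from typing import Optional


def build_chat_payload(
    messages: list,
    model: str = "auto",
    stream: bool = False,
    temperature: Optional[float] = None,
    max_tokens: Optional[int] = None,
    top_p: Optional[float] = None,
    include_usage: bool = False,
) -> dict:
    """Assemble a /v1/chat/completions body, omitting unset optional fields."""
    if not messages:
        raise ValueError("messages is required and must not be empty")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be within 0-2")
    if include_usage and not stream:
        raise ValueError("stream_options is only valid when stream is true")

    payload = {"model": model, "messages": messages}
    if stream:
        payload["stream"] = True
    if include_usage:
        payload["stream_options"] = {"include_usage": True}
    # Leave the server-side defaults in place for anything the caller did not set.
    for name, value in (("temperature", temperature), ("max_tokens", max_tokens), ("top_p", top_p)):
        if value is not None:
            payload[name] = value
    return payload


payload = build_chat_payload(
    [{"role": "user", "content": "Write a short poem"}],
    stream=True,
    include_usage=True,
    temperature=0.7,
)
```

Omitting unset fields keeps the request small and lets the documented defaults (`temperature` 1, `top_p` 1, `stream` false) apply.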

##### Examples

**Basic request**

```json
{
  "model": "auto",
  "messages": [
    {
      "role": "user",
      "content": "Hello"
    }
  ]
}
```

**With system prompt**

```json
{
  "model": "auto",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "Introduce Router One"
    }
  ],
  "temperature": 0.7,
  "max_tokens": 1024
}
```

**Streaming response**

```json
{
  "model": "auto",
  "stream": true,
  "messages": [
    {
      "role": "user",
      "content": "Write a short poem"
    }
  ]
}
```

#### Responses

| Status | Description | Schema |
|---|---|---|
| `200` | Successful chat completion. When `stream: false`, returns a JSON object; when `stream: true`, returns an SSE event stream. | ChatCompletionResponse |
| `401` | Authentication failure — invalid or missing API key | ErrorResponse |
| `402` | Insufficient balance | ErrorResponse |
| `429` | Rate limit exceeded | ErrorResponse |
| `500` | Internal server error | ErrorResponse |

##### Fields

| Field | Type | Required | Description |
|---|---|---|---|
| `id` | string | No | Unique identifier of the completion request |
| `object` | chat.completion | No | Object type, always `chat.completion` |
| `created` | integer | No | Creation Unix timestamp |
| `model` | string | No | ID of the model actually used |
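| `choices` | array<object> | No | Generated completions; each entry carries `index`, `message`, and `finish_reason`. |
| `usage` | object | No | Token usage for the request. |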
| `usage.prompt_tokens` | integer | No | Tokens consumed by input |
| `usage.completion_tokens` | integer | No | Tokens consumed by output |
| `usage.total_tokens` | integer | No | Total tokens consumed |

##### Examples

```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 9,
    "completion_tokens": 12,
    "total_tokens": 21
  }
}
```
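
With `model` set to `auto`, the response's `model` field is the only place that reveals which model handled the request, so it is worth recording next to the token usage. The sketch below shows one way to pull those values out of the example response; `summarize_completion` is a hypothetical helper, and it reads optional fields defensively because the schema marks them as not required.

```python
def summarize_completion(response: dict) -> dict:
    """Extract the reply text, routed model, and token usage from a chat completion."""
    choices = response.get("choices") or []
    first = choices[0] if choices else {}
    usage = response.get("usage", {})
    return {
        "routed_model": response.get("model"),
        "text": first.get("message", {}).get("content"),
        "finish_reason": first.get("finish_reason"),
        "total_tokens": usage.get("total_tokens"),
    }


example = {
    "id": "chatcmpl-abc123",
    "object": "chat.completion",
    "created": 1700000000,
    "model": "gpt-4o",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Hello! How can I help you?"},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 9, "completion_tokens": 12, "total_tokens": 21},
}
summary = summarize_completion(example)
print(summary["routed_model"], summary["total_tokens"])  # gpt-4o 21
```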

### POST `/v1/messages`

**Create Message**

- Operation ID: `createMessage`
- Tags: Chat

Create a message in the Claude / Anthropic Messages API format. Use this endpoint for clients such as Claude Code that speak the Messages API; both streaming and non-streaming responses are supported.

#### Request body

- Required: Yes
- Content types: `application/json`

##### Fields

| Field | Type | Required | Description |
|---|---|---|---|
| `model` | string | Yes | Model ID. Set to `auto` for Router One routing, or specify a concrete model. Example: `auto` |
| `messages` | array<AnthropicMessage> | Yes | Conversation messages in the Claude / Anthropic Messages format. |
| `system` | string | No | Optional system prompt. |
| `max_tokens` | integer | Yes | Maximum number of output tokens. |
| `stream` | boolean | No | Whether to enable streaming response. Default: `false` |
| `temperature` | number | No | Sampling temperature, range 0-2. Default: `1` |

##### Examples

**Basic request**

```json
{
  "model": "auto",
  "max_tokens": 1024,
  "messages": [
    {
      "role": "user",
      "content": "Hello"
    }
  ]
}
```

**Streaming response**

```json
{
  "model": "auto",
  "max_tokens": 1024,
  "stream": true,
  "messages": [
    {
      "role": "user",
      "content": "Write a short product intro"
    }
  ]
}
```

#### Responses

| Status | Description | Schema |
|---|---|---|
| `200` | Successful Messages API compatible response. | MessageResponse |
| `401` | Authentication failure — invalid or missing API key | ErrorResponse |
| `402` | Insufficient balance | ErrorResponse |
| `429` | Rate limit exceeded | ErrorResponse |
| `500` | Internal server error | ErrorResponse |

##### Fields

| Field | Type | Required | Description |
|---|---|---|---|
| `id` | string | No | Message ID. |
| `type` | message | No | Object type. |
| `role` | assistant | No | Response role. |
| `model` | string | No | ID of the model actually used. |
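| `content` | array<AnthropicContentBlock> | No | Content blocks of the reply, such as `text` blocks. |
| `stop_reason` | string | No | Reason generation stopped, e.g. `end_turn`. |
| `usage` | object | No | Token usage for the request. |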
| `usage.input_tokens` | integer | No | Input tokens. |
| `usage.output_tokens` | integer | No | Output tokens. |

##### Examples

```json
{
  "id": "msg_abc123",
  "type": "message",
  "role": "assistant",
  "model": "claude-sonnet-4-20250514",
  "content": [
    {
      "type": "text",
      "text": "Hello! How can I help you?"
    }
  ],
  "stop_reason": "end_turn",
  "usage": {
    "input_tokens": 9,
    "output_tokens": 12
  }
}
```
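
Unlike a chat completion, a Messages response carries its text inside a list of content blocks and reports no combined token total. The sketch below shows one way to handle both points; the helper names are hypothetical, and skipping blocks whose `type` is not `text` is an assumption about how a client might treat other block types.

```python
def message_text(response: dict) -> str:
    """Concatenate the text of every `text` content block in a Messages response."""
    return "".join(
        block.get("text", "")
        for block in response.get("content", [])
        if block.get("type") == "text"
    )


def message_total_tokens(response: dict) -> int:
    """Sum input and output tokens, since the response has no total field."""
    usage = response.get("usage", {})
    return usage.get("input_tokens", 0) + usage.get("output_tokens", 0)


example_message = {
    "id": "msg_abc123",
    "type": "message",
    "role": "assistant",
    "model": "claude-sonnet-4-20250514",
    "content": [{"type": "text", "text": "Hello! How can I help you?"}],
    "stop_reason": "end_turn",
    "usage": {"input_tokens": 9, "output_tokens": 12},
}
print(message_text(example_message))          # Hello! How can I help you?
print(message_total_tokens(example_message))  # 21
```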

### POST `/v1/responses`

**Create Response**

- Operation ID: `createResponse`
- Tags: Chat

Create a response in the OpenAI Responses API format. Use this endpoint for OpenAI-compatible clients that speak the Responses API; text input, instructions, and streaming are supported.

#### Request body

- Required: Yes
- Content types: `application/json`

##### Fields

| Field | Type | Required | Description |
|---|---|---|---|
| `model` | string | Yes | Model ID. Set to `auto` for Router One routing. Example: `auto` |
| `input` | string \| array<ResponseInputItem> | Yes | Input to the model: a plain text string or an array of input items. |
| `instructions` | string | No | Optional system instructions. |
| `stream` | boolean | No | Whether to enable streaming response. Default: `false` |
| `temperature` | number | No | Sampling temperature, range 0-2. Default: `1` |
| `max_output_tokens` | integer | No | Maximum number of output tokens. |

##### Examples

**Basic request**

```json
{
  "model": "auto",
  "input": "Introduce Router One in one sentence"
}
```

**With instructions**

```json
{
  "model": "auto",
  "instructions": "You are a concise technical documentation assistant.",
  "input": "Explain smart routing."
}
```

#### Responses

| Status | Description | Schema |
|---|---|---|
| `200` | Successful Responses API compatible response. | ResponsesResponse |
| `401` | Authentication failure — invalid or missing API key | ErrorResponse |
| `402` | Insufficient balance | ErrorResponse |
| `429` | Rate limit exceeded | ErrorResponse |
| `500` | Internal server error | ErrorResponse |

##### Fields

| Field | Type | Required | Description |
|---|---|---|---|
| `id` | string | No | Response ID. |
| `object` | response | No | Object type. |
| `created_at` | integer | No | Creation Unix timestamp. |
| `status` | string | No | Response status. |
| `model` | string | No | ID of the model actually used. |
| `output_text` | string | No | Aggregated text output. |
| `output` | array<ResponseInputItem> | No | Raw output items. |
| `usage` | object | No | Token usage for the request. |
| `usage.input_tokens` | integer | No | Input tokens. |
| `usage.output_tokens` | integer | No | Output tokens. |
| `usage.total_tokens` | integer | No | Total tokens. |

##### Examples

```json
{
  "id": "resp_abc123",
  "object": "response",
  "created_at": 1700000000,
  "status": "completed",
  "model": "gpt-4o",
  "output_text": "Router One is a unified LLM API gateway.",
  "output": [
    {
      "role": "assistant",
      "content": [
        {
          "type": "output_text",
          "text": "Router One is a unified LLM API gateway."
        }
      ]
    }
  ],
  "usage": {
    "input_tokens": 12,
    "output_tokens": 10,
    "total_tokens": 22
  }
}
```
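
Because every response field is optional in the schema, a client may want to prefer `output_text` and fall back to the raw `output` items when it is absent. The sketch below illustrates that fallback; `response_text` is a hypothetical helper, and the fallback order is a design assumption rather than documented behavior.

```python
def response_text(response: dict) -> str:
    """Return the aggregated text, rebuilding it from `output` items if needed."""
    text = response.get("output_text")
    if text:
        return text
    parts = []
    for item in response.get("output", []):
        for block in item.get("content", []):
            if block.get("type") == "output_text":
                parts.append(block.get("text", ""))
    return "".join(parts)


example_response = {
    "id": "resp_abc123",
    "object": "response",
    "status": "completed",
    "model": "gpt-4o",
    "output": [
        {
            "role": "assistant",
            "content": [
                {"type": "output_text", "text": "Router One is a unified LLM API gateway."}
            ],
        }
    ],
}
print(response_text(example_response))  # Router One is a unified LLM API gateway.
```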

### POST `/v1/images/generations`

**Create Image Generation**

- Operation ID: `createImageGeneration`
- Tags: Images

Generate images from a text prompt. This is a synchronous endpoint; the request returns once generation completes. Typical generation takes 5-30 seconds — set a client HTTP timeout of at least 60 seconds.

The length of the `data` array in the response equals the number of images actually generated, and billing is based on that count. When `response_format` is `url`, the returned image URLs expire (typically after 1 hour); download or re-host them if you need them long-term.

#### Request body

- Required: Yes
- Content types: `application/json`

##### Fields

| Field | Type | Required | Description |
|---|---|---|---|
| `model` | string | Yes | Model ID. Choose a model that supports image generation; check the console model marketplace. Example: `image-default` |
| `prompt` | string | Yes | Text prompt for image generation. More specific and visually evocative prompts usually produce better results. Recommended length under 4000 characters. |
| `n` | integer | No | Number of images to generate in this request. Billed per image. Default: `1` |
| `size` | string | No | Image size in pixels, formatted as `{width}x{height}`. Common values: `1024x1024`, `512x512`, `1792x1024`, `1024x1792`. Supported ranges depend on the chosen model. Example: `1024x1024` |
| `quality` | standard \| hd | No | Image quality. `hd` is typically higher-resolution and more detailed but may take longer to generate and cost more. Default: `standard` |
| `response_format` | url \| b64_json | No | How the image is returned.<br>- `url` (default): a CDN URL valid for about 1 hour; download or re-host as needed<br>- `b64_json`: base64-encoded image bytes in the response; larger body, no extra download<br>Default: `url` |

##### Examples

**Basic request**

```json
{
  "model": "image-default",
  "prompt": "An orange kitten wearing an astronaut helmet, floating in starry space, cinematic lighting"
}
```

**With size and quality**

```json
{
  "model": "image-default",
  "prompt": "Minimalist poster — black coffee cup on a yellow background",
  "n": 2,
  "size": "1024x1024",
  "quality": "hd",
  "response_format": "b64_json"
}
```

#### Responses

| Status | Description | Schema |
|---|---|---|
| `200` | Generation succeeded | ImageGenerationResponse |
| `400` | Invalid request — model not found, invalid field format, or missing prompt | ErrorResponse |
| `401` | Authentication failure — invalid or missing API key | ErrorResponse |
| `402` | Insufficient balance | ErrorResponse |
| `429` | Rate limit exceeded or API key spending cap reached | ErrorResponse |
| `500` | Internal server error | ErrorResponse |
| `503` | Service temporarily unavailable | ErrorResponse |

##### Fields

| Field | Type | Required | Description |
|---|---|---|---|
| `created` | integer | No | Unix timestamp (seconds) when generation completed |
| `data` | array<ImageData> | No | List of generated images; length equals the number actually generated. |

##### Examples

```json
{
  "created": 1700000000,
  "data": [
    {
      "url": "https://cdn.router.one/img/abc123.png",
      "revised_prompt": "An orange kitten wearing an astronaut helmet, floating in starry space, cinematic lighting"
    }
  ]
}
```
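
A client usually needs to handle both return formats and to know how many images it was billed for. The sketch below shows one way to do that. It assumes that with `response_format: "b64_json"` each `data` item carries the bytes under a `b64_json` key, mirroring the format name; this page only shows the `url` shape, so treat that key as an assumption. `collect_images` is a hypothetical helper.

```python
import base64
from typing import List, Tuple


def collect_images(response: dict) -> Tuple[List[str], List[bytes], int]:
    """Split generated images into URLs and decoded bytes, plus the billed count."""
    urls: List[str] = []
    blobs: List[bytes] = []
    items = response.get("data", [])
    for item in items:
        if "url" in item:
            urls.append(item["url"])  # expires, so download or re-host soon
        elif "b64_json" in item:  # assumed key name for base64 payloads
            blobs.append(base64.b64decode(item["b64_json"]))
    # The length of `data` is the number of images actually generated and billed.
    return urls, blobs, len(items)


url_response = {
    "created": 1700000000,
    "data": [{"url": "https://cdn.router.one/img/abc123.png"}],
}
urls, blobs, billed = collect_images(url_response)
print(urls, billed)
```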

### POST `/v1/videos/generations`

**Submit Video Generation**

- Operation ID: `submitVideoGeneration`
- Tags: Videos

Submit a video generation task. Video generation takes a while (typically 30 seconds to several minutes depending on the model and duration), so it uses an async task pattern:

1. Call this endpoint to submit the task; on success, returns `202 Accepted` with a `task_id`.
2. Use the `task_id` with `GET /v1/videos/generations/{task_id}` to poll the status.
3. When `status` becomes `completed`, read the video `url` from the response.

**Recommended**: poll no more often than once every 3 seconds. Do not apply a single HTTP timeout to the whole generation flow; set short timeouts (e.g. 30 s) on the individual submit and poll requests instead.

**Reference images**: `image_url` / `image_urls` accept HTTP(S) URLs, each image up to 20 MB. After submission, Router One downloads and re-hosts the images securely, so even short-lived user-uploaded URLs work.
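
The submit-then-poll pattern described above can be sketched as a small loop. To stay independent of any HTTP library, the sketch takes the status lookup as a callable: `fetch_status` is a hypothetical function that performs `GET /v1/videos/generations/{task_id}` and returns the decoded JSON. The optional `max_polls` cap is an added safety assumption, not part of the API.

```python
import time
from typing import Callable, Optional

MIN_POLL_INTERVAL = 3.0  # seconds, per the recommendation above


def wait_for_video(
    task_id: str,
    fetch_status: Callable[[str], dict],
    poll_interval: float = MIN_POLL_INTERVAL,
    max_polls: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll a video task until it finishes and return the video URL."""
    interval = max(poll_interval, MIN_POLL_INTERVAL)  # never poll faster than 3 s
    polls = 0
    while True:
        status = fetch_status(task_id)
        polls += 1
        state = status.get("status")
        if state == "completed":
            return status["url"]
        if state == "failed":
            raise RuntimeError(status.get("error", "video generation failed"))
        if max_polls is not None and polls >= max_polls:
            raise TimeoutError(f"task {task_id} is still {state} after {polls} polls")
        sleep(interval)  # pending or processing: wait, then ask again
```

Passing `sleep` as a parameter keeps the loop easy to test without real delays. Each `fetch_status` call should carry its own short HTTP timeout.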

#### Request body

- Required: Yes
- Content types: `application/json`

##### Fields

| Field | Type | Required | Description |
|---|---|---|---|
| `model` | string | Yes | Model ID. Choose a model that supports video generation; check the console model marketplace. Example: `video-default` |
| `prompt` | string | Yes | Text prompt for video generation, describing scene content, camera motion, style, etc. |
| `duration` | integer | No | Video duration in seconds. Allowed values depend on the model; common values are 4, 5, 6, 8 seconds. |
| `size` | string | No | Video resolution. Common values: `720p`, `1080p`. Supported range depends on the model. Example: `1080p` |
| `aspect_ratio` | 16:9 \| 9:16 \| 1:1 | No | Video aspect ratio. `16:9` for landscape, `9:16` for portrait short video, `1:1` for social-style square. |
| `negative_prompt` | string | No | Negative prompt — elements to avoid in the output. Optional. |
| `image_url` | string (uri) | No | Reference image URL (for image-to-video). Must be HTTP(S) reachable; up to 20 MB per image. Router One downloads and securely re-hosts it. |
| `image_urls` | array<string (uri)> | No | Multi-reference image URLs, up to 3. Same constraints per image as `image_url`. |
| `input_reference` | string | No | Model-specific reference input identifier. Only used when the selected model explicitly requires it. |
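
Several limits in this table can be enforced before submitting, which avoids a round trip that would end in `400`. The sketch below is a hypothetical helper, not part of any SDK. It checks the aspect-ratio values and the three-image limit, and it sends a single reference image as `image_url` and several as `image_urls`, mirroring the examples on this page.

```python
from typing import List, Optional

ASPECT_RATIOS = {"16:9", "9:16", "1:1"}
MAX_REFERENCE_IMAGES = 3


def build_video_payload(
    model: str,
    prompt: str,
    duration: Optional[int] = None,
    size: Optional[str] = None,
    aspect_ratio: Optional[str] = None,
    reference_images: Optional[List[str]] = None,
) -> dict:
    """Assemble a /v1/videos/generations body with basic client-side checks."""
    if aspect_ratio is not None and aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {sorted(ASPECT_RATIOS)}")
    images = list(reference_images or [])
    if len(images) > MAX_REFERENCE_IMAGES:
        raise ValueError("at most 3 reference images are accepted")
    for url in images:
        if not url.startswith(("http://", "https://")):
            raise ValueError(f"reference image must be an HTTP(S) URL: {url}")

    payload = {"model": model, "prompt": prompt}
    for name, value in (("duration", duration), ("size", size), ("aspect_ratio", aspect_ratio)):
        if value is not None:
            payload[name] = value
    if len(images) == 1:
        payload["image_url"] = images[0]
    elif images:
        payload["image_urls"] = images
    return payload
```

Model-specific limits such as allowed durations and resolutions are not checked here because they vary by model.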

##### Examples

**Text-to-video**

```json
{
  "model": "video-default",
  "prompt": "Waves lapping at the shore, dusk sunlight glittering on the water, slow motion",
  "duration": 5,
  "size": "1080p",
  "aspect_ratio": "16:9"
}
```

**Image-to-video (single image)**

```json
{
  "model": "video-default",
  "prompt": "Camera slowly pushes in; the person in frame turns their head slightly",
  "duration": 5,
  "image_url": "https://example.com/portrait.jpg"
}
```

**Multi-reference image-to-video**

```json
{
  "model": "video-default",
  "prompt": "Smooth transition from scene A to scene B",
  "duration": 8,
  "aspect_ratio": "9:16",
  "image_urls": [
    "https://example.com/scene-a.jpg",
    "https://example.com/scene-b.jpg"
  ]
}
```

#### Responses

| Status | Description | Schema |
|---|---|---|
| `202` | Task accepted; queued for generation | VideoSubmitResponse |
| `400` | Invalid request — model not found, reference image fetch failed, or parameters out of the model's allowed range | ErrorResponse |
| `401` | Authentication failure — invalid or missing API key | ErrorResponse |
| `402` | Insufficient balance | ErrorResponse |
| `429` | Rate limit exceeded or API key spending cap reached | ErrorResponse |
| `502` | Upstream service error | ErrorResponse |

### GET `/v1/videos/generations/{task_id}`

**Get Video Generation Status**

- Operation ID: `getVideoGeneration`
- Tags: Videos

Get the status of a video generation task. The `status` field can be:

- `pending` — task accepted, not yet started
- `processing` — generation in progress; `progress` is readable
- `completed` — generation complete; read `url` for the video file
- `failed` — generation failed; read `error` for the reason

Poll at intervals of at least 3 seconds. Video file URLs have an expiration (typically 24 hours); download or re-host them if you need them long-term.

#### Parameters

| Field | In | Type | Required | Description |
|---|---|---|---|---|
| `task_id` | path | string | Yes | The task identifier returned by the submit endpoint. Treat as an opaque string and pass it back as-is — do not parse its internal structure. |

#### Request body

This endpoint does not define a JSON request body.

#### Responses

| Status | Description | Schema |
|---|---|---|
| `200` | Query succeeded | VideoStatusResponse |
| `400` | Invalid task_id format | ErrorResponse |
| `401` | Authentication failure — invalid or missing API key | ErrorResponse |
| `404` | Task not found or expired | ErrorResponse |
| `502` | Upstream service error | ErrorResponse |

##### Fields

| Field | Type | Required | Description |
|---|---|---|---|
| `task_id` | string | Yes | Task identifier; matches the `task_id` returned when submitting. |
| `status` | pending \| processing \| completed \| failed | Yes | Task status.<br>- `pending`: accepted, not yet started<br>- `processing`: generating; `progress` is readable<br>- `completed`: complete; read `url` for the video<br>- `failed`: failed; read `error` for the reason |
| `url` | string | No | Generated video URL. Only returned when `status=completed`. The URL is typically valid for 24 hours — download or re-host promptly. |
| `error` | string | No | Human-readable failure description. Only returned when `status=failed`. |
| `progress` | integer | No | Generation progress percentage (0-100); some models may not return this. |

##### Examples

**Generating**

```json
{
  "task_id": "v_8f3a92c1d4e74b6ea0b5f1d29c7e8a01",
  "status": "processing",
  "progress": 45
}
```

**Completed**

```json
{
  "task_id": "v_8f3a92c1d4e74b6ea0b5f1d29c7e8a01",
  "status": "completed",
  "url": "https://cdn.router.one/video/abc123.mp4",
  "progress": 100
}
```

**Failed**

```json
{
  "task_id": "v_8f3a92c1d4e74b6ea0b5f1d29c7e8a01",
  "status": "failed",
  "error": "Content policy violation"
}
```

## Trust and methodology

- [Methodology](https://router.one/methodology) — how every public number on the site is measured.
- [Smart routing methodology](https://router.one/routing-methodology) — EWMA latency, fallback rules, provider scoring, and failure handling.
- [Pricing methodology](https://router.one/pricing-methodology) — no hidden markup on the pay-as-you-go token line; FX/channel fees shown at checkout.
- [Data retention](https://router.one/data-retention) — what Router One stores, what it does not store, and how long logs live.
- [Security](https://router.one/security) — transport security, API key handling, and upstream isolation.
- [SLA](https://router.one/sla) — availability language, fallback behavior, and enterprise contract scope.
- [China latency benchmark](https://router.one/benchmarks/china-latency) — public p50 and timeout snapshots for China access.

## Longer reference

- https://router.one/llms-full.txt

This page is generated dynamically on each request. When this Markdown page and the HTML docs differ, the OpenAPI schema and HTML docs are the source of truth.
