Question 1

How does smart routing decide which route to use?

Accepted Answer

Exact-model requests use the default model_name strategy. With model="auto", Router One can consider model-provider EWMA latency, posted cost, and reliability/success rate when selecting and retrying its server-owned candidates.

Question 2

Can I keep using one exact model?

Accepted Answer

Yes. Specify the model id in the standard OpenAI-compatible request. The default model_name strategy keeps model selection explicit.

Question 3

What happens when a provider degrades?

Accepted Answer

Reliability observations can influence model="auto" routing. An eligible exact-model failure can retry another provider route for the same model, while auto routing retries within its server-owned candidates. No fixed fallback time is promised.

Question 4

Will provider fallback change the requested model?

Accepted Answer

For exact-model requests, provider fallback keeps the model unchanged. With model="auto", model selection comes from the server-owned candidate set. The trace records the final model and provider route.

Question 5

Can I configure weights per project or API key?

Accepted Answer

The current public contract does not expose per-project or per-key latency, cost, or quality weights. Supported candidate routing uses gateway-managed signals.

Question 6

Where do I see the final routing result?

Accepted Answer

Every request appears in the dashboard with its final model and provider route, tokens, cost, latency, and status. The methodology behind the signals is documented on the routing methodology page.

Smart LLM routing across latency, cost, and reliability

The signals behind every decision

EWMA latency

Reliability

Posted cost

Routing mode boundary

Exact model selection

Same-model fallback

Predictable by design

Use model="auto" with the standard request shape

FAQ