Router One

Smart routing methodology

Smart routing is the load-bearing claim of Router One. This page documents the signals, weights, and fallback rules behind it, including the constraints that keep routing predictable.

Routing signals

Latency (EWMA)
Exponentially weighted moving average over the last 50 requests per provider, computed independently for each model family. The window is short enough to track real-world degradation, long enough not to overreact to a single slow request.
Error rate
Rate of 5xx responses and gateway timeouts, computed over the same rolling window. When the rate crosses a configured threshold, the provider is temporarily down-ranked.
Cost
Token-level cost of each candidate provider for the request's model family. Cost is used as a weight, not a hard rank, so high-quality providers can still win in latency-weighted projects even at higher cost.
Customer weights
Each project can set its own (latency, cost, quality) weights. The default profile favors latency in production projects and cost in development projects, but every project can override it.
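The signals above can be combined into a single ranking. The following is a minimal sketch, not Router One's actual scoring code: the `ProviderStats` fields, the `rank_providers` function, the span-to-alpha rule `2 / (span + 1)`, the 5% error threshold, and the 0.5 down-rank penalty are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class ProviderStats:
    """Rolling signals for one provider within one model family (hypothetical shape)."""
    ewma_latency_ms: float     # must be > 0
    error_rate: float          # share of 5xx / gateway timeouts, 0.0 to 1.0
    cost_per_1k_tokens: float  # must be > 0
    quality: float             # assumed 0.0 to 1.0 scale, higher is better


def update_ewma(previous_ms: float, sample_ms: float, span: int = 50) -> float:
    """Fold one latency sample into the running average.

    alpha = 2 / (span + 1) is a common span-to-alpha convention, assumed here.
    """
    alpha = 2.0 / (span + 1)
    return alpha * sample_ms + (1.0 - alpha) * previous_ms


def rank_providers(candidates, weights, error_threshold=0.05, penalty=0.5):
    """Return provider names, best first.

    Latency and cost are normalized against the best candidate so that each
    term lies in (0, 1]; cost is a weight, never a hard filter.
    """
    best_latency = min(s.ewma_latency_ms for s in candidates.values())
    best_cost = min(s.cost_per_1k_tokens for s in candidates.values())
    scores = {}
    for name, s in candidates.items():
        score = (
            weights["latency"] * (best_latency / s.ewma_latency_ms)
            + weights["cost"] * (best_cost / s.cost_per_1k_tokens)
            + weights["quality"] * s.quality
        )
        # Assumed down-ranking rule: scale the score when errors exceed the threshold.
        if s.error_rate > error_threshold:
            score *= penalty
        scores[name] = score
    return sorted(scores, key=scores.get, reverse=True)
```

With a latency-heavy profile such as `{"latency": 0.7, "cost": 0.1, "quality": 0.2}`, a faster but more expensive provider ranks first; swapping the latency and cost weights flips the order in favor of the cheaper one.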

Fallback behavior

Trigger
A request falls back on any of the following: a 5xx response, a network error, or a response time that exceeds the per-model timeout budget.
Same family only
Fallback only swaps between providers serving the same model family (GPT → GPT, Claude → Claude, Gemini → Gemini). Router One never silently downgrades a request to a different model.
Fallback latency
A typical end-to-end fallback adds less than 200 ms on top of the failing request. The fallback chain is recorded in the per-request trace, so customers see exactly what happened.
Bounded retries
Fallback is capped at one provider switch per request by default. Higher caps are available via per-project configuration.
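The fallback rules above can be sketched as a small loop. This is an illustration under stated assumptions, not the production implementation: `ProviderError`, `Attempt`, `AllProvidersFailed`, and `route_with_fallback` are hypothetical names, the `call` argument stands in for whatever issues the upstream request, and it is assumed to raise `ProviderError` on a 5xx response, a network error, or a blown timeout budget. The caller is assumed to pass only providers that serve the same model family.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


class ProviderError(Exception):
    """Hypothetical error raised on 5xx, network failure, or timeout."""

    def __init__(self, code: str):
        super().__init__(code)
        self.code = code


@dataclass
class Attempt:
    """One entry in the fallback chain recorded for the trace."""
    provider: str
    ok: bool
    latency_ms: float
    error_code: Optional[str] = None


class AllProvidersFailed(Exception):
    """Raised when every permitted attempt failed; carries the chain."""

    def __init__(self, chain):
        super().__init__("all permitted providers failed")
        self.chain = chain


def route_with_fallback(same_family_providers, call: Callable[[str], str],
                        max_switches: int = 1):
    """Try providers in ranked order, switching at most `max_switches` times."""
    chain = []
    # One initial attempt plus at most `max_switches` provider switches.
    for provider in same_family_providers[: max_switches + 1]:
        started = time.monotonic()
        try:
            response = call(provider)
        except ProviderError as err:
            elapsed_ms = (time.monotonic() - started) * 1000.0
            chain.append(Attempt(provider, False, elapsed_ms, err.code))
            continue
        elapsed_ms = (time.monotonic() - started) * 1000.0
        chain.append(Attempt(provider, True, elapsed_ms))
        return response, chain
    raise AllProvidersFailed(chain)
```

Because the candidate list is restricted to one model family before the loop starts, the sketch cannot switch to a different model; the default `max_switches=1` mirrors the default cap of one provider switch per request.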

What appears in your trace

Provider used
The upstream provider name, model variant, and routing decision for each request.
Fallback chain
If fallback occurred, the trace lists both the failed provider and the successful one, plus the error code and latency for the failed attempt.
Token counts and cost
Input, output, and cached-input token counts, plus the computed cost at the rate that applied at request time.
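To make the trace fields concrete, here is a rough sketch of what one record could look like and how its cost might be derived. The class and field names (`Rates`, `TraceRecord`, `failed_attempts`), the per-million-token pricing unit, and the assumption that cached-input tokens are counted separately from regular input tokens are all illustrative; the real trace schema may differ.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Rates:
    """Prices in USD per one million tokens (assumed unit)."""
    input: float
    output: float
    cached_input: float


@dataclass(frozen=True)
class TraceRecord:
    provider: str
    model_variant: str
    input_tokens: int
    output_tokens: int
    cached_input_tokens: int   # assumed to be counted separately from input_tokens
    rates: Rates               # snapshot of the rates in force at request time
    failed_attempts: Tuple[str, ...] = ()  # providers tried before this one

    @property
    def cost_usd(self) -> float:
        """Cost computed from the rate snapshot, not from current prices."""
        total = (
            self.input_tokens * self.rates.input
            + self.output_tokens * self.rates.output
            + self.cached_input_tokens * self.rates.cached_input
        )
        return total / 1_000_000
```

Storing the rate snapshot on the record keeps the computed cost stable even if provider pricing changes after the request completes.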

Per-project configuration

Weights
Set the relative importance of latency, cost, and quality. Projects that set no weights use the default profile; explicit overrides take precedence.
Disable fallback
Projects on enterprise contracts can disable fallback when they need a single provider for compliance or evaluation reasons.
Allowed providers
Customers can restrict a project to a subset of upstream providers, which is useful for data residency or vendor governance.
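As a sketch of how these three settings could interact, the snippet below models a project configuration and derives the list of providers a request may try. `ProjectConfig`, its field names, the default weight values, and `eligible_providers` are hypothetical; they are not the actual Router One configuration schema.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, Optional


def _default_weights():
    # Illustrative values only; the real default profile is not published here.
    return {"latency": 0.6, "cost": 0.2, "quality": 0.2}


@dataclass
class ProjectConfig:
    weights: dict = field(default_factory=_default_weights)
    fallback_enabled: bool = True
    max_provider_switches: int = 1
    allowed_providers: Optional[FrozenSet[str]] = None  # None means no restriction


def eligible_providers(config: ProjectConfig, ranked_candidates):
    """Filter a ranked, same-family candidate list down to what the project permits."""
    allowed = [
        name for name in ranked_candidates
        if config.allowed_providers is None or name in config.allowed_providers
    ]
    if not config.fallback_enabled:
        # Single-provider behavior: only the top-ranked allowed provider is tried.
        return allowed[:1]
    # One initial attempt plus the configured number of switches.
    return allowed[: config.max_provider_switches + 1]
```

For example, a project restricted to `frozenset({"b", "c"})` with fallback disabled would only ever try the higher-ranked of those two providers.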

FAQ

Can I disable fallback for a specific project?
Yes. Projects on enterprise contracts can turn fallback off when they require single-provider behavior. Pay-as-you-go projects use the default fallback configuration.
Is Router One swapping models silently?
No. Fallback is constrained to the same model family (GPT → GPT, Claude → Claude, Gemini → Gemini). The exact model variant used is recorded in the per-request trace.
How quickly do routing decisions adapt to new conditions?
Because latency is tracked as an EWMA over the last 50 requests per provider per model family, routing reacts within seconds to a sustained degradation but does not swing on a single anomalous slow request.
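A small numeric illustration of that last answer, assuming the common smoothing factor `alpha = 2 / (span + 1)` with a span of 50 (the exact factor Router One uses is not stated on this page). The helper name `ewma_after` and the 200 ms and 2000 ms figures are made up for the example.

```python
def ewma_after(samples_ms, start_ms, span=50):
    """Return the EWMA after folding in each sample, starting from start_ms."""
    alpha = 2.0 / (span + 1)  # assumed span-to-alpha convention
    value = start_ms
    for sample in samples_ms:
        value = alpha * sample + (1.0 - alpha) * value
    return value


baseline_ms = 200.0

# One anomalous 2000 ms request barely moves the average.
after_one_spike = ewma_after([2000.0], baseline_ms)

# Twenty-five consecutive 2000 ms requests move it most of the way.
after_sustained = ewma_after([2000.0] * 25, baseline_ms)

print(f"single spike: {after_one_spike:.1f} ms")
print(f"sustained:    {after_sustained:.1f} ms")
```

Under these assumptions a single outlier lifts a 200 ms average to roughly 271 ms, while 25 slow requests in a row push it above 1300 ms. How many seconds that takes in practice depends on the request volume for that provider and model family.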
