Retry another provider when a model route fails

When an upstream provider returns a retryable error, Router One can try another healthy provider that serves the same exact model, when one is available. Your requested model ID stays unchanged; only the explicit model="auto" mode lets the server select from candidates that may span models or vendors. The customer-facing request trace shows the final model, provider, and request metrics, while support can correlate intermediate failed attempts in operational logs by request_id. Router One exposes one endpoint for 40+ models, reachable globally and from Mainland China.

Get your API key

Three things fallback gives you

Rate-limit survival (429 / 529)

When a 429 or 529 is classified as retryable, Router One can try another healthy provider serving the same requested model. If no compatible route is available or the retry also fails, your app still receives an error and should use bounded backoff.

Provider-outage failover (5xx / timeout)

Eligible provider 5xx errors and timeouts can be retried on another healthy provider for the same exact model. The retry only happens when a compatible provider route is available; it is not a zero-downtime guarantee.

Visibility (see the final route)

The customer-facing trace shows the final model and provider together with tokens, cost, latency, and status. It does not expose the failed-attempt chain; support can investigate intermediate attempts in operational logs using the request_id.

What triggers a failover

Fallback is conditional: a response must be eligible for retry, and another healthy provider must serve the exact requested model. These common conditions may make a request eligible:

Provider 5xx or timeout

An eligible 500-class error or connection/read timeout can move the call to another healthy provider serving the same model. If no compatible route can complete it, the request returns an error.

Overload / rate limit (429, 529)

An eligible 429 or 529 overload response can be retried on another compatible provider route. The customer trace shows only the final route and request metrics, not the intermediate retry decision.
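The eligibility conditions above can be sketched as a small classifier. This is only an illustration: the `UpstreamFailure` type and the `isRetryable` name are hypothetical, and the real eligibility rules are likely narrower, since not every 5xx or 429 is necessarily retryable.

```typescript
// Hypothetical shape for an upstream failure; not Router One's actual type.
type UpstreamFailure =
  | { kind: "http"; status: number }
  | { kind: "timeout" };

// Assumed rule: timeouts, 429, and any 5xx (which includes 529) are
// candidates for retry. The real service may apply stricter checks.
function isRetryable(failure: UpstreamFailure): boolean {
  if (failure.kind === "timeout") return true;
  const { status } = failure;
  return status === 429 || (status >= 500 && status <= 599);
}
```

Under this assumed rule, a 400 or 401 would be returned to the caller without trying another route.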

The retry concept

For an exact model ID, an eligible call can try healthy provider routes that serve that same model and return the first success. With model="auto", server-selected candidates may span models or vendors. You send one request; Router One handles eligible provider retries.

fallback.ts

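// exact model -> compatible provider routes -> first success
// (healthyRoutes, call, and isRetryable are conceptual helpers)
async function callWithFallback(request: { model: string }) {
  let lastError: unknown;
  for (const route of healthyRoutes(request.model)) {
    try {
      return await call(route, request);
    } catch (e) {
      if (!isRetryable(e)) throw e; // retry only 5xx, timeout, 429, 529
      lastError = e;
    }
  }
  // no compatible route completed the request: surface the last error
  throw lastError ?? new Error(`no healthy route for ${request.model}`);
}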
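The loop above leaves its helpers undefined. To try the idea in isolation, here is a self-contained sketch with stubbed providers. Every name in it (`Route`, `Caller`, `providerError`, `retryable`, `firstSuccess`, `stubCall`) is illustrative rather than a Router One internal, and timeouts are left out for brevity.

```typescript
type Route = { provider: string; model: string };
type Caller = (route: Route) => Promise<string>;

// Attach an HTTP-like status to an Error (illustrative helper).
const providerError = (status: number) =>
  Object.assign(new Error(`provider returned ${status}`), { status });

// Assumed rule: 429 and 5xx (including 529) are retryable.
function retryable(e: unknown): boolean {
  if (typeof e !== "object" || e === null) return false;
  const status = (e as { status?: unknown }).status;
  return (
    typeof status === "number" &&
    (status === 429 || (status >= 500 && status <= 599))
  );
}

// Try each route for the same exact model; return the first success.
async function firstSuccess(routes: Route[], call: Caller): Promise<string> {
  let lastError: unknown = new Error("no healthy route available");
  for (const route of routes) {
    try {
      return await call(route);
    } catch (e) {
      if (!retryable(e)) throw e; // non-retryable errors surface immediately
      lastError = e;
    }
  }
  throw lastError; // every compatible route failed
}

// Stub: provider "a" is overloaded, provider "b" answers.
const stubCall: Caller = async (route) => {
  if (route.provider === "a") throw providerError(529);
  return `served by ${route.provider}`;
};
```

Calling `firstSuccess` with routes for providers "a" and "b" resolves with the response from "b". With only provider "a" in the list, the 529 error is rethrown, which mirrors the case where no compatible route completes the request.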

Reliability that single-provider SDKs leave to you

Capability	Router One	Single-provider SDK
Failover on provider 5xx / timeout	Eligible retry on a healthy provider serving the same model	Request fails; you build retry yourself
429 / 529 overload handling	Conditional retry on a compatible healthy provider route	Returns the error to your app
Per-request final-route trace	Shows final model/provider, status, cost, latency	None
Reachable from Mainland China	China-friendly routing with published latency benchmark	Often blocked or inconsistent
One endpoint, many models	40+ supported models behind one OpenAI-compatible URL	One provider, one API surface

Reliable globally and from Mainland China

Reliability isn't only about provider errors; reachability matters too. Router One's endpoints are reachable globally and from Mainland China networks. Based on the China latency benchmark last updated 2026-05-15, Router One measured a p50 latency of 110-130 ms across Beijing, Shanghai, and Shenzhen; individual networks may vary.

View the China latency benchmark ->

FAQ

What triggers fallback?

An eligible provider 5xx, timeout, 429, or 529 can trigger a retry when another healthy provider serves the same requested model. Not every error is retryable, and a request can still fail when no compatible route completes it.

Does fallback cross model families?

For an exact model ID, provider retry keeps that exact model and does not silently swap GPT for Claude or Gemini. Only the explicit model="auto" mode allows the server to choose from candidates that may span models or vendors.
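From the client side, the two modes differ only in the `model` field of the request. The sketch below builds an OpenAI-style chat completion request without sending it. It assumes that the `/chat/completions` path and `Bearer` header of the OpenAI convention apply, since the endpoint is described as OpenAI-compatible. The `buildChatRequest` helper is hypothetical, and the base URL and model ID are left to the caller because this page does not state them.

```typescript
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

type BuiltRequest = {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
};

// Build an OpenAI-style chat completion request without sending it.
// baseUrl and apiKey come from the caller; no real values are assumed here.
function buildChatRequest(
  baseUrl: string,
  apiKey: string,
  model: string, // an exact model ID, or "auto" for server-side selection
  messages: ChatMessage[],
): BuiltRequest {
  return {
    // Trim trailing slashes so the path joins cleanly.
    url: `${baseUrl.replace(/\/+$/, "")}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ model, messages }),
    },
  };
}
```

Passing an exact model ID pins retries to providers serving that model, while passing `"auto"` opts in to server-side selection. The result can be handed to `fetch(built.url, built.init)`.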

How do I see which route served my request?

The dashboard trace shows the final model and provider together with tokens, cost, latency, and status. It does not show failed attempts or a fallback chain. Give support the request_id when intermediate provider attempts need investigation.

Does fallback add latency?

A retry adds time before the final provider responds, so a retried request can be slower than a clean first-try success. The customer-facing trace reports the request's final metrics, but it does not break latency down by provider attempt.

What about rate limits?

When a route returns an eligible 429 or 529, Router One can retry on another healthy provider serving the same model. If no compatible route completes the request, the error is returned to your app, so use bounded exponential backoff rather than assuming zero downtime.
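A minimal sketch of bounded exponential backoff on the client side follows. The helper names (`backoffDelayMs`, `withBackoff`) and the default numbers (4 attempts, 250 ms base, 8 s cap) are illustrative assumptions, not recommended values from Router One. The random jitter is a common addition to spread out retries.

```typescript
// Delay ceiling for a given attempt: baseMs * 2^attempt, capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 250, maxMs = 8000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Run `op` up to maxAttempts times in total, waiting between tries.
// shouldRetry decides whether an error is worth another attempt.
async function withBackoff<T>(
  op: () => Promise<T>,
  shouldRetry: (e: unknown) => boolean,
  maxAttempts = 4,
  baseMs = 250,
  maxMs = 8000,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await op();
    } catch (e) {
      const isLast = attempt >= maxAttempts - 1;
      if (isLast || !shouldRetry(e)) throw e; // bounded: give up eventually
      // Full jitter: wait a random time up to the capped exponential delay.
      await sleep(Math.random() * backoffDelayMs(attempt, baseMs, maxMs));
    }
  }
}
```

The attempt limit and the delay cap are what make the backoff bounded: the app stops retrying after `maxAttempts` tries and never waits longer than `maxMs` between them.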

Add conditional provider fallback with one base URL change