Named Routers - OrcaRouter

OrcaRouter lets you save a routing strategy as a named router. Call it from your code as orcarouter/{name} and OrcaRouter resolves it to a concrete model at request time, based on the rules you configured. This is useful when you want to:

Swap routing behavior without redeploying your app (change the router in the dashboard; your code stays the same).
Let different teams or services choose their own routing policy independently of the application that calls the API.
Reference routing logic that’s too complex to inline in extra_body.

Using a router

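# Assumes client is an OpenAI-compatible SDK client already configured for OrcaRouter.
response = client.chat.completions.create(
    model="orcarouter/production-chat",  # orcarouter/{name} of a saved router
    messages=[...],
)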

To find out which concrete model a router resolved to, read the X-Orca-Router and X-Orca-Resolved-Model response headers — see Response Headers. The model field in the response body itself reflects whatever the upstream returned (often the bare upstream name, e.g. gpt-4o-mini-2024-07-18).
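A minimal sketch of reading those two headers, assuming you already hold the response headers as a mapping (with the OpenAI Python SDK, one way to get them is the with_raw_response accessor, whose result exposes .headers). The helper name resolved_route and the example header values are hypothetical; only the two header names come from this page.

```python
from typing import Mapping, Optional, Tuple


def resolved_route(headers: Mapping[str, str]) -> Tuple[Optional[str], Optional[str]]:
    """Return (router, resolved_model) from response headers.

    Hypothetical helper: only the two header names come from the docs.
    Missing headers yield None.
    """
    # HTTP header names are case-insensitive, so normalise keys before lookup.
    lowered = {key.lower(): value for key, value in headers.items()}
    return lowered.get("x-orca-router"), lowered.get("x-orca-resolved-model")


# A plain dict standing in for real response headers.
# The values below are made up for illustration.
example_headers = {
    "X-Orca-Router": "production-chat",
    "X-Orca-Resolved-Model": "openai/gpt-4o-mini",
    "Content-Type": "application/json",
}
router, model = resolved_route(example_headers)
print(router, model)
```

Logging both values next to each request makes it easier to explain later why two identical calls were served by different models.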

Creating a router

Routers are created in the dashboard under Routing. Each router has:

Name: the {name} in orcarouter/{name}. Must be unique within your workspace and use only lowercase letters, digits, _, and - (1-50 chars). The name orcarouter is reserved.
Allowed models: one or more glob patterns (comma- or newline-separated, case-insensitive) limiting which models this router can pick. Examples: openai/* on its own, or openai/*, anthropic/claude-haiku-* together. Leaving the field empty matches every model your account has access to.
Strategy: how to pick among matching models. See Strategies below.
Mundane models / Hard models: additional model lists used only by the Adaptive · Gated strategy. See Adaptive below.
Default model: a safety-net model used if the Allowed models patterns match nothing.
Enabled: turn this off to disable the router without deleting it.
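The name and Allowed models rules above can be checked client-side before you save a router. The sketch below is an illustration under stated assumptions, not OrcaRouter's implementation: the function names are hypothetical, and Python's fnmatch is used as a stand-in for the server's glob matching, which may treat characters such as ?, [ and / differently.

```python
import re
from fnmatch import fnmatchcase
from typing import List

# Documented rules: lowercase letters, digits, _ and -, 1-50 chars.
NAME_RE = re.compile(r"[a-z0-9_-]{1,50}")
RESERVED_NAMES = {"orcarouter"}


def is_valid_router_name(name: str) -> bool:
    """Check charset, length and the reserved name."""
    return NAME_RE.fullmatch(name) is not None and name not in RESERVED_NAMES


def parse_patterns(raw: str) -> List[str]:
    """Split a comma- or newline-separated string into lowercase patterns."""
    parts = re.split(r"[,\n]", raw)
    return [part.strip().lower() for part in parts if part.strip()]


def filter_allowed(models: List[str], raw_patterns: str) -> List[str]:
    """Keep models matching any pattern; an empty pattern list keeps everything."""
    patterns = parse_patterns(raw_patterns)
    if not patterns:
        return list(models)
    # Lowercase both sides so matching is case-insensitive, as documented.
    return [
        model for model in models
        if any(fnmatchcase(model.lower(), pattern) for pattern in patterns)
    ]


# Made-up model names, for illustration only.
models = ["openai/gpt-4o-mini", "anthropic/claude-haiku-x", "anthropic/claude-opus-x"]
print(is_valid_router_name("production-chat"))
print(filter_allowed(models, "openai/*, anthropic/claude-haiku-*"))
```

Treat a local check like this as a convenience only; the dashboard and API remain the authority on what is accepted.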

Strategies

The editor exposes four strategy cards. Adaptive bundles two backend sub-modes, for five total enum values you can persist via the API.

Cheapest

Picks the model with the lowest per-token price among live candidates. Default for the seeded orcarouter/auto router. Best when you want the cheapest live chat model on every request and don’t care about output-style consistency across calls.

Quality

Picks the model with the highest quality score among live candidates, regardless of price. Best when output quality dominates cost.

Balanced

Picks a low-cost option that still meets a quality bar; if nothing meets the bar, falls back to the highest-quality option. Default for new routers you create yourself. Runs without per-router tuning.
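To make the difference between Cheapest, Quality, and Balanced concrete, here is a small sketch of the three selection rules as described above. Everything in it is an assumption for illustration: the Candidate fields, the prices and scores, the quality_bar parameter, and the reading of Balanced as "cheapest model at or above the bar". OrcaRouter's actual scoring and thresholds are not documented here.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Candidate:
    name: str
    price: float    # hypothetical per-token price, any consistent unit
    quality: float  # hypothetical quality score, higher is better


def pick_cheapest(candidates: List[Candidate]) -> Candidate:
    """Cheapest: lowest price among live candidates."""
    return min(candidates, key=lambda c: c.price)


def pick_quality(candidates: List[Candidate]) -> Candidate:
    """Quality: highest quality score, regardless of price."""
    return max(candidates, key=lambda c: c.quality)


def pick_balanced(candidates: List[Candidate], quality_bar: float) -> Candidate:
    """Balanced: cheapest candidate meeting the bar, else the best available."""
    good_enough = [c for c in candidates if c.quality >= quality_bar]
    if good_enough:
        return pick_cheapest(good_enough)
    # Nothing meets the bar, so fall back to the highest-quality option.
    return pick_quality(candidates)


# Made-up candidates, for illustration only.
live = [
    Candidate("small", price=1.0, quality=0.60),
    Candidate("medium", price=3.0, quality=0.80),
    Candidate("large", price=10.0, quality=0.95),
]
print(pick_cheapest(live).name)
print(pick_quality(live).name)
print(pick_balanced(live, quality_bar=0.75).name)
```

With these made-up numbers the three strategies each land on a different model, which is the practical reason to pick the strategy deliberately rather than rely on the default.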

Adaptive

A per-router LinUCB contextual bandit that learns from your real production traffic. Weighs quality, cost, latency, and reliability per request to pick the best model. New routers behave like Balanced during a short cold-start period (a per-model warm-up) before the bandit starts steering picks — that’s expected, not a bug. Two sub-modes:

Standard (API enum: linucb) — considers every Allowed model for each request. Best when traffic is roughly uniform and you want the router to find the best option across your full list.
Gated (API enum: gated_adaptive) — requests are first classified as mundane or hard; mundane requests draw from a smaller Mundane models pool, hard requests from a stronger Hard models pool, and mid-difficulty requests from the full Allowed list. Best when your traffic mixes simple and complex calls. Each pool is intersected with Allowed models; empty or non-overlapping pools quietly fall back to the full Allowed list, so requests are never starved. Configure the two pools (weak_pool and strong_pool at the API level — up to 2000 chars each) in the editor when you pick Gated.
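The pool selection and fallback behaviour described for Gated can be sketched as follows. This is an illustration under assumptions, not the service's code: the difficulty labels ("mundane", "hard", anything else treated as mid-difficulty) and the function names are hypothetical, the classifier itself is out of scope, and the pools are taken as already-expanded lists of model names rather than the raw weak_pool and strong_pool strings.

```python
from typing import List


def intersect(pool: List[str], allowed: List[str]) -> List[str]:
    """Models that are in both lists, keeping the order of the allowed list."""
    wanted = set(pool)
    return [model for model in allowed if model in wanted]


def candidate_pool(difficulty: str, allowed: List[str],
                   weak_pool: List[str], strong_pool: List[str]) -> List[str]:
    """Pick the candidate list for one request in a Gated-style router."""
    if difficulty == "mundane":
        pool = intersect(weak_pool, allowed)
    elif difficulty == "hard":
        pool = intersect(strong_pool, allowed)
    else:
        # Mid-difficulty requests draw from the full Allowed list.
        pool = []
    # Empty or non-overlapping pools fall back to the full Allowed list.
    return pool or list(allowed)


# Made-up model names, for illustration only.
allowed = ["a/small", "a/large", "b/medium"]
print(candidate_pool("mundane", allowed, ["a/small"], ["a/large"]))
print(candidate_pool("hard", allowed, ["a/small"], ["z/not-allowed"]))
```

The second call shows the fallback: the strong pool has no overlap with the allowed list, so the request still gets the full allowed list instead of an empty candidate set.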

Seeded router: orcarouter/auto

Every OrcaRouter account is seeded with a default router called auto on signup — see Auto Router. You can use it immediately without any configuration.
