orcarouter/{name} and OrcaRouter resolves it to a
concrete model at request time, based on the rules you configured.
This is useful when you want to:
- Swap routing behavior without redeploying your app (change the router in the dashboard; your code stays the same).
- Let different teams or services choose their own routing policy independently of the application that calls the API.
- Reference routing logic that’s too complex to inline in
extra_body.
Using a router
X-Orca-Router and X-Orca-Resolved-Model response headers — see
Response Headers. The model field in
the response body itself reflects whatever the upstream returned (often
the bare upstream name, e.g. gpt-4o-mini-2024-07-18).
Creating a router
Routers are created in the dashboard under Routing. Each router has:- Name — the
{name}inorcarouter/{name}. Must be unique within your workspace; lowercase letters, digits,_, and-(1-50 chars). The nameorcarouteris reserved. - Allowed models — one or more glob patterns (comma- or
newline-separated, case-insensitive) limiting which models this
router can pick. Examples:
openai/*oropenai/*, anthropic/claude-haiku-*. Empty matches every model your account has access to. - Strategy — how to pick among matching models. See Strategies below.
- Mundane models / Hard models — additional model lists used only by the Adaptive · Gated strategy. See Adaptive below.
- Default model — a safety-net model used if the pattern resolves to nothing.
- Enabled — disable the router without deleting it.
Strategies
The editor exposes four strategy cards. Adaptive bundles two backend sub-modes, for five total enum values you can persist via the API.Cheapest
Picks the model with the lowest per-token price among live candidates. Default for the seededorcarouter/auto router. Best when you want the
cheapest live chat model on every request and don’t care about
output-style consistency across calls.
Quality
Picks the model with the highest quality score among live candidates, regardless of price. Best when output quality dominates cost.Balanced
Picks a low-cost option that still meets a quality bar; if nothing meets the bar, falls back to the highest-quality option. Default for new routers you create yourself. Runs without per-router tuning.Adaptive
A per-router LinUCB contextual bandit that learns from your real production traffic. Weighs quality, cost, latency, and reliability per request to pick the best model. New routers behave like Balanced during a short cold-start period (a per-model warm-up) before the bandit starts steering picks — that’s expected, not a bug. Two sub-modes:- Standard (API enum:
linucb) — considers every Allowed model for each request. Best when traffic is roughly uniform and you want the router to find the best option across your full list. - Gated (API enum:
gated_adaptive) — requests are first classified as mundane or hard; mundane requests draw from a smaller Mundane models pool, hard requests from a stronger Hard models pool, and mid-difficulty requests from the full Allowed list. Best when your traffic mixes simple and complex calls. Each pool is intersected with Allowed models; empty or non-overlapping pools quietly fall back to the full Allowed list, so requests are never starved. Configure the two pools (weak_poolandstrong_poolat the API level — up to 2000 chars each) in the editor when you pick Gated.
Seeded router: orcarouter/auto
Every OrcaRouter account is seeded with a default router called auto
on signup — see Auto Router. You can use
it immediately without any configuration.