One API, every model

OrcaRouter gives you a single OpenAI-compatible endpoint and a single API key for 200+ models across OpenAI, Anthropic, Google, Together, Groq, Mistral, xAI, and more. You don’t juggle per-provider SDKs.

Zero markup, ever

OrcaRouter routes your tokens directly to the upstream provider and charges you what the provider charges. We don’t add a per-token cut. Our revenue comes from optional paid subscription plans and (in the future) opt-in features we will disclose separately.
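The pricing model is simple pass-through arithmetic. As a sketch (the per-million-token prices below are made up for illustration, not real rates):

```python
# Hypothetical provider list prices, in USD per million tokens.
# Zero markup means OrcaRouter bills exactly these numbers.
PRICE_PER_MTOK = {"input": 3.00, "output": 15.00}

def cost_usd(prompt_tokens: int, completion_tokens: int,
             price: dict = PRICE_PER_MTOK) -> float:
    """What the provider charges is what you pay -- no per-token cut."""
    return (prompt_tokens * price["input"]
            + completion_tokens * price["output"]) / 1_000_000

print(cost_usd(1_000_000, 0))  # one million input tokens at the provider's rate
```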

OpenAI-compatible by design

Every request we can serve, we serve in the OpenAI JSON shape. Change one line (base_url) and your existing OpenAI SDK code works. Streaming, tools, JSON mode, vision, and reasoning all pass through.
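Concretely, the request you send is byte-for-byte the OpenAI chat-completions shape; only the endpoint and the key change. A minimal sketch using only the standard library (the base URL shown is an assumption for illustration):

```python
import json

ORCAROUTER_BASE = "https://api.orcarouter.ai/v1"  # assumed endpoint

def chat_request(base_url: str, api_key: str, payload: dict):
    """Build the URL, headers, and body of an OpenAI-shaped chat call."""
    url = f"{base_url}/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    return url, headers, json.dumps(payload)

# The payload is unchanged from a direct OpenAI call.
url, headers, body = chat_request(
    ORCAROUTER_BASE,
    "ORCA_KEY",
    {"model": "gpt-4o",
     "messages": [{"role": "user", "content": "hi"}],
     "stream": True},
)
```

With the official OpenAI SDK, the equivalent change is passing `base_url` when constructing the client.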

Unknown fields are forwarded

OrcaRouter reads every request and response (to compute token usage and bill correctly) but does not interpret fields we don’t own. If a provider adds a new parameter tomorrow — say cache_control, reasoning.budget, or a custom safety flag — you can send it today by putting it in your request. We forward it unchanged. New upstream capabilities become available the moment the provider supports them, without waiting for an OrcaRouter release. We only intercept:
  1. model — to resolve orcarouter/{name} routers and extra_body.models fallback chains.
  2. Authorization header — we swap your OrcaRouter key for the upstream provider’s key before forwarding.
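The two interception points above can be sketched as one proxy step. This is a simplified illustration (the router table, provider key, and fallback handling are hypothetical, and `extra_body.models` chains are ignored); the point is that every field we don't own rides along untouched:

```python
ROUTERS = {"orcarouter/auto": "gpt-4o"}         # hypothetical router table
UPSTREAM_KEY = "sk-upstream-demo"               # provider key held server-side

def forward(body: dict, headers: dict):
    """Touch only `model` and `Authorization`; pass everything else through."""
    out = dict(body)                            # copy; unknown fields survive as-is
    model = out.get("model", "")
    if model.startswith("orcarouter/"):
        out["model"] = ROUTERS[model]           # resolve router name to a model
    fwd_headers = dict(headers)
    # Swap the caller's OrcaRouter key for the upstream provider's key.
    fwd_headers["Authorization"] = f"Bearer {UPSTREAM_KEY}"
    return out, fwd_headers

# A request carrying a field OrcaRouter has never heard of:
out_body, out_headers = forward(
    {"model": "orcarouter/auto", "cache_control": {"type": "ephemeral"}},
    {"Authorization": "Bearer ORCA_KEY"},
)
```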

We don’t expose which provider served your request

From a caller’s point of view, OrcaRouter is a single provider. We don’t return provider, routed_to, or any identifier that tells you which upstream served a given call. That’s intentional — it gives us room to rebalance traffic for cost and reliability without changing your contract. We do advertise on our marketing pages that we route directly to OpenAI, Anthropic, Google, and others. That’s a capability claim. Per-request provider identity is internal.

No prompt logging

We don’t persist the content of your prompts or model outputs. Our servers handle both in transit — forwarding to the destination provider and returning the response to you — and then discard them. Token counts, latency, and request metadata are kept for billing and abuse prevention. See Zero Data Retention for the full policy.
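The streaming path can be pictured as a generator that yields each chunk straight through while keeping only counts. A sketch (real billing uses the provider’s reported usage, not a chunk tally):

```python
def relay(upstream_chunks, usage_log: list):
    """Stream chunks to the caller; retain metadata only, never content."""
    n = 0
    for chunk in upstream_chunks:
        n += 1
        yield chunk                      # forwarded, not stored
    usage_log.append({"chunks": n})      # counts survive; text does not

log = []
echoed = list(relay(iter(["Hel", "lo"]), log))
```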

Self-hostable

OrcaRouter is designed to be deployed on your own infrastructure. The docs you’re reading are hosted at docs.orcarouter.ai, but every bit of routing logic runs on whatever server you deploy the backend to.