Everything here binds to your workspace and is configured from the
console. Your app keeps calling
https://api.orcarouter.ai/v1/chat/completions with each tenant’s
sk-orca-... key — only the policy in the gateway changes. Config
actions need the roles called out per step; only /v1/* relay calls use
a tenant key.
1. The multi-tenant AI security model
A multi-tenant gateway has a different threat shape than a single app. The risks that matter scale with the number of tenants:
Key leakage = one tenant's blast radius
A leaked tenant key shouldn’t be able to drain your account, call
models you never exposed, or reach beyond that tenant’s budget.
Cross-tenant data bleed
One tenant’s PII landing in shared logs, or in a response routed to
another tenant, breaks your data-isolation promise.
A noisy tenant agent
One tenant’s agent looping on a tool or fetching arbitrary hosts
shouldn’t degrade the gateway for everyone else.
Per-tenant compliance
A regulated tenant may need PII masking and data-residency the rest of
your tenants don’t.
2. The baseline: one workspace policy every tenant inherits
Author your security posture once at the workspace level so every tenant key inherits it by default, with no per-tenant duplication.
A default guardrail
In Guardrails → New guardrail, author one named policy (e.g.
tenant-baseline) and mark it the workspace default (is_default).
Add a PII rule, stage input, action mask, so no tenant’s
request carries raw PII upstream.
Any tenant key with no explicit guardrail attachment falls back to
this default. Authoring a guardrail needs the Developer role.
A default firewall policy
If your tenants run agents, do the same on the action plane: in
Firewall → Policies, author a default policy or, faster, open
Firewall → Posture and apply the balanced autonomy level. That audits
every tenant’s tool calls and flags PII workspace-wide while denying the
most destructive actions, so you watch real tenant behavior before
broadly enforcing. This step needs the Developer role.
3. One scoped key per tenant
This is the core of tenant isolation: never share a key across tenants, and never hand a tenant your account-wide key. Mint one key per tenant, scoped to exactly what that tenant may do. In API Keys → New key, set:
Cap the spend (denial-of-wallet boundary)
Set credit_limit_usd to that tenant’s ceiling (0 = unlimited). This
is the single most important multi-tenant control: a leaked or abused
tenant key can only ever burn that tenant’s budget, never your account.
See denial-of-wallet.
Pin the models
Turn on model_limits (model_limits_enabled) and list only the
model(s) that tenant’s plan includes, so a leaked key can’t run an
expensive model the tenant never paid for.
Label the environment / tenant
Set environment (a free-form deployment label, e.g. prod /
staging) so a tenant’s traffic is attributable in your logs and you
can tell production keys from test ones at a glance.
Lock down origin & lifetime
Set allow_ips to that tenant’s backend egress IPs if it calls from a
fixed server, and expired_time for trial or time-boxed tenants
(-1 = never expires).
Every new key picks up the tenant-baseline guardrail and the
default firewall policy automatically: you minted a scoped key, and it’s
already governed. The key is masked on display after creation, so copy
it once when you provision the tenant.
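To make the scoping semantics above concrete, here is a minimal sketch of how the four key controls could combine into one allow/deny decision. The field names (credit_limit_usd, model_limits_enabled, model_limits, allow_ips, expired_time) and the sentinel values (0 = unlimited, -1 = never expires) come from this section; the TenantKey class, the check_request function, the CIDR form of allow_ips, the Unix-seconds reading of expired_time, and the order of the checks are assumptions made for illustration, not the gateway's actual implementation.

```python
from dataclasses import dataclass, field
from ipaddress import ip_address, ip_network
from typing import List, Tuple


@dataclass
class TenantKey:
    """Hypothetical container for the scope fields named in this section."""
    credit_limit_usd: float = 0.0       # 0 = unlimited
    model_limits_enabled: bool = False
    model_limits: List[str] = field(default_factory=list)
    allow_ips: List[str] = field(default_factory=list)  # empty = any origin (assumption)
    expired_time: int = -1              # assumed Unix seconds; -1 = never expires
    environment: str = "prod"           # free-form label, not used for enforcement


def check_request(key: TenantKey, model: str, client_ip: str,
                  spent_usd: float, now: int) -> Tuple[bool, str]:
    """Return (allowed, reason) for one request under the key's scope."""
    if key.expired_time != -1 and now >= key.expired_time:
        return False, "key expired"
    if key.allow_ips and not any(
        ip_address(client_ip) in ip_network(entry, strict=False)
        for entry in key.allow_ips
    ):
        return False, "origin not allowed"
    if key.model_limits_enabled and model not in key.model_limits:
        return False, "model not in plan"
    if key.credit_limit_usd > 0 and spent_usd >= key.credit_limit_usd:
        return False, "credit limit reached"
    return True, "ok"


# Example: a time-boxed free-tier key pinned to one model and one egress range.
free_tier = TenantKey(
    credit_limit_usd=5.0,
    model_limits_enabled=True,
    model_limits=["openai/gpt-4o-mini"],
    allow_ips=["203.0.113.0/24"],       # documentation-only address range
    expired_time=1_900_000_000,
)
```

Under these assumptions, a leaked free_tier key used from another network, against another model, or past its budget is refused before any spend accrues, which is the blast-radius property this section describes.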
4. Per-tenant overrides — tighten one without touching the rest
Most tenants ride the baseline. When one needs more (a regulated tenant, an enterprise tier, a tenant on a probation list), attach a stricter named policy to that key only:

| Set on the key | Effect for that one tenant |
|---|---|
| guardrail_id | Swaps in a stricter named guardrail (e.g. block-on-PII). |
| firewall_policy_id | Swaps in a tighter firewall policy (e.g. default-deny tools). |
Guardrails: explicit attachment is the off switch
An explicit guardrail_id (when it exists and is enabled) always
applies and never silently falls back. If that attached guardrail is
disabled, the key gets no guardrail; it does not drop to the
workspace default. Leave guardrail_id unset (0/null) to inherit the
tenant-baseline default.
Firewall: a disabled attachment falls back
An attached firewall_policy_id applies when it exists and is enabled;
if that policy is disabled, the key falls back to the workspace
default firewall policy. (This is the opposite of the guardrail
off-switch behavior, by design.)
5. A concrete two-tier example
Say you run a free tier and a regulated enterprise tier on one workspace:

- Workspace baseline: tenant-baseline guardrail (PII mask on input, block on card/SSN) as is_default, plus the balanced firewall autonomy level. Every tenant inherits this.
- Free-tier tenant key: no guardrail_id (inherits the baseline), model_limits pinned to openai/gpt-4o-mini, a low credit_limit_usd.
- Enterprise-tenant key: guardrail_id set to a stricter enterprise-pii guardrail (PII block, not mask, on input; output-stage secrets block), a firewall_policy_id with a tighter tool allow-list, a higher credit cap, and allow_ips pinned to their backend.
Both tenants call the same /v1/chat/completions endpoint with their own
key. The gateway resolves the right policy per key; your application code
is identical for every tenant.
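The per-key resolution described in section 4 can be sketched as two small functions, applied here to the free-tier and enterprise keys from this example. This is an illustration under stated assumptions, not the gateway's code: the Policy class, the function names, and the numeric IDs are hypothetical, and the handling of an attachment ID that does not exist at all is a guess, since only the disabled case is specified.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Policy:
    """Hypothetical stand-in for a named guardrail or firewall policy."""
    name: str
    enabled: bool = True


def resolve_guardrail(guardrail_id: Optional[int],
                      guardrails: Dict[int, Policy],
                      workspace_default: Policy) -> Optional[Policy]:
    """Unset (0/None) inherits the default; a disabled attachment means none."""
    if not guardrail_id:
        return workspace_default
    attached = guardrails.get(guardrail_id)
    if attached is not None and attached.enabled:
        return attached
    # Disabled attachment: no guardrail, and no fallback to the default.
    # A missing ID is treated the same way here (assumption).
    return None


def resolve_firewall(firewall_policy_id: Optional[int],
                     policies: Dict[int, Policy],
                     workspace_default: Policy) -> Policy:
    """An enabled attachment applies; otherwise fall back to the default."""
    attached = policies.get(firewall_policy_id) if firewall_policy_id else None
    if attached is not None and attached.enabled:
        return attached
    # Disabled attachment falls back; a missing ID does too (assumption).
    return workspace_default


# Hypothetical IDs for the two-tier example above.
baseline_guardrail = Policy("tenant-baseline")
baseline_firewall = Policy("balanced")
guardrails = {7: Policy("enterprise-pii")}
firewall_policies = {3: Policy("enterprise-tools")}

free_guardrail = resolve_guardrail(None, guardrails, baseline_guardrail)
enterprise_guardrail = resolve_guardrail(7, guardrails, baseline_guardrail)
enterprise_firewall = resolve_firewall(3, firewall_policies, baseline_firewall)
```

The asymmetry is the point to keep in mind when disabling a policy: disabling an attached guardrail leaves that tenant with no guardrail at all, while disabling an attached firewall policy returns the tenant to the workspace default.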
6. Per-tenant compliance & residency
A regulated tenant often needs an attestation the rest don’t. Compliance runs as a workspace peer of guardrails and firewall:

- Browsing the framework catalog and readiness is open to any Member and free. Confirm coverage for the framework a tenant asks about (soc2, hipaa, gdpr, iso_27001, pci_dss, and more).
- Installing a pack (POST /api/compliance/packs/:key/install) materializes the matching guardrails and firewall policies into your workspace; it requires workspace Admin and a paid plan.
- Data residency pins the region of your compliance report artifact (us / eu / uk / ap / cn / global) via PUT /api/compliance/residency (Admin). Cross-region reads are withheld.
Residency here governs the compliance report artifact, not
inference-data geo-pinning. For the request-log story: logs retain for a
default of 30 days (hard-capped at 180), and a user self-deletion runs a
30-day grace then a PII scrub that cascades to that user’s guardrail
matches and request logs.
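The retention numbers above reduce to simple arithmetic, sketched below. The 30-day default, the 180-day hard cap, and the 30-day deletion grace come from the text; the function names, the idea that a workspace requests a retention value, and the rejection of values below one day are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

DEFAULT_RETENTION_DAYS = 30   # default request-log retention
MAX_RETENTION_DAYS = 180      # hard cap
DELETION_GRACE_DAYS = 30      # grace period before the PII scrub


def effective_retention_days(requested: Optional[int] = None) -> int:
    """Apply the default when nothing is requested, then clamp to the cap."""
    if requested is None:
        return DEFAULT_RETENTION_DAYS
    if requested < 1:
        raise ValueError("retention must be at least one day (assumption)")
    return min(requested, MAX_RETENTION_DAYS)


def scrub_due_at(deletion_requested_at: datetime) -> datetime:
    """Earliest moment the PII scrub runs after a user self-deletion."""
    return deletion_requested_at + timedelta(days=DELETION_GRACE_DAYS)


requested_on = datetime(2025, 1, 1, tzinfo=timezone.utc)
scrub_on = scrub_due_at(requested_on)   # 2025-01-31 00:00 UTC
```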
7. Watch every tenant from one workspace
All observability is workspace-scoped, so one set of feeds covers all your tenants, filterable down to a single one:

- Guardrails → Matches (any Member): every rule that fired across all tenants, with type, action, stage, and detail. The matched substring is recorded only if Log raw content is on for that guardrail (off by default, the privacy-conservative posture, which matters most in multi-tenant). Mark a false positive to tune (Admin).
- Firewall → Events / Runs (Developer+): every tool call, rolled up per agent run, so a noisy tenant’s loop or a novel egress stands out.
- Anomaly feed (Member): rate/cost spikes scored against a learned hour-of-week baseline catch one tenant burning out of pattern even when each call is individually allowed.
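To show what scoring against an hour-of-week baseline means in practice, here is a minimal sketch: hourly cost totals are bucketed into the 168 hours of a week, and a new observation is scored as a z-score against its own bucket. The source does not say how the gateway scores anomalies, so the z-score, the function names, and the min_std floor are all assumptions chosen for illustration.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean, pstdev
from typing import Dict, Iterable, Tuple


def hour_of_week(ts: datetime) -> int:
    """Bucket 0..167: Monday 00:00 is 0, Sunday 23:00 is 167."""
    return ts.weekday() * 24 + ts.hour


def learn_baseline(
    history: Iterable[Tuple[datetime, float]]
) -> Dict[int, Tuple[float, float]]:
    """Map each seen hour-of-week bucket to (mean, std) of hourly cost."""
    buckets = defaultdict(list)
    for ts, cost_usd in history:
        buckets[hour_of_week(ts)].append(cost_usd)
    return {b: (mean(v), pstdev(v)) for b, v in buckets.items()}


def anomaly_score(baseline: Dict[int, Tuple[float, float]],
                  ts: datetime, cost_usd: float,
                  min_std: float = 0.01) -> float:
    """Z-score of this hour's cost against the same hour-of-week bucket."""
    bucket = hour_of_week(ts)
    if bucket not in baseline:
        return 0.0  # no history for this bucket yet
    mu, sigma = baseline[bucket]
    # Floor the deviation so a perfectly flat history does not divide by zero.
    return (cost_usd - mu) / max(sigma, min_std)


# Four Mondays at 09:00 of roughly $1/hour for one tenant.
history = [
    (datetime(2024, 1, 1, 9), 1.0),
    (datetime(2024, 1, 8, 9), 1.2),
    (datetime(2024, 1, 15, 9), 0.8),
    (datetime(2024, 1, 22, 9), 1.0),
]
baseline = learn_baseline(history)
spike = anomaly_score(baseline, datetime(2024, 1, 29, 9), 5.0)
```

Comparing each hour to the same hour in earlier weeks is what lets a tenant with a busy Monday morning stay quiet in the feed while the same spend at 03:00 on a Sunday stands out.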
A blocked request (guardrail_blocked /
firewall_blocked) costs that tenant no quota and is marked
skip-retry: the boundary held without charging the tenant for the
rejection.
8. Where to go deeper
Scope keys, policies, workspaces
The full resolution order for key attachment and workspace defaults.
Guardrails reference
Every rule type, PII entity, and per-entity override in full.
Firewall reference
Verdicts, surfaces, autonomy levels, and the policy plane.
Stop data exfiltration
Lock down a tenant agent’s outbound egress.
