Secure a multi-tenant SaaS

You’re building a SaaS where many customer-tenants share one codebase and one OrcaRouter workspace. Each tenant sends prompts and runs agents through your gateway, and the hard problem is blast-radius: a leaked tenant key, a runaway tenant agent, or one tenant’s PII landing in another’s logs can’t be allowed to spill across the boundary. This recipe wires the three controls that make a shared gateway tenant-safe — a scoped key per tenant, workspace-level policy every tenant inherits, and per-tenant overrides where one tenant needs more — all from the console, with zero change to your application code.

Everything here binds to your workspace and is configured from the console. Your app keeps calling https://api.orcarouter.ai/v1/chat/completions with each tenant’s sk-orca-... key — only the policy in the gateway changes. Config actions need the roles called out per step; only /v1/* relay calls use a tenant key.

1. The multi-tenant ai security model

A multi-tenant gateway has a different threat shape than a single app. The risks that matter scale with the number of tenants:

Key leakage = one tenant's blast radius

A leaked tenant key shouldn’t be able to drain your account, call models you never exposed, or reach beyond that tenant’s budget.

Cross-tenant data bleed

One tenant’s PII landing in shared logs, or in a response routed to another tenant, breaks your data-isolation promise.

A noisy tenant agent

One tenant’s agent looping on a tool or fetching arbitrary hosts shouldn’t degrade the gateway for everyone else.

Per-tenant compliance

A regulated tenant may need PII masking and data-residency the rest of your tenants don’t.

The model below is two layers: a workspace baseline every tenant key inherits, plus per-key scope and overrides that tighten one tenant without touching the others. See scope keys, policies, workspaces for the full resolution rules.

2. The baseline: one workspace policy every tenant inherits

Author your security posture once at the workspace level so every tenant key inherits it by default — no per-tenant duplication.

A default guardrail

In Guardrails → New guardrail, author one named policy (e.g. tenant-baseline) and mark it the workspace default (is_default). Add a PII rule, stage input, action mask, so no tenant’s request carries raw PII upstream:

{
  "type": "pii",
  "stage": "input",
  "action": "mask",
  "entities": ["email", "phone", "credit_card", "ssn", "ip"],
  "entity_actions": { "credit_card": "block", "ssn": "block" }
}

Any tenant key with no explicit guardrail attachment falls back to this default. Authoring a guardrail needs the Developer role.

A default firewall policy

If your tenants run agents, do the same on the action plane: in Firewall → Policies, author a default policy or — faster — open Firewall → Posture and apply the balanced autonomy level. That audits every tenant’s tool calls and flags PII workspace-wide while denying the most destructive actions, so you watch real tenant behavior before broadly enforcing. Developer role.

Roll the baseline out in observe → shadow → enforce order so a new rule can’t break a tenant mid-flight. A firewall policy supports a per-policy shadow_mode flag (enforcing verdicts log as [shadow] would …); start guardrail rules at the flag action. See enforcement modes.

3. One scoped key per tenant

This is the core of tenant isolation: never share a key across tenants, and never hand a tenant your account-wide key. Mint one key per tenant, scoped to exactly what that tenant may do. In API Keys → New key, set:

Cap the spend (denial-of-wallet boundary)

Set credit_limit_usd to that tenant’s ceiling (0 = unlimited). This is the single most important multi-tenant control: a leaked or abused tenant key can only ever burn that tenant’s budget, never your account. See denial-of-wallet.

Pin the models

Turn on model_limits (model_limits_enabled) and list only the model(s) that tenant’s plan includes — so a leaked key can’t run an expensive model the tenant never paid for.

Label the environment / tenant

Set environment (a free-form deployment label, e.g. prod / staging) so a tenant’s traffic is attributable in your logs and you can tell production keys from test ones at a glance.

Lock down origin & lifetime

Set allow_ips to that tenant’s backend egress IPs if it calls from a fixed server, and expired_time for trial or time-boxed tenants (-1 = never expires).

Every tenant key inherits the workspace tenant-baseline guardrail and the default firewall policy automatically — you minted a scoped key, and it’s already governed. The key is masked on display after creation, so copy it once when you provision the tenant.

4. Per-tenant overrides — tighten one without touching the rest

Most tenants ride the baseline. When one needs more — a regulated tenant, an enterprise tier, a tenant on a probation list — attach a stricter named policy to that key only:

Set on the key	Effect for that one tenant
`guardrail_id`	Swaps in a stricter named guardrail (e.g. block-on-PII).
`firewall_policy_id`	Swaps in a tighter firewall policy (e.g. default-deny tools).

Resolution differs between the two planes — know the difference:

Guardrails: explicit attachment is the off switch

An explicit guardrail_id (when it exists and is enabled) always applies and never silently falls back. If that attached guardrail is disabled, the key gets no guardrail — it does not drop to the workspace default. Leave guardrail_id unset (0/null) to inherit the tenant-baseline default.

Firewall: a disabled attachment falls back

An attached firewall_policy_id applies when it exists and is enabled; if that policy is disabled, the key falls back to the workspace default firewall policy. (This is the opposite of the guardrail off-switch behavior — by design.)

Editing a named policy shifts every key attached to it on the next call. If multiple tenants share one stricter policy, an edit hits all of them at once. Use a distinct named policy per isolation class, not one giant shared policy, when tenants need genuinely different rules.

5. A concrete two-tier example

Say you run a free tier and a regulated enterprise tier on one workspace:

Workspace baseline — tenant-baseline guardrail (PII mask on input, block on card/SSN) as is_default, plus the balanced firewall autonomy level. Every tenant inherits this.
Free-tier tenant key — no guardrail_id (inherits the baseline), model_limits pinned to openai/gpt-4o-mini, a low credit_limit_usd.
Enterprise-tenant key — guardrail_id set to a stricter enterprise-pii guardrail (PII block, not mask, on input; output-stage secrets block), a firewall_policy_id with a tighter tool allow-list, a higher credit cap, and allow_ips pinned to their backend.

Both tiers call the same /v1/chat/completions endpoint with their own key. The gateway resolves the right policy per key — your application code is identical for every tenant.

6. Per-tenant compliance & residency

A regulated tenant often needs an attestation the rest don’t. Compliance runs as a workspace peer of guardrails and firewall:

Browsing the framework catalog and readiness is open to any Member and free — confirm coverage for the framework a tenant asks about (soc2, hipaa, gdpr, iso_27001, pci_dss, and more).
Installing a pack (POST /api/compliance/packs/:key/install) materializes the matching guardrails and firewall policies into your workspace; it requires workspace Admin and a paid plan.
Data residency pins the region of your compliance report artifact (us / eu / uk / ap / cn / global) via PUT /api/compliance/residency (Admin). Cross-region reads are withheld.

Residency here governs the compliance report artifact, not inference-data geo-pinning. For the request-log story: logs retain for a default of 30 days (hard-capped at 180), and a user self-deletion runs a 30-day grace then a PII scrub that cascades to that user’s guardrail matches and request logs.

For a full audited evidence run, see generate SOC 2 evidence and deploy for HIPAA.

7. Watch every tenant from one workspace

All observability is workspace-scoped, so one set of feeds covers all your tenants — filterable down to a single one:

Guardrails → Matches (any Member) — every rule that fired across all tenants: type, action, stage, detail. The matched substring is recorded only if Log raw content is on for that guardrail (off by default — the privacy-conservative posture, which matters most in multi-tenant). Mark a false positive to tune (Admin).
Firewall → Events / Runs (Developer+) — every tool call, rolled up per agent run, so a noisy tenant’s loop or a novel egress stands out.
Anomaly feed (Member) — rate/cost spikes scored against a learned hour-of-week baseline catch one tenant burning out of pattern even when each call is individually allowed.

A blocked request returns HTTP 400 (guardrail_blocked / firewall_blocked), costs that tenant no quota, and is marked skip-retry — the boundary held without charging the tenant for the rejection.

8. Where to go deeper

Scope keys, policies, workspaces

The full resolution order for key attachment and workspace defaults.

Guardrails reference

Every rule type, PII entity, and per-entity override in full.

Firewall reference

Verdicts, surfaces, autonomy levels, and the policy plane.

Stop data exfiltration

Lock down a tenant agent’s outbound egress.

​1. The multi-tenant ai security model

Key leakage = one tenant's blast radius

Cross-tenant data bleed

A noisy tenant agent

Per-tenant compliance

​2. The baseline: one workspace policy every tenant inherits

​3. One scoped key per tenant

​4. Per-tenant overrides — tighten one without touching the rest

​5. A concrete two-tier example

​6. Per-tenant compliance & residency

​7. Watch every tenant from one workspace

​8. Where to go deeper

Scope keys, policies, workspaces

Guardrails reference

Firewall reference

Stop data exfiltration

1. The multi-tenant ai security model

2. The baseline: one workspace policy every tenant inherits

3. One scoped key per tenant

4. Per-tenant overrides — tighten one without touching the rest

5. A concrete two-tier example

6. Per-tenant compliance & residency

7. Watch every tenant from one workspace

8. Where to go deeper