This is the fastest path from an unguarded agent to a zero-trust posture. You apply one switch, keep calling the gateway exactly as before, watch what your agent actually does, and then tighten. No rules to author, no SDK change.
Applying a security posture changes a workspace setting, so steps 2 and 5 need the Developer role. The Guardrails Matches feed (step 4) is open to any member; the Firewall Events feed also requires the Developer role.

Turn it on in 5 steps

1

Get an API key

If you don’t have one yet, create a key — see Get an API key. Give this key to the agent you want to secure. Everything below binds to your workspace, so the same posture covers every key in it.
2

Apply the Secure Agents baseline

In the console, open Firewall → Posture and apply the balanced autonomy level (Developer role). In one transaction this sets both your Firewall and Guardrails posture: tool calls are audited and PII is flagged, while the most destructive actions (such as destructive shell calls) are denied, so you can watch before you broadly enforce. It’s a single switch with one-click undo. (For a pass that blocks nothing at all, start at permissive.)
3

Send a request exactly as before

Nothing about your call changes. Use the same key, the same OpenAI shape:
curl https://api.orcarouter.ai/v1/chat/completions \
  -H "Authorization: Bearer sk-orca-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o-mini",
    "messages": [
      {"role": "user", "content": "Summarize my notes and email me at jane@acme.com"}
    ]
  }'
The request goes through. Under balanced it isn’t blocked; it’s observed. The email address in the prompt is flagged as PII, and any tool calls your agent makes are recorded.
4

See what your agent actually did

Two feeds, both workspace-scoped:
  • Firewall → Events / Runs — every tool call your agent made, its verdict, and which surface it hit (the tool it advertised, the call the model emitted, an MCP dispatch, or an outbound destination).
  • Guardrails → Matches — every rule that fired, like the flagged email, grouped by guardrail and action.
This is the payoff of observing first: you see your agent’s real behavior before any rule can break it.
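To give the Firewall feed something to record, a request can advertise a tool. The following is a minimal sketch, assuming the gateway accepts the standard OpenAI `tools` array on the same endpoint. The `send_email` tool, its parameters, and the `ORCA_API_KEY` environment variable are hypothetical names used only for illustration.

```shell
# Request body in the OpenAI chat-completions shape, with one advertised tool.
# "send_email" and its parameters are hypothetical, not part of the gateway.
payload=$(cat <<'EOF'
{
  "model": "openai/gpt-4o-mini",
  "messages": [
    {"role": "user", "content": "Summarize my notes and email me at jane@acme.com"}
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "send_email",
        "description": "Send an email to a recipient",
        "parameters": {
          "type": "object",
          "properties": {
            "to": {"type": "string"},
            "body": {"type": "string"}
          },
          "required": ["to", "body"]
        }
      }
    }
  ]
}
EOF
)

# Defined but not called here, so running this block sends nothing.
# ORCA_API_KEY is an assumed environment variable holding your key.
send_tool_request() {
  curl -s https://api.orcarouter.ai/v1/chat/completions \
    -H "Authorization: Bearer ${ORCA_API_KEY}" \
    -H "Content-Type: application/json" \
    -d "$payload"
}
```

Calling `send_tool_request` sends the request. If the model chooses to emit a call to the advertised tool, that call should then be visible under Firewall → Events / Runs.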
5

Tighten to enforce

Once the feeds look right, switch the autonomy level to tight on the same Firewall → Posture page (Developer role). Now enforcement is live: PII is masked before the model sees it, secrets are blocked from your requests, and destructive shell calls and SSRF egress are denied. A denied tool call comes back as HTTP 400 firewall_blocked; a blocked prompt comes back as HTTP 400 guardrail_blocked. A block costs you no quota. No application change is needed: the very next request is governed.
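Client code may want to tell the two kinds of block apart. The following is a minimal sketch, assuming the code (`firewall_blocked` or `guardrail_blocked`) appears verbatim somewhere in the body of the HTTP 400 response. The exact response schema is not specified here, and `classify_response` is a hypothetical helper name.

```shell
# Hypothetical helper: classify a gateway response from its HTTP status and raw body.
# Assumption: the block code appears as a literal substring of the 400 response body.
classify_response() {
  status="$1"
  body="$2"
  # Only HTTP 400 responses are treated as possible blocks.
  if [ "$status" != "400" ]; then
    echo "not_blocked"
    return 0
  fi
  case "$body" in
    *firewall_blocked*)  echo "firewall_blocked" ;;   # a tool call was denied
    *guardrail_blocked*) echo "guardrail_blocked" ;;  # the prompt was blocked
    *)                   echo "other_400" ;;          # some other client error
  esac
}

# Example wiring (commented out so the block makes no network call):
# status=$(curl -s -o /tmp/resp.json -w '%{http_code}' \
#   https://api.orcarouter.ai/v1/chat/completions \
#   -H "Authorization: Bearer sk-orca-..." \
#   -H "Content-Type: application/json" \
#   -d @request.json)
# classify_response "$status" "$(cat /tmp/resp.json)"
```

Matching on a substring keeps the sketch independent of the response schema. Once you have confirmed the actual error format, a JSON-aware check on the specific field would be more precise.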
That’s zero trust on: every prompt and response screened, every tool call and routed outbound request governed, every decision logged.

What you just turned on

| Layer | Under balanced | Under tight |
| --- | --- | --- |
| Guardrails (text) | PII flagged (audit-only) | PII masked, secrets blocked |
| Firewall (actions) | Audited; destructive shell denied | Default-deny; destructive shell + fetch-shaped tools (SSRF) denied |
| Visibility | Full (Events + Matches) | Full (Events + Matches) |

Made it too strict?

Every autonomy change is one transaction with one-click undo, so you can roll straight back to your previous posture from the Firewall page (or the undo API). You can also just re-apply a softer level (balanced or permissive) at any time.

Next steps

The Secure Agents baseline

What each autonomy level sets, and how to simulate before applying.

Enforcement modes

Observe → shadow → enforce, the safe rollout in detail.

Guardrails

Author your own content rules beyond the baseline.

Agent Firewall

Author tool allow-lists, argument checks, and egress rules.