deny on
shell.exec, an egress allow-list — and you believe it’s right. But
flipping it on against production agent traffic is a leap of faith: one
over-broad rule and you’re blocking calls your agents legitimately make.
Firewall shadow mode is the safe-rollout switch. It’s a per-policy
flag that tells the gateway to evaluate the policy exactly as it would in
production, log everything, but block nothing. Every enforcing verdict is
downgraded to audit, and the event reason is prefixed [shadow] would …
so you can read off precisely what the policy would have done — without
it having done anything yet.
Shadow mode is a flag on the policy, set in the console (or the
/api/workspace/firewall/policies management routes, which use your
session / access token — not a relay sk-orca-… key). Toggling it is a
Developer+ action. Your agent’s /v1/* relay calls don’t change.
1. What firewall shadow mode does
When a policy’s shadow_mode flag is on, the gateway runs the full
evaluation — resolves the policy, walks the rules in priority order, picks
a verdict — and then, right before the verdict takes effect, downgrades
anything that would have changed the call:
| Resolved verdict | Under shadow mode |
|---|---|
| deny | → audit, reason [shadow] would deny — … |
| sanitize | → audit, reason [shadow] would sanitize — … |
| pending_approval | → audit, reason [shadow] would pending_approval — … |
| allow / audit | unchanged (already non-blocking) |
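The evaluate-then-downgrade flow in the table above can be sketched in Python. This is a minimal illustration, not the gateway's implementation: the Rule and Policy classes, the lower-number-first priority ordering, glob matching on the tool name, and the "default verdict" reason text are all assumptions made for the example.

```python
from dataclasses import dataclass
from fnmatch import fnmatch
from typing import List, Tuple

# Verdicts that would change the call, per the table above.
ENFORCING = {"deny", "sanitize", "pending_approval"}


@dataclass
class Rule:  # hypothetical shape, for illustration only
    priority: int   # assumption: lower number is evaluated first
    tool_glob: str  # assumption: rules match on a tool-name glob
    verdict: str
    reason: str


@dataclass
class Policy:  # hypothetical shape, for illustration only
    rules: List[Rule]
    default_verdict: str = "allow"
    shadow_mode: bool = False


def evaluate(policy: Policy, tool: str) -> Tuple[str, str]:
    """Resolve a verdict, then apply the shadow downgrade last."""
    verdict, reason = policy.default_verdict, "default verdict"
    for rule in sorted(policy.rules, key=lambda r: r.priority):
        if fnmatch(tool, rule.tool_glob):
            verdict, reason = rule.verdict, rule.reason
            break  # first matching rule wins
    if policy.shadow_mode and verdict in ENFORCING:
        return "audit", f"[shadow] would {verdict} — {reason}"
    return verdict, reason
```

With shadow_mode set, a deny rule on shell.* yields an audit verdict whose reason starts with [shadow] would deny. With it cleared, the same call resolves to a plain deny. A policy with no rules and a deny default behaves the same way, which is the stacked default-deny rollout in miniature.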
2. One concrete rollout
Say you have a policy prod-agents with a deny rule on destructive shell
commands, and you want to confirm it won’t trip on anything legitimate.
Turn shadow mode on
In Security → Firewall → Policies, open prod-agents, toggle
Shadow mode on, and save. The policy keeps its attachment and its
rules; it just stops enforcing.
Let real traffic flow
Your agents keep calling the gateway exactly as before. Every tool call
is evaluated; nothing is blocked. Give it a representative window —
long enough to cover your real tool mix.
Read the would-be denials
Open Events and filter for the [shadow] reason. Each row shows the tool,
the surface, the run, and the rule that matched, so a
[shadow] would deny — destructive shell command on a shell.exec call is
exactly what you’d see in production, minus the HTTP 400.
Flip shadow mode off
Once the feed shows the policy firing on what you expect and nothing
you don’t, toggle Shadow mode off. From the next call on, those
[shadow] would deny events become real
firewall_blocked denials.
If the feed shows a legitimate call tripping the deny rule, fix the rule
(tighten the glob or add an argument clause) while still in shadow, and
watch the feed again. You iterate against real traffic with zero blast
radius.
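If you export the events feed, the review step above can be reduced to a count of would-be denials per tool and rule. The sketch below assumes each event is a dict with reason, tool, and rule keys. Those field names are hypothetical, and only the [shadow] would deny reason prefix comes from this page.

```python
from collections import Counter
from typing import Dict, Iterable

SHADOW_DENY_PREFIX = "[shadow] would deny"


def summarize_would_be_denials(events: Iterable[Dict[str, str]]) -> Counter:
    """Count shadow-mode would-be denials, grouped by (tool, rule)."""
    counts: Counter = Counter()
    for event in events:
        # Only events whose reason carries the shadow deny prefix are counted.
        if event.get("reason", "").startswith(SHADOW_DENY_PREFIX):
            key = (event.get("tool", "?"), event.get("rule", "?"))
            counts[key] += 1
    return counts


# Illustrative sample data, not real gateway output.
sample_events = [
    {"tool": "shell.exec", "rule": "no-destructive-shell",
     "reason": "[shadow] would deny — destructive shell command"},
    {"tool": "shell.exec", "rule": "no-destructive-shell",
     "reason": "[shadow] would deny — destructive shell command"},
    {"tool": "http.get", "rule": "egress-allow-list",
     "reason": "allowed by egress allow-list"},
]

summary = summarize_would_be_denials(sample_events)
```

A pair that appears in the summary but that you expected to pass is the signal to tighten the rule before turning shadow mode off.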
3. What shadow mode does not soften
Shadow mode is a preview of the policy, not a master off-switch. A few more boundaries worth knowing:
Allow and audit verdicts are untouched
Only enforcing verdicts (deny, sanitize, pending_approval) are
downgraded. An allow or audit already lets the call through, so
there’s nothing to soften. Those events still carry the shadow badge
so you can tell the policy was in shadow when they were recorded.
cap_cost resolves before the downgrade
A cap_cost rule resolves to a concrete allow or deny based on the run’s
accumulated spend, and that resolved verdict is what shadow mode then
downgrades: a would-be cap-trip denial shows up as [shadow] would deny
like any other.
It's per-policy, not per-workspace
Shadow mode lives on each policy independently. You can shadow a brand
new policy while a battle-tested one keeps enforcing — there’s no
workspace-wide shadow switch to forget to turn off.
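The cap_cost ordering described above (resolve first, downgrade second) can be sketched as two small steps. This is an illustration under assumptions: the greater-than-or-equal comparison, the numeric spend and cap arguments, and the reason strings are hypothetical, and only the allow/deny resolution and the [shadow] would deny prefix come from this page.

```python
from typing import Tuple


def resolve_cap_cost(spent: float, cap: float) -> str:
    """Step 1: turn a cap_cost rule into a concrete verdict.

    Assumption: the cap trips once accumulated spend reaches the cap.
    """
    return "deny" if spent >= cap else "allow"


def cap_cost_verdict(spent: float, cap: float, shadow_mode: bool) -> Tuple[str, str]:
    """Step 2: apply the shadow downgrade to the already-resolved verdict."""
    resolved = resolve_cap_cost(spent, cap)
    if resolved == "deny":
        reason = "cost cap reached"  # hypothetical reason text
        if shadow_mode:
            return "audit", f"[shadow] would deny — {reason}"
        return "deny", reason
    # An allow is already non-blocking, so shadow mode leaves it alone.
    return "allow", "under cost cap"
```

Because the downgrade only ever sees the resolved allow or deny, a run that is still under its cap produces an ordinary allow even while the policy is in shadow.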
4. Shadow mode vs. the other rollout dials
The firewall gives you three different “don’t break anything yet” controls. They solve different problems:
| Control | Scope | Question it answers |
|---|---|---|
| Shadow mode | One policy | “What would this policy block if I enforced it?” |
| audit default verdict | One policy | “Log everything no rule names, block nothing.” |
| Observe mode | Workspace | “Which tools are running with no policy covering them?” |
audit is for the unmatched tail of one policy; observe mode is about
coverage gaps across the workspace, not a specific policy’s enforcement.
You can stack them. A new default-deny policy under shadow mode is the
gentlest possible rollout: even the default-deny floor only logs
[shadow] would deny instead of blocking, so you see the full set of calls
your allow rules don’t yet cover before the deny is live.
5. Compliance packs land in shadow first
When you install a compliance pack in observe (non-enforcing) mode, the firewall policies it materializes are created with shadow mode on: they evaluate and log against your traffic without blocking anything. Promoting the pack to enforce flips those policies out of shadow. Same mechanism, applied for you: dry-run the controls, read the would-be verdicts, then enforce.
6. Toggling it
In the console, shadow mode is a toggle on the policy editor. The same flag is exposed on the management API as shadow_mode on the policy object.
These routes use your session / access token, and changing the flag requires Developer+:
| Method & path | Role | Note |
|---|---|---|
| PUT /api/workspace/firewall/policies | Developer+ | Set shadow_mode: true / false on the policy in the body. |
| GET /api/workspace/firewall/policies/:id | Member | Read a policy’s current shadow_mode state. |
Every change to a policy bumps its version integer, so turning shadow on
and off is itself tracked.
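As a rough sketch of the PUT route above, the helper below builds (but does not send) a request that sets shadow_mode on a policy body. Several things here are assumptions, not documented behavior: the base URL is a placeholder, the Bearer authorization scheme is a guess at how the session / access token is presented, and the policy dict stands in for whatever policy object you read back from the GET route.

```python
import json
import urllib.request
from typing import Any, Dict


def build_shadow_toggle_request(
    base_url: str,            # placeholder, e.g. your gateway's origin
    access_token: str,        # session / access token, not a relay key
    policy: Dict[str, Any],   # policy object as read from the API
    shadow_mode: bool,
) -> urllib.request.Request:
    """Build a PUT request that sets shadow_mode on the given policy."""
    # Copy the policy so the caller's dict is not mutated.
    body = {**policy, "shadow_mode": shadow_mode}
    return urllib.request.Request(
        url=f"{base_url}/api/workspace/firewall/policies",
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={
            # Assumption: the token is sent as a Bearer credential.
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to urllib.request.urlopen would send it. Keeping construction separate makes it easy to inspect the method, URL, and body before anything touches a live workspace.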
Where to go next
Create & attach a policy
The two-step setup shadow mode rolls out — create the policy, attach a
key.
Events log
Where [shadow] would … shows up: filter, drill into runs and rules.
Verdicts
The enforcing verdicts shadow mode downgrades, and what each does live.
Enforcement modes
How shadow, audit, and observe fit the broader enforcement model.
