A firewall rule fires at a specific point in a tool call’s lifecycle. That point is its stage, one of four enforcement surfaces, each seeing a different slice of the call. Pin a rule to the wrong stage and it sees the wrong data: an egress allowlist on the inbound surface has no destination to check; an argument clause on inbound has no call-time arguments yet. This page is the focused guide to the four agent firewall stages: what each surface observes, when a rule should target it, and how the same intent is expressed at different stages. For the full rule vocabulary, see Firewall rules; for the policy model around it, see Firewall.

1. The four stages at a glance

Every evaluation is stamped with exactly one stage. A rule with no stage ("") applies to all of them; a rule pinned to one stage only fires there.
Stage | What the surface sees
inbound | Tools the agent advertises on the request
response | tool_calls the model emits in its reply
mcp | A tools/call dispatched through the MCP gateway
egress | An outbound host / IP / CIDR a tool reaches
The stage names are stable enum values — you set them verbatim in the rule editor’s stage field, or as the stage property when authoring through the API.
Stage governs what data is in scope, not how strict the verdict is. A deny is a deny on any stage; what changes is whether the rule has the arguments, the tool name, or the destination it needs to match on.
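The pinning behaviour described above can be sketched in a few lines. This is only an illustration of the stated semantics, not OrcaRouter's implementation; the function name rule_applies and the STAGES tuple are hypothetical.

```python
# The four stage names, as listed in the table above.
STAGES = ("inbound", "response", "mcp", "egress")


def rule_applies(rule_stage: str, evaluation_stage: str) -> bool:
    """Return True if a rule with the given stage fires on this evaluation.

    Hypothetical helper: it mirrors the documented behaviour that an
    empty stage ("") applies everywhere and a pinned stage fires only there.
    """
    if evaluation_stage not in STAGES:
        raise ValueError(f"unknown stage: {evaluation_stage!r}")
    return rule_stage == "" or rule_stage == evaluation_stage
```

A rule pinned to inbound therefore never fires on a response evaluation, while an unpinned rule is a candidate on all four surfaces.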

2. inbound — the tools an agent advertises

The earliest surface. Before the model ever runs, your agent sends a list of tool definitions it’s willing to let the model call. The inbound stage sees that advertised toolset and can block a dangerous tool before the model can even choose it. There are no call-time arguments at this stage — the model hasn’t decided how to call anything yet — so inbound rules match on the tool name (and optionally its owning skill), not on args_match_json.
A sanitize verdict on inbound has nothing to redact (no arguments exist yet), so it escalates to a block. Author inbound rules as explicit allow / deny, and save sanitize for the execution stages.
A deny here returns HTTP 400 with code firewall_blocked. The error names the tool and the reason, and is marked skip-retry.
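Because inbound rules match on the tool name only, you can pre-check an advertised toolset locally before sending it. The sketch below assumes tool_name_glob behaves like a shell-style glob and uses Python's fnmatch to approximate it; the real matcher may differ, and split_advertised_tools is a hypothetical helper, not part of any OrcaRouter client.

```python
from fnmatch import fnmatchcase

# An inbound deny rule using fields shown elsewhere on this page.
# The "shell.*" glob is an illustrative value.
inbound_rule = {"stage": "inbound", "tool_name_glob": "shell.*", "verdict": "deny"}


def split_advertised_tools(tool_names, deny_globs):
    """Partition advertised tool names into (kept, blocked).

    Assumption: glob matching is case-sensitive, shell-style.
    """
    kept, blocked = [], []
    for name in tool_names:
        if any(fnmatchcase(name, glob) for glob in deny_globs):
            blocked.append(name)
        else:
            kept.append(name)
    return kept, blocked


kept, blocked = split_advertised_tools(
    ["shell.exec", "search.docs"], [inbound_rule["tool_name_glob"]]
)
```

Here blocked holds the tools that would trip the inbound rule, so an agent can drop them from the request instead of receiving a firewall_blocked error.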

3. response — the tool calls the model emits

Once the model replies, it may emit one or more tool_calls — concrete invocations with real arguments. The response stage sees those, so this is where argument-level rules belong: not “block shell.exec” but “block shell.exec only when the command is rm -rf.”
{
  "stage": "response",
  "tool_name_glob": "shell.exec",
  "verdict": "deny",
  "args_match_json": "{\"clauses\":[{\"path\":\"$.command\",\"op\":\"regex\",\"value\":\"rm -rf|mkfs|dd if=\"}]}"
}
Because the model’s chosen arguments are present, sanitize works here — it redacts matched substrings from the call’s arguments and forwards the cleaned call. (Sanitize redacts tool-call arguments only; it never touches the content a tool returns.)
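Note that args_match_json is a JSON document encoded as a string inside the rule, which makes hand-escaping error-prone. A minimal sketch of generating it with json.dumps follows, together with a local check of a regex clause against sample arguments. The helper names are hypothetical, and the local matcher assumes an unanchored regex search on a top-level "$.key" path, which may not match the firewall's exact semantics.

```python
import json
import re


def build_args_match(clauses):
    """Serialise clauses into the string form used by args_match_json."""
    return json.dumps({"clauses": clauses})


def clause_matches(clause, args):
    """Approximate a regex clause locally (top-level "$.key" paths only)."""
    path = clause["path"]
    if clause["op"] != "regex" or not path.startswith("$.") or "." in path[2:]:
        raise NotImplementedError("sketch handles only regex on $.key paths")
    value = args.get(path[2:])
    # Assumption: the regex is searched anywhere in the string value.
    return isinstance(value, str) and re.search(clause["value"], value) is not None


clause = {"path": "$.command", "op": "regex", "value": "rm -rf|mkfs|dd if="}
rule = {
    "stage": "response",
    "tool_name_glob": "shell.exec",
    "verdict": "deny",
    "args_match_json": build_args_match([clause]),
}
```

Building the rule this way keeps the inner quotes correctly escaped when the whole rule is serialised again for the API.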

4. mcp — calls dispatched through the gateway

When an agent reaches a tool through OrcaRouter’s MCP gateway, every tools/call is evaluated on the mcp stage before it’s dispatched to the registered server. This is the surface that governs Model Context Protocol traffic — the same glob / argument / verdict vocabulary as response, applied to MCP dispatch. A block here surfaces as a tool error (firewall deny: <reason>) rather than a transport failure, so the model sees the rejection and can react — pick another tool, ask the user, or stop.
The mcp stage pairs with per-server governance: each registered MCP server has its own health probe and encrypted credentials, and skills loaded through it carry a risk band and an enforcement mode. See Firewall MCP and Firewall skills.
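Since an mcp block arrives as a tool error rather than a transport failure, an agent loop can recognise it and branch. The sketch below assumes the error text begins with the literal prefix "firewall deny: " as quoted above; where that text appears in the tool result is not specified here, and firewall_deny_reason is a hypothetical helper.

```python
# Assumption: the tool error text starts with this literal prefix.
DENY_PREFIX = "firewall deny: "


def firewall_deny_reason(tool_error_text):
    """Return the deny reason if the text is a firewall rejection, else None."""
    if tool_error_text.startswith(DENY_PREFIX):
        return tool_error_text[len(DENY_PREFIX):]
    return None
```

A caller might use a non-None result to pick another tool or ask the user, and treat every other error as an ordinary tool failure.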

5. egress — the outbound destination a tool reaches

The last surface. When a tool reports an outbound network destination, the egress stage matches on it — the SSRF and data-exfiltration surface. Egress rules don’t match on a tool name pattern alone; they match on a host / IP / CIDR list:
{
  "stage": "egress",
  "verdict": "deny",
  "egress_json": "{\"deny\":[\"169.254.169.254\",\"10.0.0.0/8\"],\"allow\":[\"api.openai.com\"]}"
}
Entries match as a CIDR, an IP literal, or a case-insensitive hostname. You author host and CIDR deny rules yourself — the cloud-metadata endpoint (169.254.169.254) and RFC-1918 ranges are the canonical things to deny. See Firewall rules §6 for the allow/deny polarity.
No preset ships CIDR rules. The tight autonomy level’s SSRF posture denies fetch-shaped tool names (e.g. http_fetch, web_search, fetch_url); a destination-based egress deny is something you author for the hosts and ranges your agents must never reach.

6. Choosing the right stage

Most security goals have one stage that fits best. Match the intent to the surface that actually carries the data you need:
If the model should never even see a tool, deny it on inbound. The block lands before the model call, so it costs no model tokens.
Argument clauses need the model’s chosen arguments, which only exist on response and mcp. Deny on a dangerous argument, or sanitize to strip a secret or PII value the agent put in an argument.
Calls routed through the MCP gateway are evaluated on mcp before dispatch — the choke point for every registered server’s tools.
Destination-based rules — block the cloud-metadata IP, deny a CIDR, allowlist your approved hosts — only make sense on egress.
A rule with no stage runs on all four. Use it for a blanket default_verdict-style rule, or a tool you deny everywhere it appears.
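The guidance above lends itself to a small pre-flight check on rules before they are saved. The following is a sketch under stated assumptions: lint_rule is a hypothetical helper, the verdict string "sanitize" is assumed to be spelled as the verdict is named on this page, and the checks cover only the mismatches this page describes.

```python
def lint_rule(rule):
    """Return warnings about fields that the rule's stage cannot use."""
    stage = rule.get("stage", "")
    warnings = []
    # Inbound has no call-time arguments, so argument clauses cannot match.
    if stage == "inbound" and rule.get("args_match_json"):
        warnings.append("args_match_json has no arguments to match on inbound")
    # Sanitize has nothing to redact on inbound and escalates to a block.
    if stage == "inbound" and rule.get("verdict") == "sanitize":
        warnings.append("sanitize on inbound escalates to a block")
    # Destination lists only have data to check on the egress surface.
    if rule.get("egress_json") and stage not in ("", "egress"):
        warnings.append("egress_json only has a destination to check on egress")
    return warnings
```

An empty list means the rule's fields are consistent with its stage; it says nothing about whether the rule expresses the policy you intended.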

7. Stages and shadow mode

A policy’s shadow_mode flag is independent of stage. Turn it on and every enforcing verdict — on any stage — is downgraded to audit and the reason is prefixed [shadow] would …, so you can confirm a rule fires on the right surface before it changes live traffic. See Shadow mode and Enforcement modes.

8. Where stages fit the bigger picture

The four stages are the where of enforcement; the rest of the model is the what and who.

Verdicts

What each stage can do once it matches — allow, audit, deny, sanitize, hold for approval, cap cost.

Tool allow-listing

Use inbound to constrain the toolset an agent advertises.

Validate arguments

Author response / mcp argument clauses that gate a tool by how it’s called.

Egress control

Block outbound destinations on the egress surface — the exfiltration boundary.
For how these surfaces sit on the inspection path, see How OrcaRouter inspects and the enforcement path latency notes. For the threats each stage addresses, see Dangerous tool calls, Data exfiltration, and MCP tool poisoning.