Guardrails vs. Firewall: a one-line distinction, a side-by-side comparison, and a threat-to-layer mapping to help you decide which OrcaRouter security plane catches each risk.
The short answer: Guardrails govern text; the Firewall governs actions.
They are complementary — a single request flows through both — and the fastest
way to configure them together is an autonomy level.
The rest of this page is for the cases where you need to know which layer owns
a specific threat.
Role required. Any workspace member can read policies and the guardrail
Matches feed; the firewall Events feed requires the Developer role.
Creating or editing guardrails or firewall policies also requires
Developer or above.
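The role rules above can be read as a small permission table. The sketch below is illustrative only: the `Role` ordering, the `ADMIN` role, the action names, and the `can` helper are all assumptions made for the example, since this page names only "workspace member" and "Developer or above".

```python
from enum import IntEnum


class Role(IntEnum):
    # Assumed ordering; only "member" and "Developer or above" appear on this page.
    MEMBER = 1
    DEVELOPER = 2
    ADMIN = 3  # hypothetical stand-in for "or above"


# Hypothetical action names mapped to the minimum role described above.
REQUIRED_ROLE = {
    "read_policies": Role.MEMBER,
    "read_guardrail_matches": Role.MEMBER,
    "read_firewall_events": Role.DEVELOPER,
    "edit_policies": Role.DEVELOPER,
}


def can(role: Role, action: str) -> bool:
    """Return True if the role meets the minimum required for the action."""
    return role >= REQUIRED_ROLE[action]
```

Under these assumptions, `can(Role.MEMBER, "read_firewall_events")` is false while `can(Role.MEMBER, "read_guardrail_matches")` is true, which mirrors the split between the two feeds.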
Guardrails fire before the upstream call (on the prompt) and after it (on the
response). The Firewall fires on every tool call the model emits or that the
agent issues — regardless of the model or provider that served the turn.
Guardrails and the Firewall are designed to compose, not compete. A single
request passes through both planes:
1. Input guardrail: prompt text is screened and optionally masked.
2. Model call: the (possibly sanitized) prompt reaches the upstream model.
3. Firewall: every tool call the model emits is evaluated.
4. Output guardrail: the model’s response text is screened.
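The four stages above can be sketched as a small pipeline. This is a minimal illustration, not OrcaRouter's implementation: the email-masking rule, the `BLOCKED_TOOLS` deny-list, the `fake_model` stand-in, and the `"allow"`/`"block"` verdict strings are all hypothetical names chosen for the example.

```python
import re
from dataclasses import dataclass, field

# Hypothetical rules, for illustration only; not OrcaRouter's API.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
BLOCKED_TOOLS = {"shell.exec"}  # assumed firewall deny-list


@dataclass
class ModelTurn:
    text: str
    tool_calls: list = field(default_factory=list)  # (tool_name, args) pairs


def input_guardrail(prompt: str) -> str:
    """Screen the prompt text and mask anything that looks like an email."""
    return EMAIL.sub("[EMAIL]", prompt)


def fake_model(prompt: str) -> ModelTurn:
    """Stand-in for the upstream model: echoes the prompt and emits two tool calls."""
    return ModelTurn(
        text=f"Echo: {prompt}",
        tool_calls=[
            ("search.web", {"q": "forecast"}),
            ("shell.exec", {"cmd": "ls"}),
        ],
    )


def firewall(tool_calls: list) -> list:
    """Evaluate every tool call and return (tool_name, verdict) pairs."""
    return [
        (name, "block" if name in BLOCKED_TOOLS else "allow")
        for name, _args in tool_calls
    ]


def output_guardrail(text: str) -> str:
    """Screen the response text with the same masking rule."""
    return EMAIL.sub("[EMAIL]", text)


def handle_request(prompt: str):
    sanitized = input_guardrail(prompt)        # 1. input guardrail
    turn = fake_model(sanitized)               # 2. model call
    verdicts = firewall(turn.tool_calls)       # 3. firewall on tool calls
    response = output_guardrail(turn.text)     # 4. output guardrail
    return response, verdicts
```

The point of the sketch is the separation of concerns: the two guardrail functions only ever see strings, while `firewall` only ever sees tool calls, so either plane can be tightened without touching the other.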
The fastest way to configure both at once is an autonomy level: a single
setting that atomically writes a Firewall policy and a Guardrails policy for
the whole workspace, with one-click undo.
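One way to picture "atomically writes both policies, with undo" is a snapshot-and-swap. The sketch below is an assumption-laden illustration: the level names (`supervised`, `autonomous`), the policy fields, and the `Workspace` class are hypothetical and do not reflect OrcaRouter's actual presets or API.

```python
import copy

# Hypothetical presets; the level names and policy fields are assumptions.
PRESETS = {
    "supervised": {
        "firewall": {"default_verdict": "require_approval"},
        "guardrails": {"pii_masking": True},
    },
    "autonomous": {
        "firewall": {"default_verdict": "allow"},
        "guardrails": {"pii_masking": True},
    },
}


class Workspace:
    def __init__(self):
        self.policies = {"firewall": {}, "guardrails": {}}
        self._previous = None  # snapshot used by undo()

    def set_autonomy_level(self, level: str) -> None:
        """Replace both policies together, keeping a snapshot for undo."""
        preset = PRESETS[level]  # an unknown level raises before any change
        self._previous = copy.deepcopy(self.policies)
        # Both policies are swapped in one assignment so they never diverge.
        self.policies = copy.deepcopy(preset)

    def undo(self) -> bool:
        """Restore the snapshot from the last change; False if there is none."""
        if self._previous is None:
            return False
        self.policies, self._previous = self._previous, None
        return True
```

Because the preset is looked up before anything is written, a bad level name leaves the existing policies untouched, and `undo()` restores the Firewall and Guardrails policies as a pair rather than one at a time.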
Guardrails own the text; the Firewall owns the actions — run both, let the
autonomy level wire them together, and tighten each plane independently once
you can see your agents’ real traffic.
Guardrails
Rule types, PII detection, LLM judge, eval harness, and API reference.
Agent Firewall
Verdicts, surfaces, autonomy levels, HITL approval, and API reference.