400. Three security codes cover the cases you’ll see: a screened
prompt or response, a denied tool call, and a tool call held for human approval.
This page is the reference for those codes — the use-case for each, the exact
HTTP status, what it costs you, and the one rule that matters most: retry
logic must special-case them. All three are marked skip-retry; blindly
re-running the same call just trips the same control again.
These are enforcement codes — the gateway deciding not to forward your
call. They are distinct from upstream provider errors (a model 429, a context
overflow) and from auth failures. For why a specific request was stopped, see
Why was this blocked?.
1. The LLM security error codes at a glance
Every security block returns HTTP 400 with an OpenAI-shaped error body (error.code is the typed string below). On native Claude (/v1/messages)
routes the same code travels in the Claude error shape, so SDK routing is
deterministic across protocols.
| Code | Stops | Quota cost |
|---|---|---|
| guardrail_blocked | A prompt or response that hit a block rule | None |
| firewall_blocked | A denied tool call / advertisement | No model tokens |
| firewall_approval_pending | A tool call held for a human reviewer | No model tokens |
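Because the code travels in two body shapes, a client can normalize both into one value before routing. The sketch below is a minimal illustration, not OrcaRouter's SDK: `security_code` is a hypothetical helper, and reading the code from `error.type` on Claude-shaped bodies is an assumption (the text above only confirms `error.code` for the OpenAI shape).

```python
from typing import Any, Optional

# The three enforcement codes listed in the table above.
SECURITY_CODES = frozenset({
    "guardrail_blocked",
    "firewall_blocked",
    "firewall_approval_pending",
})


def security_code(status: int, body: Any) -> Optional[str]:
    """Return the security code carried by an error response, or None.

    `body` is the already-parsed JSON error body.
    """
    if status != 400 or not isinstance(body, dict):
        return None
    error = body.get("error")
    if not isinstance(error, dict):
        return None
    # OpenAI-shaped bodies: the typed string is in error.code.
    # ASSUMPTION: Claude-shaped bodies carry the same string in error.type.
    for field in ("code", "type"):
        value = error.get(field)
        if isinstance(value, str) and value in SECURITY_CODES:
            return value
    return None
```

Any other 400 (or any other status) returns `None`, so ordinary validation errors continue through whatever handling the client already has.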
2. guardrail_blocked — a screened prompt or response
Returned when a guardrail rule with the block action
fires — a denylisted keyword, a regex hit, a PII or secret entity you chose to
block rather than mask, an llm_judge verdict, or a failed grounding check.
HTTP 400. The message names the guardrail and the rule that fired.
Quota impact: none
A blocked request costs no quota. An input-stage block fires before
metering, so nothing is ever billed. An output-stage block runs after the
model responds, so the gateway refunds the pre-consumed quota before
returning the error. Either way you pay nothing for a blocked call.
Why it's skip-retry
The verdict is a property of the content, not the channel. Re-running the
same prompt — even against a different model — produces the same block.
Fix the input (or the policy) instead of retrying.
What a mask looks like instead
mask rules do not return this code. A masked match (e.g.
jane@acme.com → [EMAIL]) is redacted in place and the call proceeds
normally — you get a 200, just with the sensitive span removed. Only the
block action surfaces guardrail_blocked. (flag changes nothing about
the traffic at all.)
3. firewall_blocked — a denied tool call
Returned when the firewall resolves a deny verdict for a
tool call — a destructive shell command, an SSRF-shaped fetch, an egress
destination on a deny list, or a skill in block
mode.
How the deny surfaces depends on the enforcement surface:
inbound / response / egress
HTTP 400 with error.code = firewall_blocked. The body carries structured
error.metadata (reason_code, risk factors, risk_score) so you can explain the
block, not just see it.
mcp surface
Returned as a tool error (firewall deny: <reason>), not a transport failure,
so the model sees the rejection and can pick another tool, ask the user, or
stop, instead of crashing the run.
4. firewall_approval_pending — held for a human
Returned the instant a tool call hits a pending_approval verdict. A
human-in-the-loop gate can’t be a blocking inline wait, so the gateway returns a
held response immediately rather than long-polling.
HTTP 400. The error carries the approval id so your agent knows which
hold to resolve.
This is the one code you respond to by resolving and re-submitting — not by
treating it as a terminal failure:
Read the approval id off the held error
The id is recoverable from the error body. Don’t retry the call yet — a
naive retry just re-holds.
Wait for a decision
A reviewer resolves it from the console (Developer+), or your approval
system gets an HMAC-signed webhook callback. Your agent polls
GET /api/v1/firewall/approvals/:id for the state.
The approval routes (/api/v1/firewall/approvals/*) run on a
firewall-gateway-scoped key, not your console session. See
Human approval (HITL) for the full loop and Webhook payloads for the
callback signature.
5. Why all three skip retry
Standard SDK retry logic assumes a 400 might succeed on a second try. These
codes break that assumption — the block is deterministic, so a blind retry
wastes a round trip and (for held calls) silently re-queues an approval.
What 'skip-retry' means in practice
OrcaRouter’s own internal retry/fallback machinery never re-attempts a call
that returns one of these codes against another channel. Mirror that in your
client: on a security code, stop and act on the verdict, don’t loop.
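One way to mirror that rule in a client is a retry predicate that checks the security code before anything else. This is a sketch under stated assumptions: `should_retry` is a hypothetical helper, and the set of retryable statuses and the attempt limit are illustrative client-side choices, not values taken from OrcaRouter.

```python
from typing import Optional

# Codes the gateway marks skip-retry (from the reference above).
SKIP_RETRY_CODES = frozenset({
    "guardrail_blocked",
    "firewall_blocked",
    "firewall_approval_pending",
})

# ASSUMPTION: a typical client policy for transient failures.
RETRYABLE_STATUSES = frozenset({429, 500, 502, 503, 504})


def should_retry(status: int, error_code: Optional[str],
                 attempt: int, max_attempts: int = 3) -> bool:
    """Decide whether to re-send the same request unchanged.

    `attempt` is the number of attempts already made (1 after the first call).
    """
    # A security verdict is deterministic: the same call trips the same control.
    if error_code in SKIP_RETRY_CODES:
        return False
    if attempt >= max_attempts:
        return False
    return status in RETRYABLE_STATUSES
```

Checking the code first matters because a client that retries broadly on error statuses would otherwise loop on a block that cannot change.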
The right reaction per code
guardrail_blocked → fix the input or relax the policy; surface the refusal to the user. Don’t retry.
firewall_blocked → the action is disallowed; have the agent choose a different tool or ask for help. Don’t retry.
firewall_approval_pending → resolve the hold, then re-submit once with the approval header (§4). A retry without the header re-holds.
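The per-code reactions can be sketched as a lookup plus a small polling loop for the held case. Everything named here is hypothetical: `REACTIONS`, `wait_for_decision`, and the `fetch_state` callable (which would wrap `GET /api/v1/firewall/approvals/:id`) are illustrative, and the state strings `"pending"` and `"approved"` are assumptions about the response, not documented values. Re-submitting with the approval header is left to the caller.

```python
import time
from typing import Callable

# Reaction per security code, following the list above.
REACTIONS = {
    "guardrail_blocked": "fix_input_or_policy",
    "firewall_blocked": "choose_other_tool_or_ask",
    "firewall_approval_pending": "await_approval_then_resubmit",
}


def wait_for_decision(
    approval_id: str,
    fetch_state: Callable[[str], str],
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll until the approval leaves the pending state, then return that state.

    ASSUMPTION: fetch_state returns "pending" while the hold is unresolved.
    Raises TimeoutError if no decision arrives within timeout_s seconds.
    """
    deadline = clock() + timeout_s
    while True:
        state = fetch_state(approval_id)
        if state != "pending":
            return state
        if clock() >= deadline:
            raise TimeoutError(f"approval {approval_id} still pending")
        sleep(interval_s)
```

Injecting `fetch_state`, `sleep`, and `clock` keeps the loop free of network and timing dependencies, so it can be exercised with fakes before being wired to a real HTTP client.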
6. Quota & billing summary
A security block never bills you for the blocked unit of work.
| Code | When it fires | Billing outcome |
|---|---|---|
| guardrail_blocked (input) | Before the model call | Never metered |
| guardrail_blocked (output) | After the model responds | Pre-consumed quota refunded |
| firewall_blocked (inbound) | Before the model call | No model tokens |
| firewall_approval_pending | Before dispatch | No model tokens |
7. Related references
Why was this blocked?
Trace a single block to the exact rule, surface, and reason that produced it.
Verdict glossary
Every firewall verdict — allow, audit, deny, sanitize, pending_approval,
cap_cost — and what each emits.
Webhook & error payloads
The full error envelope, error.metadata fields, and the approval-callback
signature.
Enforcement modes
Shadow, observe, and enforce — when a verdict actually changes traffic.
