When you read a firewall event or a guardrail match, the row tells you what the gateway decided: deny, sanitize, [EMAIL]. This page is the lookup table for those words: what each one means, what it does to the call, and where to go for the full mechanics. Keep it open while you author rules or triage the events feed.
Two control planes produce two vocabularies. The Firewall governs tool actions and emits a verdict. Guardrails screen prompt and response text and emit an action plus, on a mask, a typed masking tag. The two planes never share a word: a guardrail never says deny, and a firewall never says mask.
This is a reference index, not a how-to. For the use-case behind each control see Guardrails vs Firewall; for the HTTP bodies see Security error codes.

1. The firewall verdict glossary

A firewall rule (or the policy’s default_verdict) resolves every tool call to exactly one of these six verdicts. The engine walks the rules in priority order, first match wins, and falls back to the default if nothing matches.
allow: The call proceeds to the tool. It is still logged as a firewall event, so it shows up in Runs and the events feed. This is what you want for tools an agent is explicitly trusted to use.
audit: Identical traffic to allow, but flagged as something you wanted to watch. It is the recommended default_verdict: observe everything, block nothing, until your rules are tuned. The balanced autonomy level ships the PII Shield guardrail as flag-only (audit), so PII is recorded without holding the call.
deny: The call never reaches the tool. On the inbound surface this returns HTTP 400 firewall_blocked; through the MCP gateway it comes back as a tool error (firewall deny: <reason>) so the model can react instead of crashing. Marked skip-retry. Costs no model tokens.
sanitize: Replaces matched substrings (secrets, PII) in the tool-call arguments with a [redacted:<preset>] token, then forwards the call with the cleaned arguments. It redacts arguments only, never the content a tool returns. On the inbound surface, where there are no call-time arguments yet, sanitize escalates to a deny.
pending_approval: The call is enqueued for review and the agent gets a held response carrying an approval id (HTTP 400 firewall_approval_pending). A reviewer resolves it in the console or via an HMAC webhook callback; the agent polls the id and re-submits once with a single-use approval header. See Human approval.
cap_cost: Authored as a rule with a per-rule cents ceiling. It resolves to allow while the agent run is under budget and to deny once accumulated spend crosses the cap, so an event shows allow or deny, not the literal word cap_cost. It acts as a circuit-breaker for runaway loops.
In shadow mode, deny / sanitize / pending_approval are all downgraded to audit and the reason is prefixed [shadow] would …. The event records the verdict that would have fired, but traffic is unchanged, which is the whole point of a safe rollout.
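The resolution order described above (priority order, first match wins, default fallback, then the shadow-mode downgrade) can be sketched roughly as follows. This is an illustration, not the gateway's implementation: the `Rule` class, the `resolve` function, the `cap_cents` and `spent_cents` names, the use of `fnmatch` for glob matching, the lower-number-first priority ordering, and the exact text after `[shadow] would` are all assumptions made for the example.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Optional

# Verdicts that shadow mode downgrades to audit (per the glossary above).
ENFORCING = {"deny", "sanitize", "pending_approval"}


@dataclass
class Rule:
    priority: int                    # assumption: lower number is evaluated first
    tool_name_glob: str              # matched with fnmatch as a stand-in
    verdict: str                     # one of the six verdicts
    reason: str = ""
    cap_cents: Optional[int] = None  # only meaningful for cap_cost rules


def resolve(tool_name, rules, default_verdict="audit", shadow=False, spent_cents=0):
    """Return (verdict, reason) for a single tool call."""
    verdict, reason = default_verdict, "no rule matched"
    for rule in sorted(rules, key=lambda r: r.priority):
        if fnmatchcase(tool_name, rule.tool_name_glob):
            verdict, reason = rule.verdict, rule.reason
            if verdict == "cap_cost":
                # assumption: spend exactly at the cap still counts as under budget
                under_budget = spent_cents <= (rule.cap_cents or 0)
                verdict = "allow" if under_budget else "deny"
            break  # first match wins
    if shadow and verdict in ENFORCING:
        # Record what would have fired, but let the traffic through.
        return "audit", f"[shadow] would {verdict}: {reason}"
    return verdict, reason


rules = [
    Rule(10, "db.*", "deny", "no direct db access"),
    Rule(20, "llm.*", "cap_cost", "budget", cap_cents=500),
    Rule(30, "*", "allow"),
]
print(resolve("db.query", rules))
print(resolve("db.query", rules, shadow=True))
```

Note how a cap_cost rule never surfaces its own name: the sketch returns allow or deny depending on accumulated spend, which matches what the events feed is described as showing.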

Default verdict

default_verdict accepts only the three non-interactive verdicts:
| Value | Meaning when no rule matches |
| --- | --- |
| allow | Permit uncovered tool calls silently. |
| audit | Permit but record (the default). |
| deny | Block anything no rule explicitly allows (default-deny posture). |
The tight autonomy level sets default_verdict: deny; balanced and the shipped default use audit.

2. Guardrail actions

A guardrail rule fires one of five actions. They are the text-plane equivalent of verdicts — and a guardrail rule never produces a firewall verdict.
| Action | What it does | Quota |
| --- | --- | --- |
| block | Reject the whole request with HTTP 400 guardrail_blocked. | None: input blocks fire before metering; output blocks refund. |
| mask | Redact each match to a typed tag (see §3) and forward the cleaned text. | Normal: the call proceeds. |
| flag | Log only. Records a match; changes nothing about the traffic. | Normal. |
| annotate | Non-blocking. Attaches a human-readable note to the request (injected upstream as a security notice) without masking or blocking the text. | Normal. |
| spotlight | Non-blocking. Wraps the matched (untrusted) text in delimiters and tells the model to treat the delimited region as data, never instructions (the prompt-injection “spotlighting” defense). | Normal. |
A blocked guardrail request is marked skip-retry — re-running the same prompt against another channel would just block again.
Use flag to measure a new rule against real traffic before you switch it to block or mask. The Matches feed shows what would have been caught with zero traffic impact — the guardrail counterpart to firewall shadow mode.
A single pii rule can apply different actions to different entities with entity_actions: for example, mask emails and phones but block on credit_card and ssn, all from one rule. Each key must be an entity enabled on the rule; each value must be one of block, mask, flag, or annotate.
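The entity_actions constraints above can be checked with a few lines of code. The sketch below is illustrative only: the function names and the shape of the inputs are hypothetical, and falling back to the rule-level action for entities without an override is an assumption rather than documented behavior.

```python
# Values entity_actions may take, per the rule stated above.
ALLOWED_ENTITY_ACTIONS = {"block", "mask", "flag", "annotate"}


def validate_entity_actions(enabled_entities, entity_actions):
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    for entity, action in entity_actions.items():
        if entity not in enabled_entities:
            errors.append(f"{entity}: not enabled on this rule")
        if action not in ALLOWED_ENTITY_ACTIONS:
            errors.append(f"{entity}: unsupported action {action!r}")
    return errors


def action_for(entity, rule_action, entity_actions):
    # assumption: entities without an override use the rule-level action
    return entity_actions.get(entity, rule_action)


enabled = {"email", "phone", "credit_card", "ssn"}
overrides = {"credit_card": "block", "ssn": "block"}
print(validate_entity_actions(enabled, overrides))
print(action_for("email", "mask", overrides), action_for("ssn", "mask", overrides))
```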

3. The masking tag glossary

On a mask action, every matched entity is replaced inline with a typed tag — [<UPPERCASE_ENTITY_NAME>] — so the model (input stage) or the caller (output stage) sees the shape of the data without the value. Masking runs on both stages, including streamed responses: a token-aware stream scanner masks matches that straddle chunk boundaries before they reach the client.
| Entity | Tag |
| --- | --- |
| email | [EMAIL] |
| phone | [PHONE] |
| credit_card | [CREDIT_CARD] |
| ssn | [SSN] |
| ip | [IP] |
| iban | [IBAN] |
| mac_address | [MAC_ADDRESS] |
| jwt | [JWT] |
| aws_access_key | [AWS_ACCESS_KEY] |
| api_key_openai | [API_KEY_OPENAI] |
| bitcoin_address | [BITCOIN_ADDRESS] |
Three regional identifiers ship on top of the base set:
| Entity | Tag | Region |
| --- | --- | --- |
| jp_mynumber | [JP_MYNUMBER] | Japan |
| kr_rrn | [KR_RRN] | South Korea |
| cn_resident_id | [CN_RESIDENT_ID] | China |
Custom entities follow the same convention. A custom entity named employee_id masks to [EMPLOYEE_ID] unless you set an explicit mask_with replacement. Up to 25 custom entities per rule, each an RE2 regex with an optional luhn checksum. See PII detection.
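As a rough illustration of the tag convention, the sketch below derives the tag from the entity name, honors an explicit mask_with, and skips matches that fail an optional Luhn check. It is not the gateway's detector: the `mask` and `luhn_ok` functions, the dictionary layout, and every pattern are hypothetical, and Python's `re` module stands in for RE2.

```python
import re


def luhn_ok(candidate: str) -> bool:
    """Standard Luhn checksum over the digits in candidate."""
    digits = [int(c) for c in candidate if c.isdigit()]
    if not digits:
        return False
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def mask(text: str, entities: dict) -> str:
    """Replace each match with [ENTITY_NAME], or with mask_with if set."""
    for name, spec in entities.items():
        tag = spec.get("mask_with") or f"[{name.upper()}]"

        def replace(match, tag=tag, spec=spec):
            if spec.get("luhn") and not luhn_ok(match.group(0)):
                return match.group(0)  # checksum failed: leave untouched
            return tag

        text = re.sub(spec["pattern"], replace, text)
    return text


# Illustrative patterns only; real detectors are more careful.
entities = {
    "email": {"pattern": r"[\w.+-]+@[\w-]+\.[\w-]+"},
    "credit_card": {"pattern": r"\b\d{16}\b", "luhn": True},
    "employee_id": {"pattern": r"EMP-\d{6}"},
}
print(mask("mail ada@example.com about EMP-123456", entities))
print(mask("card 4111111111111111", entities))
```

The checksum step shows why a Luhn option is useful for card-like entities: a 16-digit run that fails the check is left alone instead of being masked as a false positive.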

4. One worked example

A single db.query tool call, read top to bottom, touches both vocabularies:
firewall verdict : sanitize        # secret stripped from the SQL argument
guardrail action : mask            # an email in the prompt redacted
masking tag      : [EMAIL]         # what the model actually receives
The firewall sanitize cleaned the tool arguments; the guardrail mask cleaned the prompt text; the [EMAIL] tag is what the model sees in place of the address. Same request, three different layers, three words from this glossary.
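To make the two cleaning layers concrete, here is a minimal sketch of what each might do to this request. Everything in it is assumed for illustration: the function names, the `aws_access_key` preset name, both regular expressions, and the sample prompt and SQL. Only the `[redacted:<preset>]` and `[EMAIL]` output shapes come from this page.

```python
import re

# Hypothetical preset and patterns, for illustration only.
SECRET_PRESETS = {"aws_access_key": r"AKIA[0-9A-Z]{16}"}
EMAIL_PATTERN = r"[\w.+-]+@[\w-]+\.[\w-]+"


def firewall_sanitize(args: dict) -> dict:
    """Firewall plane: redact secrets in tool-call arguments only."""
    cleaned = {}
    for key, value in args.items():
        if isinstance(value, str):
            for preset, pattern in SECRET_PRESETS.items():
                value = re.sub(pattern, f"[redacted:{preset}]", value)
        cleaned[key] = value
    return cleaned


def guardrail_mask(prompt: str) -> str:
    """Guardrail plane: replace emails in prompt text with a typed tag."""
    return re.sub(EMAIL_PATTERN, "[EMAIL]", prompt)


prompt = "Look up the account for ada@example.com"
args = {"sql": "SELECT * FROM keys WHERE id = 'AKIAIOSFODNN7EXAMPLE'"}

print(guardrail_mask(prompt))          # what the model receives
print(firewall_sanitize(args)["sql"])  # what the db.query tool receives
```

The split mirrors the two vocabularies: the prompt is cleaned with a masking tag, the tool arguments with a redaction token, and neither function touches the other's input.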

5. Posture words you’ll see alongside verdicts

These aren’t verdicts or actions, but they decide whether a verdict is enforced at all — so they show up in the same events and settings views.
| Word | Plane | Meaning |
| --- | --- | --- |
| Shadow mode | Firewall | Per-policy flag. Downgrades every enforcing verdict to audit, prefixes the reason [shadow] would …. |
| Observe mode | Firewall | Workspace setting. When no policy resolves, allows the call but logs it as a coverage gap (Discovered tools). |
| Enforce | Firewall | Shadow off + a policy attached: verdicts take effect. |
| Fail-open | Guardrails | Default for advanced rules (llm_judge, grounding, external): a timeout is observed, the request continues. Flip to fail-closed per rule. |
| Log raw content | Guardrails | Off by default. When off, a match records that a rule fired but not the matched substring. |
For the deny-vs-audit-vs-shadow distinction in depth, see Enforcement modes.
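The fail-open versus fail-closed distinction comes down to what happens when an advanced check times out. A minimal sketch, assuming a hypothetical `apply_advanced_rule` helper, a check that signals a timeout by raising `TimeoutError`, and a rule whose action on a match is block:

```python
def apply_advanced_rule(check, text, fail_closed=False):
    """Run one advanced check; return (proceed, note)."""
    try:
        matched = check(text)
    except TimeoutError:
        if fail_closed:
            return False, "timeout: fail-closed, request stopped"
        # Fail-open: the timeout is observed, the request continues.
        return True, "timeout: fail-open, request continues"
    # assumption: this rule's action on a match is block
    return (not matched), ("matched" if matched else "clean")


def slow_judge(text):
    # Stand-in for an llm_judge call that exceeds its deadline.
    raise TimeoutError("judge exceeded its deadline")


print(apply_advanced_rule(slow_judge, "hello"))
print(apply_advanced_rule(slow_judge, "hello", fail_closed=True))
```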

6. Where each word is defined

| Surface | Vocabulary | Home page |
| --- | --- | --- |
| Firewall policy | allow, audit, deny, sanitize, pending_approval, cap_cost | Firewall |
| Firewall rule matching | tool_name_glob, args_match, egress, sequence | Firewall rules |
| Guardrail rule | block, mask, flag, annotate, spotlight | Guardrails |
| Guardrail PII | entity names + masking tags | Guardrails |
| MCP & skills | skill risk bands, quarantine / block modes | Firewall MCP, Firewall skills |
| HTTP error bodies | guardrail_blocked, firewall_blocked, firewall_approval_pending | Error codes |
Every term here also appears in the broader Concepts glossary, which adds identity, scope, and threat terms. This page is the narrow, decision-focused slice — verdicts, actions, and masking tags only.

Why was this blocked?

Trace a single denied call back to the exact rule and verdict that stopped it.

Enforcement modes

How audit, shadow, observe, and enforce relate — and how to roll out safely.

Guardrails vs Firewall

Which plane owns which decision, and why a request can pass through both.

Dangerous tool calls

The threat the deny and sanitize verdicts exist to stop.