Verdict, action and masking glossary

When you read a firewall event or a guardrail match, the row tells you what the gateway decided — deny, sanitize, [EMAIL]. This page is the lookup table for those words: what each one means, what it does to the call, and where to go for the full mechanics. Keep it open while you author rules or triage the events feed. Two control planes produce two vocabularies. The Firewall governs tool actions and emits a verdict. Guardrails screen prompt and response text and emit an action plus, on a mask, a typed masking tag. They never share a word — a guardrail never says deny, a firewall never says mask.

This is a reference index, not a how-to. For the use-case behind each control see Guardrails vs Firewall; for the HTTP bodies see Security error codes.

1. The firewall verdict glossary

A firewall rule (or the policy’s default_verdict) resolves every tool call to exactly one of these six verdicts. The engine walks the rules in priority order, first match wins, and falls back to the default if nothing matches.

allow — let the call through

The call proceeds to the tool. Still logged as a firewall event so it shows up in Runs and the events feed. This is what you want for tools an agent is explicitly trusted to use.

audit — allow, but record it for review

Identical traffic to allow, but flagged as something you wanted to watch. It is the recommended default_verdict: observe everything, block nothing, until your rules are tuned. The balanced autonomy level ships the PII Shield guardrail as flag-only (audit), so PII is recorded without holding the call.

deny — block the call

The call never reaches the tool. On the inbound surface this returns HTTP 400 firewall_blocked; through the MCP gateway it comes back as a tool error (firewall deny: <reason>) so the model can react instead of crashing. Marked skip-retry. Costs no model tokens.

sanitize — redact the arguments, forward the cleaned call

Replaces matched substrings (secrets, PII) in the tool-call arguments with a [redacted:<preset>] token, then forwards the call with the cleaned arguments. It redacts arguments only — never the content a tool returns. On the inbound surface, where there are no call-time arguments yet, sanitize escalates to a deny.

pending_approval — hold for a human

The call is enqueued for review and the agent gets a held response carrying an approval id (HTTP 400 firewall_approval_pending). A reviewer resolves it in the console or via an HMAC webhook callback; the agent polls the id and re-submits once with a single-use approval header. See Human approval.

cap_cost — deny once the run overspends

Authored as a rule with a per-rule cents ceiling. It resolves to allow while the agent run is under budget and to deny once accumulated spend crosses the cap — so an event shows allow or deny, not the literal word cap_cost. A circuit-breaker for runaway loops.

In shadow mode, deny / sanitize / pending_approval are all downgraded to audit and the reason is prefixed [shadow] would …. The event records the verdict that would have fired, but traffic is unchanged — that is the whole point of a safe rollout.

Default verdict

default_verdict accepts only the three non-interactive verdicts:

Value	Meaning when no rule matches
`allow`	Permit uncovered tool calls silently.
`audit`	Permit but record — the default.
`deny`	Block anything no rule explicitly allows (default-deny posture).

The tight autonomy level sets default_verdict: deny; balanced and the shipped default use audit.

2. Guardrail actions

A guardrail rule fires one of five actions. They are the text-plane equivalent of verdicts — and a guardrail rule never produces a firewall verdict.

Action	What it does	Quota
`block`	Reject the whole request with HTTP 400 `guardrail_blocked`.	None — input blocks fire before metering; output blocks refund.
`mask`	Redact each match to a typed tag (see §3) and forward the cleaned text.	Normal — the call proceeds.
`flag`	Log only. Records a match; changes nothing about the traffic.	Normal.
`annotate`	Non-blocking. Attaches a human-readable note to the request (injected upstream as a security notice) without masking or blocking the text.	Normal.
`spotlight`	Non-blocking. Wraps the matched (untrusted) text in delimiters and tells the model to treat the delimited region as data, never instructions — the prompt-injection “spotlighting” defense.	Normal.

A blocked guardrail request is marked skip-retry — re-running the same prompt against another channel would just block again.

Use flag to measure a new rule against real traffic before you switch it to block or mask. The Matches feed shows what would have been caught with zero traffic impact — the guardrail counterpart to firewall shadow mode.

A single pii rule can apply different actions to different entities with entity_actions — mask emails and phones, but block on credit_card and ssn, from one rule. Keys must be an entity enabled on the rule; values must be block / mask / flag / annotate.

3. The masking tag glossary

On a mask action, every matched entity is replaced inline with a typed tag — [<UPPERCASE_ENTITY_NAME>] — so the model (input stage) or the caller (output stage) sees the shape of the data without the value. Masking runs on both stages, including streamed responses: a token-aware stream scanner masks matches that straddle chunk boundaries before they reach the client.

Entity	Tag
`email`	`[EMAIL]`
`phone`	`[PHONE]`
`credit_card`	`[CREDIT_CARD]`
`ssn`	`[SSN]`
`ip`	`[IP]`
`iban`	`[IBAN]`
`mac_address`	`[MAC_ADDRESS]`
`jwt`	`[JWT]`
`aws_access_key`	`[AWS_ACCESS_KEY]`
`api_key_openai`	`[API_KEY_OPENAI]`
`bitcoin_address`	`[BITCOIN_ADDRESS]`

Three regional identifiers ship on top of the base set:

Entity	Tag	Region
`jp_mynumber`	`[JP_MYNUMBER]`	Japan
`kr_rrn`	`[KR_RRN]`	South Korea
`cn_resident_id`	`[CN_RESIDENT_ID]`	China

Custom entities follow the same convention. A custom entity named employee_id masks to [EMPLOYEE_ID] unless you set an explicit mask_with replacement. Up to 25 custom entities per rule, each an RE2 regex with an optional luhn checksum. See PII detection.

4. One worked example

A single db.query tool call, read top to bottom, touches both vocabularies:

firewall verdict : sanitize        # secret stripped from the SQL argument
guardrail action : mask            # an email in the prompt redacted
masking tag      : [EMAIL]         # what the model actually receives

The firewall sanitize cleaned the tool arguments; the guardrail mask cleaned the prompt text; the [EMAIL] tag is what the model sees in place of the address. Same request, three different layers, three words from this glossary.

5. Posture words you’ll see alongside verdicts

These aren’t verdicts or actions, but they decide whether a verdict is enforced at all — so they show up in the same events and settings views.

Word	Plane	Meaning
Shadow mode	Firewall	Per-policy flag. Downgrades every enforcing verdict to `audit`, prefixes the reason `[shadow] would …`.
Observe mode	Firewall	Workspace setting. When no policy resolves, allows the call but logs it as a coverage gap (Discovered tools).
Enforce	Firewall	Shadow off + a policy attached: verdicts take effect.
Fail-open	Guardrails	Default for advanced rules (`llm_judge`, `grounding`, `external`) — a timeout is observed, the request continues. Flip to fail-closed per rule.
Log raw content	Guardrails	Off by default. When off, a match records that a rule fired but not the matched substring.

For the deny-vs-audit-vs-shadow distinction in depth, see Enforcement modes.

6. Where each word is defined

Surface	Vocabulary	Home page
Firewall policy	`allow` `audit` `deny` `sanitize` `pending_approval` `cap_cost`	Firewall
Firewall rule matching	`tool_name_glob`, `args_match`, egress, sequence	Firewall rules
Guardrail rule	`block` `mask` `flag` `annotate` `spotlight`	Guardrails
Guardrail PII	entity names + masking tags	Guardrails
MCP & skills	skill risk bands, `quarantine` / `block` modes	Firewall MCP, Firewall skills
HTTP error bodies	`guardrail_blocked`, `firewall_blocked`, `firewall_approval_pending`	Error codes

Every term here also appears in the broader Concepts glossary, which adds identity, scope, and threat terms. This page is the narrow, decision-focused slice — verdicts, actions, and masking tags only.

Why was this blocked?

Trace a single denied call back to the exact rule and verdict that stopped it.

Enforcement modes

How audit, shadow, observe, and enforce relate — and how to roll out safely.

Guardrails vs Firewall

Which plane owns which decision, and why a request can pass through both.

Dangerous tool calls

The threat the deny and sanitize verdicts exist to stop.

​1. The firewall verdict glossary

​Default verdict

​2. Guardrail actions

​3. The masking tag glossary

​4. One worked example

​5. Posture words you’ll see alongside verdicts

​6. Where each word is defined

​7. Related reading

Why was this blocked?

Enforcement modes

Guardrails vs Firewall

Dangerous tool calls

1. The firewall verdict glossary

Default verdict

2. Guardrail actions

3. The masking tag glossary

4. One worked example

5. Posture words you’ll see alongside verdicts

6. Where each word is defined

7. Related reading