sk-… key dropped into a command field,
a customer SSN pasted into a body, an internal token in a request
header. The firewall sanitize verdict catches that material in the
tool-call arguments, replaces it with a typed redaction token, and
forwards the cleaned call — so the action still runs, but the secret
never leaves the gateway.
This is one verdict in the firewall’s matching language. For the full set
see Verdicts and the
rule reference; this page is the focused guide
to authoring and reasoning about sanitize.
1. What sanitize does — and what it never touches
A rule with verdict: sanitize runs a redaction engine over the
tool-call arguments before the call is dispatched. Each match is
replaced with a canonical token and the call proceeds with the cleaned
arguments — the tool still executes, just without the raw secret in it.
Redacts
The JSON arguments of a model-emitted
tool_call or an MCP
tools/call — command, body, headers, any string field where a
secret or PII landed.
Never redacts
The content a tool returns, the prompt, the model’s response text.
Sanitize is an argument-only redactor. Text screening is a
Guardrail concern.
A preset match becomes [redacted:<preset>] (e.g. [redacted:openai_key]), and a
custom-pattern match becomes [redacted:custom]. The shape of the
argument is preserved: only the sensitive substring changes, so a tool
that expects valid JSON still receives valid JSON.
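A minimal sketch of what shape-preserving redaction could look like, assuming the arguments arrive as parsed JSON. The `OPENAI_KEY` regex and the `redact_strings` helper are hypothetical stand-ins for illustration, not the gateway's actual engine:

```python
import json
import re

# Hypothetical detector: "sk-" followed by at least 32 key characters.
OPENAI_KEY = re.compile(r"sk-[A-Za-z0-9]{32,}")

def redact_strings(value):
    """Redact string leaves only; keys, numbers and nesting are left as they are."""
    if isinstance(value, str):
        return OPENAI_KEY.sub("[redacted:openai_key]", value)
    if isinstance(value, list):
        return [redact_strings(item) for item in value]
    if isinstance(value, dict):
        return {key: redact_strings(item) for key, item in value.items()}
    return value  # numbers, booleans and None pass through unchanged

secret = "sk-" + "a" * 32
raw = json.dumps({
    "command": 'curl -H "x-key: ' + secret + '" https://example.test',
    "retries": 3,
})

cleaned = redact_strings(json.loads(raw))
cleaned_json = json.dumps(cleaned)  # still serializes as valid JSON
print(cleaned["command"])
# curl -H "x-key: [redacted:openai_key]" https://example.test
```

Because only the matched substring inside a string value is swapped, the field names, the `retries` number, and the overall object structure reach the tool unchanged.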
2. The built-in detector presets
A sanitize rule names one or more presets (well-known secret/PII shapes) and/or custom regex patterns. The built-in presets:

| Preset | Catches |
|---|---|
| aws_access_key | AWS access key id (AKIA… / ASIA… + 16 chars) |
| aws_secret_key | A 40-char AWS secret (boundary-aware) |
| openai_key | sk- + ≥32 chars |
| anthropic_key | sk-ant- + ≥40 chars |
| bearer_token | Bearer + a ≥16-char token (keyword kept) |
| email | An email address |
| ssn_us | A US SSN in 3-2-4 form |
| credit_card | A 13–19 digit run that passes a Luhn check |
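The Luhn check named in the credit_card row is a standard checksum. The sketch below shows how a 13–19 digit run could be gated on it before redaction; `CARD_RUN`, `luhn_valid`, and `redact_cards` are illustrative names, and the real detector's boundary handling may differ:

```python
import re

# Assumed shape: a run of 13 to 19 digits not touching other digits.
CARD_RUN = re.compile(r"(?<!\d)\d{13,19}(?!\d)")

def luhn_valid(digits: str) -> bool:
    """Return True if a string of decimal digits passes the Luhn checksum."""
    total = 0
    # Walk right to left, doubling every second digit.
    for index, char in enumerate(reversed(digits)):
        digit = int(char)
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0

def redact_cards(text: str) -> str:
    """Replace only the digit runs that are Luhn-valid; leave the rest alone."""
    def replace(match: re.Match) -> str:
        run = match.group()
        return "[redacted:credit_card]" if luhn_valid(run) else run
    return CARD_RUN.sub(replace, text)

print(redact_cards("card 4111111111111111 order 1234567890123456"))
# card [redacted:credit_card] order 1234567890123456
```

The second number has the right length but fails the checksum, so it is left in place.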
A sanitize rule must declare at least one preset or custom pattern;
an empty sanitizer is rejected when you save the rule. A credit_card
match is additionally Luhn-checked, so a same-length number that isn’t a
valid card is left untouched rather than over-redacted.
3. One concrete example
Author this in the console rule editor. The example redacts an OpenAI key and any email from the arguments of any http.* tool call your agent
emits, then forwards the cleaned call:
The forwarded arguments then read key=[redacted:openai_key] user=[redacted:email]: the request still
runs, but the secret and the address never leave the gateway.
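To make the effect concrete, here is a rough simulation of that behavior in Python. It is not the console rule syntax: the glob-style tool match, the two regexes, and the `sanitize_call` helper are all assumptions made for illustration.

```python
import re
from fnmatch import fnmatchcase

# Assumed regexes; the product's real detectors may be stricter.
PRESETS = {
    "openai_key": re.compile(r"sk-[A-Za-z0-9_-]{32,}"),
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
}

def sanitize_call(tool_name, arguments, tool_glob="http.*",
                  presets=("openai_key", "email")):
    """Return a cleaned copy of the arguments if the tool name matches the glob."""
    if not fnmatchcase(tool_name, tool_glob):
        return arguments  # rule does not apply to this tool
    cleaned = {}
    for field, value in arguments.items():
        if isinstance(value, str):
            for name in presets:
                value = PRESETS[name].sub(f"[redacted:{name}]", value)
        cleaned[field] = value
    return cleaned

key = "sk-" + "A1b2" * 8  # 32 characters after the prefix
call_args = {
    "url": "https://api.example.test/v1",
    "body": "key=" + key + " user=jane@example.com",
}
print(sanitize_call("http.post", call_args)["body"])
# key=[redacted:openai_key] user=[redacted:email]
```

A call to a tool outside `http.*` (for example a hypothetical `shell.exec`) is returned untouched by this sketch, since the rule would not match it.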
4. On the inbound surface, sanitize escalates to deny
The inbound surface evaluates the tools an
agent advertises on a request — tool definitions, which carry no
call-time arguments yet. There is nothing to redact there, so a
sanitize verdict on the inbound surface escalates to a deny
(fail-closed): the request is blocked with firewall_blocked rather than
forwarded unredacted.
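The escalation rule can be expressed as a small decision function. This is a sketch of the logic only: `effective_verdict` is a hypothetical name, and `"tool_call"` is a placeholder label for the surface where arguments exist, not a documented identifier.

```python
def effective_verdict(surface: str, verdict: str) -> str:
    """Resolve the verdict that is actually enforced on a given surface."""
    # Inbound carries tool definitions only, so there are no arguments to
    # redact; fail closed instead of forwarding the request unredacted.
    # (Per the text above, the blocked request surfaces as firewall_blocked.)
    if surface == "inbound" and verdict == "sanitize":
        return "deny"
    return verdict

print(effective_verdict("inbound", "sanitize"))    # deny
print(effective_verdict("tool_call", "sanitize"))  # sanitize
```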
5. Sanitize vs. the other ways to handle a secret
Three layers can act on a secret an agent is about to leak. Pick by what and where:
Sanitize (firewall) — redact the tool-call arguments
Strips the secret out of a tool call’s arguments and still runs
the call. Use it when the action is legitimate but the agent put
something sensitive in a field. Argument-layer only.
Deny (firewall) — block the whole call
Stops the call entirely. Use it when the action itself is dangerous,
not just an argument. This is also what sanitize becomes on the
inbound surface. See block tools.
Guardrails — screen prompt / response text
The Secrets Blocker and PII guardrails
screen the text of a request or response (including, on the output
stage, model-generated content). That is the layer for “what a tool or
model returns” — the thing sanitize does not do.
6. Where sanitize shows up in your trail
Like every verdict, a sanitize evaluation is recorded as a firewall event, filterable by verdict, surface, tool, and run in the events log and rolled up in analytics. In shadow mode a sanitize verdict (like every enforcing verdict) is downgraded to audit and the reason is
prefixed [shadow] would …, so you can measure impact before any
argument is actually rewritten.
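The shadow-mode downgrade can be pictured as follows. This is a sketch under stated assumptions: `apply_shadow` is a hypothetical helper, the `ENFORCING` set lists only sanitize (named above) and deny (assumed), and the wording that follows "[shadow] would " is supplied by the caller because the source does not spell it out.

```python
# Assumed subset of enforcing verdicts; the source only names sanitize explicitly.
ENFORCING = {"deny", "sanitize"}

def apply_shadow(verdict: str, reason: str, shadow: bool):
    """Downgrade an enforcing verdict to audit when the rule runs in shadow mode."""
    if shadow and verdict in ENFORCING:
        return "audit", "[shadow] would " + reason
    return verdict, reason

print(apply_shadow("sanitize", "redact openai_key in body", True))
# ('audit', '[shadow] would redact openai_key in body')
```

With `shadow=False`, or with a non-enforcing verdict such as allow, the sketch returns the verdict and reason unchanged.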
Where to go next
All verdicts
allow, audit, deny, sanitize, pending_approval, cap_cost.
Validate arguments
Match a call by what’s in its arguments — the JSONPath clause grammar.
Block tools
When the action itself is the problem, deny the whole call.
Firewall + Guardrails
Screen the text a tool or model returns — the layer sanitize doesn’t cover.
