Review guardrail matches — the Matches feed

You attached a guardrail and now you want to see what it caught. The Matches feed is OrcaRouter’s guardrail match log — every time a rule fires (block, mask, flag, annotate, or spotlight), the gateway records a match you can review in the console or pull over the API. It’s how you answer “what did the PII rule redact yesterday?”, “which key trips the secrets blocker?”, and “is this rule firing on real traffic or just noise?”. This page is the focused guide to reading and triaging matches. For how rules are authored and what each action does, see the Guardrails reference.

1. What the guardrail match log records

Every fired rule writes one match into a workspace-scoped feed (GET /api/guardrail/match, open to any Member). The feed is separate from your request log — it stores only what a guardrail did, not the full request body. Each match records:

The verdict

rule_type (keyword, regex, pii, max_chars, external, llm_judge, grounding), the effective action (block / mask / flag / annotate / spotlight), and the stage (input or output) — so you can tell instantly what fired and what it did.

Where it fired

guardrail_name, the firing rule_label, plus the request context: model_name, the token it rode in on, the caller ip, and the request_id that joins back to your request log.

A detail string

detail — the engine’s short human-readable note for the violation (e.g. which entity or pattern tripped), always recorded.

The matched substring — only when you opt in

matched is populated only when the guardrail’s Log raw content toggle is on. The toggle is off by default, so the feed tells you that a rule fired and why, but never stores the sensitive string itself.

Raw content is opt-in and non-retroactive. With Log raw content off (the default), the matched field stays empty — the feed records the verdict and detail, never the email address, secret, or PII that tripped the rule. Turn it on per guardrail only when you need the substring for triage; it applies to matches recorded after you enable it. See Logging & privacy.
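
To make the record shape concrete, here is a minimal Python sketch of one match as a client might model it. The field names come from this section, but the exact JSON keys, their types, and the `from_json` helper are assumptions for illustration, not a documented schema.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class GuardrailMatch:
    # The verdict
    rule_type: str        # e.g. "pii", "regex", "llm_judge"
    action: str           # "block", "mask", "flag", "annotate", or "spotlight"
    stage: str            # "input" or "output"
    # Where it fired
    guardrail_name: str
    rule_label: str
    model_name: str
    request_id: str       # joins back to the request log
    ip: str
    # Always recorded
    detail: str
    # Empty unless Log raw content was on when the match was recorded
    matched: str = ""

    @classmethod
    def from_json(cls, row: Mapping[str, Any]) -> "GuardrailMatch":
        """Build a match from one decoded JSON object (assumed key names)."""
        return cls(
            rule_type=row["rule_type"],
            action=row["action"],
            stage=row["stage"],
            guardrail_name=row["guardrail_name"],
            rule_label=row["rule_label"],
            model_name=row["model_name"],
            request_id=row["request_id"],
            ip=row["ip"],
            detail=row["detail"],
            matched=row.get("matched") or "",
        )

    @property
    def has_raw_content(self) -> bool:
        """True only when the matched substring was captured."""
        return bool(self.matched)
```

Treating `matched` as optional mirrors the default behavior: with raw-content logging off, a client should expect the field to be empty and rely on `detail` instead.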

2. List and filter the match log

The default list view is cursor-paginated, newest-first, and scoped to your workspace. Narrow it with query params — the console exposes these as filter chips:

Param	Filters by
`guardrail_id`, `rule_type`, `action`, `stage`	The verdict
`token_id`, `model_name`, `request_id`	The request context
`days` / `start_at` + `end_at`, `hide_fp`	Window and false-positive state

A typical “show me everything the secrets guardrail blocked this week” read, using your console session token:

curl "https://api.orcarouter.ai/api/guardrail/match?guardrail_id=42&action=block&days=7" \
  -H "Authorization: Bearer <your-session-token>" \
  -H "X-Workspace-Id: <workspace-id>"

Management routes like /api/guardrail/* authenticate with your console session / access token, not a relay key. The sk-orca-... keys are only for /v1/* model calls. In day-to-day use you’ll read the feed straight from the Matches tab on the Guardrails page.
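
The same read can be scripted. The sketch below builds the filtered URL from the params in the table above and walks the cursor-paginated list. The filter names are taken from this page; the `cursor` query param, the `items` and `next_cursor` response keys, and the injected `fetch_page` callable are hypothetical stand-ins, since this page does not spell out the pagination contract.

```python
from typing import Any, Callable, Dict, Iterator
from urllib.parse import urlencode

BASE_URL = "https://api.orcarouter.ai/api/guardrail/match"

# Filter names taken from the table in this section.
KNOWN_FILTERS = {
    "guardrail_id", "rule_type", "action", "stage",
    "token_id", "model_name", "request_id",
    "days", "start_at", "end_at", "hide_fp",
}


def build_match_url(**filters: Any) -> str:
    """Return the list URL with the given filters as query params."""
    unknown = set(filters) - KNOWN_FILTERS
    if unknown:
        raise ValueError(f"unknown filter(s): {sorted(unknown)}")
    params: Dict[str, Any] = {}
    for key, value in filters.items():
        if value is None:
            continue  # skip unset filters
        # Booleans become "true"/"false", as in hide_fp=true
        params[key] = str(value).lower() if isinstance(value, bool) else value
    return f"{BASE_URL}?{urlencode(params)}" if params else BASE_URL


def iter_matches(fetch_page: Callable[[str], Dict[str, Any]],
                 **filters: Any) -> Iterator[Any]:
    """Yield matches page by page.

    fetch_page(url) must return the decoded JSON body. The "cursor" param
    and the "items" / "next_cursor" keys are assumptions for illustration.
    """
    cursor = None
    while True:
        url = build_match_url(**filters)
        if cursor:
            sep = "&" if "?" in url else "?"
            url = f"{url}{sep}{urlencode({'cursor': cursor})}"
        page = fetch_page(url)
        yield from page.get("items", [])
        cursor = page.get("next_cursor")
        if not cursor:
            break
```

Passing `fetch_page` in keeps the authentication headers (session token and `X-Workspace-Id`) in one place and lets you test the paging logic without touching the network.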

3. Group by request

A single request can trip several rules at once — an input PII mask and a max-length cap, say. The grouped view (GET /api/guardrail/match/grouped, Member) collapses matches by request_id so you see one row per offending request with its matches folded inline, instead of scrolling past five rows for the same call. Tune how many matches show inline per group with inline_limit (default 5).
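
As a rough mental model of what the grouped view returns, the following sketch performs the same collapse client-side over a flat list of matches. It is an illustration only: the output keys `total` and `inline` are hypothetical, and the real endpoint does this server-side.

```python
from typing import Any, Dict, Iterable, List


def group_by_request(matches: Iterable[Dict[str, Any]],
                     inline_limit: int = 5) -> List[Dict[str, Any]]:
    """Collapse matches into one row per request_id.

    Groups keep the order in which each request_id first appears, and each
    group shows at most inline_limit matches inline.
    """
    groups: Dict[str, List[Dict[str, Any]]] = {}
    for match in matches:
        groups.setdefault(match["request_id"], []).append(match)
    return [
        {
            "request_id": request_id,
            "total": len(rows),             # all matches for this request
            "inline": rows[:inline_limit],  # the ones folded into the row
        }
        for request_id, rows in groups.items()
    ]
```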

4. Stats and the trend strip

The stats endpoint (GET /api/guardrail/match/stats, Member) powers the count strip and chart on the Matches tab — totals over a days window, optionally broken down with group_by:

`group_by`	Breakdown
(omitted)	Totals only
`rule_type`	Which rule types fire most
`guardrail_id`	Which guardrail accounts for the activity

Pass request_id to get a constant-time match count for one request (used by the request-log cross-link). The stats endpoint is where per-guardrail usage, action mix, and false-positive rate live, so slice it rather than paging the raw list.
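
To show what a `group_by` breakdown and a false-positive rate amount to, here is a small local tally over already-fetched matches. It is a sketch of the arithmetic, not the endpoint's implementation; the `false_positive_at` field name is an assumption standing in for the false-positive timestamp.

```python
from collections import Counter
from typing import Any, Dict, List, Optional

ALLOWED_GROUP_BY = (None, "rule_type", "guardrail_id")


def breakdown(matches: List[Dict[str, Any]],
              group_by: Optional[str] = None) -> Dict[str, Any]:
    """Return totals, plus per-group counts when group_by is set."""
    if group_by not in ALLOWED_GROUP_BY:
        raise ValueError(f"unsupported group_by: {group_by!r}")
    result: Dict[str, Any] = {"total": len(matches)}
    if group_by is not None:
        result["groups"] = dict(Counter(m[group_by] for m in matches))
    return result


def false_positive_rate(matches: List[Dict[str, Any]],
                        fp_field: str = "false_positive_at") -> float:
    """Share of matches that carry a false-positive timestamp."""
    if not matches:
        return 0.0
    marked = sum(1 for m in matches if m.get(fp_field))
    return marked / len(matches)
```

For anything beyond a quick spot check, prefer the stats endpoint itself: it aggregates over the whole window without paging every match to the client.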

5. Export for an audit trail

When you need matches outside the console — an evidence pack, a spreadsheet, a downstream SIEM — GET /api/guardrail/match/export (Member) streams your current filter set as CSV or JSON:

curl "https://api.orcarouter.ai/api/guardrail/match/export?format=csv&guardrail_id=42&days=30" \
  -H "Authorization: Bearer <your-session-token>" \
  -H "X-Workspace-Id: <workspace-id>" \
  -o guardrail-matches.csv

The export carries the same columns the feed records — time, guardrail, rule type and label, stage, action, model, token, detail, the matched substring (only if raw-content capture was on at record time), request id, ip, and the false-positive timestamp.

The CSV is formula-injection-safe: any cell that would otherwise be read as a spreadsheet formula is neutralized, so opening an export in Excel or Sheets can’t execute a payload smuggled through a matched substring.
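
For readers building their own exports from the JSON format, the sketch below shows one common way to neutralize formula cells: prefix any value that begins with a formula trigger character with a single quote. This follows widely used CSV-injection guidance and is an assumption about the general technique, not a description of how OrcaRouter's exporter is implemented.

```python
import csv
import io
from typing import Any, Dict, Iterable, List

# Leading characters that spreadsheets may interpret as the start of a formula.
FORMULA_PREFIXES = ("=", "+", "-", "@", "\t", "\r")


def neutralize_cell(value: Any) -> str:
    """Return the value as text, prefixed with ' if it could run as a formula."""
    text = "" if value is None else str(value)
    if text.startswith(FORMULA_PREFIXES):
        return "'" + text
    return text


def rows_to_csv(rows: Iterable[Dict[str, Any]], columns: List[str]) -> str:
    """Write rows as CSV with every cell neutralized."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(columns)
    for row in rows:
        writer.writerow([neutralize_cell(row.get(col)) for col in columns])
    return buffer.getvalue()
```

One trade-off of this approach: legitimate values that start with `-` or `+`, such as negative numbers, are also prefixed, so downstream tooling that parses the CSV should strip the leading quote where appropriate.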

6. Triage false positives

Not every match is a real hit. When a rule fires on benign traffic, a workspace Admin can mark the match as a false positive (POST /api/guardrail/match/:id/mark-fp); the inverse DELETE /api/guardrail/match/:id/mark-fp un-marks it. Marking is Admin-only even though the rest of the feed is Member-readable — triage is a privileged action. Marking a false positive does two things: it tags the match (so hide_fp=true filters it out of the feed) and remembers the finding so the same rule on the same content is skipped on future requests. Un-mark to restore enforcement. For the broader workflow of tuning a noisy rule, see Tune false positives.

A match is diagnostic data, not an enforcement decision. Whether a request was blocked, masked, or merely flagged is already settled by the action at request time — the feed is the record after the fact. Marking a false positive changes future behavior, never the call that already happened.
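
A minimal sketch of the mark and un-mark calls using Python's standard library. The route and headers follow the examples on this page; the `mark_fp_request` helper name is hypothetical, and whether the POST expects a request body is not stated here, so none is sent.

```python
import urllib.request

API_ROOT = "https://api.orcarouter.ai"


def mark_fp_request(match_id: int, session_token: str, workspace_id: str,
                    unmark: bool = False) -> urllib.request.Request:
    """Build the request that marks (POST) or un-marks (DELETE) a match."""
    return urllib.request.Request(
        url=f"{API_ROOT}/api/guardrail/match/{match_id}/mark-fp",
        method="DELETE" if unmark else "POST",
        headers={
            # Console session / access token, not an sk-orca-... relay key
            "Authorization": f"Bearer {session_token}",
            "X-Workspace-Id": workspace_id,
        },
    )
```

Send the result with `urllib.request.urlopen(request)`. The caller needs the workspace Admin role; a Member token can read the feed but not mark matches.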

7. Where matches come from

Matches are produced by the guardrail engine on the relay path, so the feed reflects exactly what your attached policies did:

Input-stage matches record what the gateway screened before the model saw it — see Input stage.
Output-stage matches record what it screened on the response — see Output stage.
A blocked request also surfaces as an HTTP 400 guardrail_blocked to the caller; the match is the server-side record of it.

If no guardrail resolves on a request, nothing is screened and nothing lands in the feed — behavior is identical to a workspace that never enabled the feature. See Attach to a key and Account default for how a policy gets in front of traffic in the first place.
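
On the caller side, a blocked request can be told apart from other 400s by checking for the guardrail_blocked identifier mentioned above. The sketch below assumes only that the identifier appears somewhere in the error body; the exact layout of the error payload is not specified on this page, and `is_guardrail_block` is a hypothetical helper.

```python
def is_guardrail_block(status: int, body: str) -> bool:
    """True if an HTTP response looks like a guardrail block.

    Assumption: the identifier "guardrail_blocked" appears in the raw
    error body of the HTTP 400 response.
    """
    if status != 400:
        return False
    return "guardrail_blocked" in body
```

When this returns true, the matching server-side record can be looked up in the feed by filtering on the request's `request_id`.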

8. Related

Guardrails reference

The full engine: rule types, stages, actions, presets, eval harness.

Logging & privacy

The Log raw content toggle and what the feed does — and doesn’t — store.

Tune false positives

Use the feed to find and quiet noisy rules without weakening the policy.

Versioning

Diff and revert a guardrail when the feed shows a change misfired.

For the bigger picture of how the gateway inspects traffic, see How OrcaRouter inspects and Guardrails vs firewall.