Guardrail API reference

When you want to manage guardrails as code (create a PII policy in CI, diff two versions before a release, or pull the matches feed into your own dashboard), you talk to the /api/guardrail/* routes. This page is the route-by-route Guardrail API reference: every endpoint, the workspace role it requires, and the auth it expects. For what a guardrail is and how rules compose, read Guardrails; this page is its wire-level companion.

1. Auth and scope

Every /api/guardrail/* route is management plane: it authenticates with your console session or access token (the same identity you log into the console with), not a relay key.

Your sk-orca-... relay key authenticates /v1/* model calls only; it does not work on /api/guardrail/*. Configuration routes use your user session or access token, scoped to the active workspace.

Routes are workspace-scoped — they only ever see the active workspace’s guardrails. Nothing crosses a tenant boundary.
Every route is RBAC-gated by your workspace role (Viewer / Developer / Admin / Owner). Reads are open to Viewer+; the sandbox and all writes require Developer+; the false-positive endpoints require Admin or Owner.

Most teams never call these routes directly — the console drives all of them. Reach for the REST surface when you want guardrails in source control, in CI, or wired into your own tooling.

2. One concrete call — list your guardrails

A read is the simplest place to start. Authenticated as any workspace member (Viewer+):

curl https://api.orcarouter.ai/api/guardrail/ \
  -H "Authorization: Bearer <your-access-token>" \
  -H "X-Workspace-Id: <workspace-id>"

You get back the workspace’s guardrails with their attached-key counts. To screen your first request end-to-end instead — create a policy, attach a key, send a call — follow the 5-step quickstart, which does it all from the console.
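
If you script several reads, the call above can be wrapped in a small helper. This is a minimal sketch, not an official client: the function names, the DRY_RUN switch, and the GUARDRAIL_TOKEN / GUARDRAIL_WORKSPACE environment variable names are assumptions made for illustration. Only the base URL and the two headers come from the example above.

```shell
# Hypothetical helper around the read routes. The environment variable
# names below are assumptions, not names the API defines.
API_BASE="https://api.orcarouter.ai/api/guardrail"

# Build the URL for a path below /api/guardrail/, e.g. "" (list) or "meta".
guardrail_url() {
  printf '%s/%s\n' "$API_BASE" "$1"
}

# Send a GET to a guardrail route, or print the request when DRY_RUN=1.
guardrail_get() {
  url=$(guardrail_url "$1")
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Show what would be sent, with the token elided.
    printf 'curl -fsS %s -H "Authorization: Bearer ***" -H "X-Workspace-Id: %s"\n' \
      "$url" "${GUARDRAIL_WORKSPACE:-}"
    return 0
  fi
  # -f makes curl exit non-zero on HTTP errors; -sS hides progress but keeps error messages.
  curl -fsS "$url" \
    -H "Authorization: Bearer ${GUARDRAIL_TOKEN:?set GUARDRAIL_TOKEN}" \
    -H "X-Workspace-Id: ${GUARDRAIL_WORKSPACE:?set GUARDRAIL_WORKSPACE}"
}
```

Running `DRY_RUN=1 guardrail_get meta` prints the request without sending it, which is handy for checking a CI job before giving it a real token.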

3. The role model in one table

The action you take, not the route, picks the tier.

Action	Minimum role
Read (list/get, history, matches, eval runs), run an eval	Viewer+
Run sandbox test	Developer+
Create, update, delete, revert, upload/delete corpus	Developer+
Mark / un-mark a match false positive	Admin

Reads map to the guardrails:read permission (held by Viewer and up); writes map to guardrails:write (Developer and up). Marking a false positive additionally requires the Admin role. See Scope, keys & policies for how roles and permissions combine.
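
The table above can be mirrored in a local pre-flight check, so a script fails early instead of waiting for the server to reject a call. This is a sketch under assumptions: the action labels (read, eval-run, sandbox, write, mark-fp) are hypothetical shorthand for the table rows, not identifiers the API uses, and the server remains the authority on what a role may do.

```shell
# Rank the four workspace roles named on this page; unknown roles rank 0.
role_rank() {
  case "$1" in
    Viewer) echo 1 ;;
    Developer) echo 2 ;;
    Admin) echo 3 ;;
    Owner) echo 4 ;;
    *) echo 0 ;;
  esac
}

# Minimum role for an action. The labels are hypothetical shorthand
# for the rows of the role table.
min_role_for() {
  case "$1" in
    read|eval-run) echo Viewer ;;
    sandbox|write) echo Developer ;;
    mark-fp) echo Admin ;;
    *) return 1 ;;
  esac
}

# Exit status 0 if role $1 may perform action $2, 1 if not, 2 for an unknown action.
role_allows() {
  need=$(min_role_for "$2") || return 2
  [ "$(role_rank "$1")" -ge "$(role_rank "$need")" ]
}
```

For example, `role_allows Developer mark-fp` exits non-zero, while `role_allows Owner mark-fp` exits zero.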

4. Routes by area

Guardrails (CRUD + sandbox)

Method & path	Role
`GET /api/guardrail/`	Viewer+
`GET /api/guardrail/meta`	Viewer+
`GET /api/guardrail/my-permissions`	any member
`GET /api/guardrail/:id`	Viewer+
`GET /api/guardrail/:id/tokens`	Viewer+
`POST /api/guardrail/test`	Developer+
`POST /api/guardrail/`	Developer+
`PUT /api/guardrail/`	Developer+
`DELETE /api/guardrail/:id`	Developer+

meta returns the engine vocabulary — rule types, stages, actions, PII entities, presets, and preset categories — so a tool can build a form without hard-coding the enums. test runs the current policy over sample text in a sandbox: nothing is persisted, nothing goes upstream.

Version history

Method & path	Role
`GET /api/guardrail/:id/history`	Viewer+
`GET /api/guardrail/:id/history/diff`	Viewer+
`GET /api/guardrail/:id/history/:version`	Viewer+
`POST /api/guardrail/:id/revert`	Developer+

Every create, update, and delete writes a history row in the same transaction. revert copies an old version forward as a new version — history is never mutated.
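
The append-only behavior of revert can be illustrated with a purely local model. This is not how the service stores history; the file layout and the function names are assumptions made for illustration, and the model only handles policy text without tabs or newlines.

```shell
# Local model only: each line of the history file is "<version><TAB><policy text>".
HISTORY_FILE=$(mktemp)

# Append a new version holding the given policy text; print its version number.
history_append() {
  next=$(( $(wc -l < "$HISTORY_FILE") + 1 ))
  printf '%s\t%s\n' "$next" "$1" >> "$HISTORY_FILE"
  echo "$next"
}

# Print the policy text stored at version $1.
history_get() {
  awk -F '\t' -v v="$1" '$1 == v { print $2 }' "$HISTORY_FILE"
}

# Revert copies version $1 forward as a brand-new version; existing rows are untouched.
history_revert() {
  old=$(history_get "$1")
  [ -n "$old" ] || return 1
  history_append "$old"
}
```

After two appends, `history_revert 1` creates version 3 with the same text as version 1, and versions 1 and 2 remain readable.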

Eval & corpora

Method & path	Role
`POST /api/guardrail/:id/eval`	Viewer+
`GET /api/guardrail/:id/eval/runs`	Viewer+
`GET /api/guardrail/eval/runs/:run_id`	Viewer+
`GET /api/guardrail/eval/corpora`	Viewer+
`POST /api/guardrail/eval/corpora`	Developer+
`GET /api/guardrail/eval/corpora/:id`	Viewer+
`DELETE /api/guardrail/eval/corpora/:id`	Developer+

Run a guardrail against a bundled red-team corpus or a custom JSONL set you upload, then read the per-sample failures. Useful for tuning a judge rubric or proving a policy catches known attacks before you ship.

Matches feed

Method & path	Role
`GET /api/guardrail/match`	any member
`GET /api/guardrail/match/grouped`	any member
`GET /api/guardrail/match/stats`	any member
`GET /api/guardrail/match/export`	any member
`GET /api/guardrail/match/cap-status`	any member
`GET /api/guardrail/match/:id`	any member
`POST /api/guardrail/match/:id/mark-fp`	Admin
`DELETE /api/guardrail/match/:id/mark-fp`	Admin

A match records the rule type, action, stage, and a detail string — plus the matched substring only if “Log raw content” is on for that guardrail (off by default). The read routes carry no extra permission middleware, so any active workspace member can reach them; the two mark-fp routes are Admin-only (Admin or Owner) and rate-limited.

5. Resolution: which guardrail applies

The routes above manage policies; resolution decides which one runs on a given relay call. It is an explicit-or-default model with no silent fallback on an explicit attachment:

Explicit guardrail_id on the key → that guardrail applies, if it exists and is enabled. A disabled attachment is the off switch — it does not fall back.
No attachment → the workspace’s enabled default guardrail (is_default).
Neither → no enforcement. The request is byte-identical to a workspace that never enabled the feature.

This is the one behavior that differs from the Firewall: a disabled attached firewall policy falls back to the workspace default, whereas a disabled attached guardrail resolves to none. See Guardrails vs Firewall.

Set guardrail_id on a key through the key editor or the token API; 0/null means “no explicit attachment.”
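
The resolution rules in this section can be sketched as a small decision function. It is a local illustration, not the relay's code: the function name, its argument layout, and the "enabled" / "disabled" / "missing" state labels are assumptions. Treating an attached guardrail that no longer exists as "none" is also an assumption, based on the no-fallback rule for explicit attachments.

```shell
# True when an id means "no explicit attachment" (empty, 0, or null).
is_unset() {
  [ -z "$1" ] || [ "$1" = "0" ] || [ "$1" = "null" ]
}

# resolve_guardrail ATTACHED_ID ATTACHED_STATE DEFAULT_ID DEFAULT_ENABLED
#   ATTACHED_ID      guardrail_id on the key
#   ATTACHED_STATE   "enabled", "disabled", or "missing" (hypothetical labels)
#   DEFAULT_ID       id of the workspace default (is_default), if any
#   DEFAULT_ENABLED  "1" when that default is enabled
# Prints the id of the guardrail that applies, or "none".
resolve_guardrail() {
  attached_id=$1 attached_state=$2 default_id=$3 default_enabled=$4
  if ! is_unset "$attached_id"; then
    # Explicit attachment: apply it only when enabled, and never fall back.
    if [ "$attached_state" = "enabled" ]; then
      echo "$attached_id"
    else
      echo none
    fi
    return 0
  fi
  # No attachment: use the workspace default if one exists and is enabled.
  if ! is_unset "$default_id" && [ "$default_enabled" = "1" ]; then
    echo "$default_id"
    return 0
  fi
  echo none
}
```

The second case in the tests is the one that differs from the Firewall: a disabled attachment resolves to none even though an enabled default exists.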

6. What a block returns

When a block-action rule fires, the relay call (/v1/*, on your relay key) — not these management routes — returns:

Field	Value
HTTP status	`400`
Error code	`guardrail_blocked`
Quota cost	an input-stage block fires before pre-consume, so no quota is charged
Retry	marked skip-retry

The message names the guardrail and the rule that fired. For the full code catalog and triage paths, see Error codes and Why was my request blocked?.
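
On the client side, the skip-retry marking suggests a retry wrapper should treat a guardrail block as final. The sketch below assumes you have already extracted the HTTP status and the error code from the relay response (the response body layout is not specified on this page), and the choice to retry on 429 and 5xx is a common client convention assumed here, not something this page prescribes.

```shell
# should_retry HTTP_STATUS ERROR_CODE
# Exit status 0 means "worth retrying"; non-zero means "do not retry".
should_retry() {
  status=$1 code=$2
  # A guardrail block is marked skip-retry, so never resend it unchanged.
  if [ "$status" = "400" ] && [ "$code" = "guardrail_blocked" ]; then
    return 1
  fi
  case "$status" in
    429|5??) return 0 ;;  # assumption: rate limits and server errors are retryable
    *) return 1 ;;        # other responses are treated as final
  esac
}
```

A caller would use it as `if should_retry "$status" "$code"; then ...; fi` and surface the block message to the user otherwise.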

7. Actions, stages, and rule types at a glance

The meta route returns these as enums; here they are for quick reference.

Actions: block (reject, HTTP 400), mask (redact the match, forward the cleaned text), flag (log only — observe without changing traffic), annotate (non-blocking — record the match and inject a human-readable note upstream so the model is told about it before it answers), spotlight (non-blocking — wrap the matched untrusted span in delimiters and tell the model to treat it as data, not instructions; a prompt-injection defense, input-stage in practice).
Stages: input (the request), output (the model’s response), or both.
Rule types: keyword, regex, pii, max_chars, external, llm_judge, grounding.

Output-stage rules are enforced on both streaming and non-streaming responses: a block cuts the stream, and a mask rewrites matched spans in band as chunks flow. On a stream, already-flushed bytes cannot be retracted, so a match is only acted on once enough of it has buffered — for the strongest guarantee, scan on the input stage, which sanitizes the request before the model ever sees it. Prove your exact stage/stream combination in the sandbox first.

For per-entity PII overrides, custom entities, the LLM judge, and contextual grounding fields, see the deep reference in Guardrails.

8. Related references

Firewall API

The action-plane peer — /api/workspace/firewall/* and the gateway routes.

Compliance API

/api/compliance/* — packs, signed reports, residency, readiness.

Error codes

Every *_blocked code, its HTTP status, and what to do about it.

Guardrails (deep dive)

Rule types, PII entities, presets, evals, and the matches feed in full.
