Guardrail versioning: history, diff, revert

You shipped a tighter pii-shield policy on Monday, a teammate widened a regex on Wednesday, and now real traffic is throwing false positives. You need to see what changed and who changed it, then roll back, without guessing at the previous JSON or redeploying anything. That is what guardrail versioning gives you: a history row per change, a diff between any two versions, and one-click revert. This page covers only the versioning surface. For the guardrail engine itself (rule types, stages, actions), start at the Guardrails overview or the full Guardrails reference.

1. What guardrail versioning records

Every mutation on a guardrail — create, update, delete, and revert — writes an append-only history row in the same transaction as the change. The row captures a snapshot of the user-visible config at that moment:

the guardrail name,
whether it was enabled,
whether it was the workspace default,
the full rules body.

Each row carries a monotonic version number (starting at 1), the operation that produced it, the author, and a timestamp. Because the row is written transactionally with the edit, the history can never drift out of sync with the live policy — if the edit commits, so does its history row.

History is append-only. A revert does not rewind or rewrite past rows; it appends a new version (see §4). You always see the complete sequence of who did what, in order.
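To make the "same transaction" guarantee concrete, here is a minimal sketch using an in-memory SQLite database. The table and column names, and the `fail_after_edit` switch, are assumptions made up for illustration; the source does not describe the actual storage layer.

```python
import json
import sqlite3

# Hypothetical schema: one live row per guardrail, plus append-only history rows.
SCHEMA = """
CREATE TABLE guardrail (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    enabled INTEGER NOT NULL,
    is_default INTEGER NOT NULL,
    rules TEXT NOT NULL
);
CREATE TABLE guardrail_history (
    guardrail_id INTEGER NOT NULL,
    version INTEGER NOT NULL,
    operation TEXT NOT NULL,
    author TEXT NOT NULL,
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    name TEXT NOT NULL,
    enabled INTEGER NOT NULL,
    is_default INTEGER NOT NULL,
    rules TEXT NOT NULL,
    PRIMARY KEY (guardrail_id, version)
);
"""


def open_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def _write_history(conn, guardrail_id, operation, author):
    # Snapshot the live row under the next version number (1 for the first row).
    conn.execute(
        """
        INSERT INTO guardrail_history
            (guardrail_id, version, operation, author,
             name, enabled, is_default, rules)
        SELECT id,
               (SELECT COALESCE(MAX(version), 0) + 1
                FROM guardrail_history WHERE guardrail_id = ?),
               ?, ?, name, enabled, is_default, rules
        FROM guardrail WHERE id = ?
        """,
        (guardrail_id, operation, author, guardrail_id),
    )


def create_guardrail(conn, name, rules, author):
    with conn:  # commits on success, rolls back if an exception escapes
        cur = conn.execute(
            "INSERT INTO guardrail (name, enabled, is_default, rules) "
            "VALUES (?, 1, 0, ?)",
            (name, json.dumps(rules)),
        )
        _write_history(conn, cur.lastrowid, "create", author)
    return cur.lastrowid


def update_rules(conn, guardrail_id, rules, author, fail_after_edit=False):
    with conn:  # the edit and its history row succeed or fail together
        conn.execute(
            "UPDATE guardrail SET rules = ? WHERE id = ?",
            (json.dumps(rules), guardrail_id),
        )
        if fail_after_edit:
            raise RuntimeError("simulated failure before the history write")
        _write_history(conn, guardrail_id, "update", author)
```

If the simulated failure fires, the `UPDATE` is rolled back along with the missing history row, so the live policy and the trail stay in step.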

2. One concrete example — find the bad edit and roll it back

Say guardrail 42 has drifted. You do all of this from the console in your own session; the sk-orca-... relay key is only for /v1/* calls, never for reading or changing policy.

List the history

Open History on the guardrail row in /console/guardrails. The feed is newest-first. You see v5 update (Wednesday, by a teammate), v4 update (Monday, by you), v3 update, and so on back to v1 create. Reading history is open to any workspace Member.

Diff the suspect change

Pick the two versions that bracket the regression — v4 and v5 — and view the diff. The rules body is shown side by side, so the widened regex jumps out as the line that changed.

Revert

Restore v4. The live guardrail’s name, enabled flag, default flag, and rules are set back to that snapshot, and a fresh v6 revert row is appended. The change is live on the next request — no redeploy, no SDK change. Reverting requires the Developer+ role.

The same flow over the REST API, all on your session / access token (never the relay key), workspace-scoped via X-Workspace-Id:

# 1. List versions (Member)
curl https://api.orcarouter.ai/api/guardrail/42/history \
  -H "Authorization: Bearer <session-token>" \
  -H "X-Workspace-Id: <ws-id>"

# 2. Diff v4 against v5 (Member) — returns both snapshots to render side by side
curl "https://api.orcarouter.ai/api/guardrail/42/history/diff?from=4&to=5" \
  -H "Authorization: Bearer <session-token>" \
  -H "X-Workspace-Id: <ws-id>"

# 3. Revert to v4 — appends a new "revert" version (Developer+)
curl -X POST https://api.orcarouter.ai/api/guardrail/42/revert \
  -H "Authorization: Bearer <session-token>" \
  -H "X-Workspace-Id: <ws-id>" \
  -H "Content-Type: application/json" \
  -d '{"to_version": 4}'

The revert response returns the post-revert live guardrail so your UI can refresh without an extra round-trip. The next /v1/* call screened by this guardrail sees the restored policy.
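If you script this flow rather than using curl, the three calls can be sketched with Python's standard library. The routes, headers, and `to_version` body come from the curl examples above; the helper names are hypothetical, and `send` assumes the API answers with JSON, which this page does not spell out.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.orcarouter.ai"  # host taken from the curl examples


def _request(method, path, token, workspace_id, body=None):
    # Session / access token plus workspace scope, never the relay key.
    headers = {
        "Authorization": f"Bearer {token}",
        "X-Workspace-Id": workspace_id,
    }
    data = None
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


def list_history_request(guardrail_id, token, workspace_id):
    return _request(
        "GET", f"/api/guardrail/{guardrail_id}/history", token, workspace_id
    )


def diff_request(guardrail_id, from_version, to_version, token, workspace_id):
    query = urllib.parse.urlencode({"from": from_version, "to": to_version})
    path = f"/api/guardrail/{guardrail_id}/history/diff?{query}"
    return _request("GET", path, token, workspace_id)


def revert_request(guardrail_id, to_version, token, workspace_id):
    return _request(
        "POST",
        f"/api/guardrail/{guardrail_id}/revert",
        token,
        workspace_id,
        body={"to_version": to_version},
    )


def send(request):
    # Assumption: the response body is JSON.
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())
```

Building the request separately from sending it lets you inspect or log the exact call before it touches live policy, for example `send(revert_request(42, 4, token, ws_id))`.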

3. History, diff, and the version feed

The history feed

GET /api/guardrail/:id/history returns the version trail, newest first. Each entry is one snapshot with its version number, operation (create / update / delete / revert), author, and timestamp. The feed is workspace-scoped — a caller in another workspace gets the same not-found envelope as a missing guardrail, so existence never leaks.
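The no-leak behavior can be illustrated with a small lookup sketch. The in-memory `store`, the 404 status, and the envelope shape are all assumptions for illustration; the point is only that a missing guardrail and a guardrail in another workspace produce the identical response.

```python
# Hypothetical envelope; the real status code and body shape may differ.
NOT_FOUND = (404, {"error": "not_found"})


def history_response(store, guardrail_id, workspace_id):
    record = store.get(guardrail_id)
    # Same answer whether the guardrail is absent or belongs to someone else.
    if record is None or record["workspace_id"] != workspace_id:
        return NOT_FOUND
    rows = sorted(record["history"], key=lambda r: r["version"], reverse=True)
    return 200, {"versions": rows}
```

Collapsing both cases into one branch is what keeps a caller from probing for guardrail IDs in other workspaces.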

A single version

GET /api/guardrail/:id/history/:version fetches one snapshot by its version number — handy for inspecting the exact rules body that was live at a point in time before you decide whether to revert to it.

The diff

GET /api/guardrail/:id/history/diff?from=N&to=M returns both snapshots — from and to — so the console can render a side-by-side comparison of the name, flags, and rules. Both versions must belong to your workspace, or the call returns the uniform not-found envelope.

Reads — history list, single version, and diff — are open to any workspace Member. They are pure inspection: nothing about traffic changes, and no model or vendor call is made.
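Since the diff endpoint hands back both snapshots and leaves the comparison to the client, a client-side comparison might look like the sketch below. The snapshot keys (`name`, `enabled`, `is_default`, `rules`) and the rule shape used in the usage note are assumptions; adapt them to the real response.

```python
import difflib
import json


def diff_snapshots(from_snap, to_snap):
    """Return only what changed between two snapshots."""
    changes = {}
    # Scalar fields: report (old, new) pairs.
    for key in ("name", "enabled", "is_default"):
        if from_snap.get(key) != to_snap.get(key):
            changes[key] = (from_snap.get(key), to_snap.get(key))
    # Rules body: pretty-print both sides and diff them line by line.
    before = json.dumps(from_snap.get("rules"), indent=2, sort_keys=True)
    after = json.dumps(to_snap.get("rules"), indent=2, sort_keys=True)
    rules_diff = list(
        difflib.unified_diff(
            before.splitlines(),
            after.splitlines(),
            fromfile="from",
            tofile="to",
            lineterm="",
        )
    )
    if rules_diff:
        changes["rules"] = rules_diff
    return changes
```

Serializing with `sort_keys=True` keeps key order from showing up as a spurious change, so a widened regex appears as a single removed line and a single added line.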

4. Revert restores as a new version

A revert is not a rewind. POST /api/guardrail/:id/revert with a to_version body:

Loads the target version’s snapshot.
Restores the live guardrail’s name, enabled flag, default flag, and rules to that snapshot — atomically, in one transaction.
Appends a fresh revert history row capturing the now-live state.

So reverting from v5 to v4 produces a new v6 whose content equals v4. Your history reads v1 → v2 → … → v5 → v6 (revert), with every step preserved and nothing mutated. Revert to that older snapshot again later and you get a v7, and so on.

A restored disabled or non-default state round-trips intact. If the version you revert to had enabled: false or was not the workspace default, reverting sets the live guardrail back to exactly that — it does not silently keep the policy on. Diff first so you know whether a revert will also flip those flags.
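The revert semantics above can be modeled in a few lines. This is an in-memory sketch, not the service's implementation: the class, field names, and row layout are hypothetical, and the real revert runs inside a database transaction.

```python
import copy
from dataclasses import dataclass, field

FIELDS = ("name", "enabled", "is_default", "rules")


@dataclass
class Guardrail:
    name: str
    enabled: bool
    is_default: bool
    rules: list
    history: list = field(default_factory=list)  # append-only rows

    def _append(self, operation, author):
        version = self.history[-1]["version"] + 1 if self.history else 1
        snapshot = {k: copy.deepcopy(getattr(self, k)) for k in FIELDS}
        self.history.append({
            "version": version,
            "operation": operation,
            "author": author,
            "snapshot": snapshot,
        })
        return version

    def update(self, author, **changes):
        for key, value in changes.items():
            if key not in FIELDS:
                raise ValueError(f"unknown field: {key}")
            setattr(self, key, value)
        return self._append("update", author)

    def revert(self, to_version, author):
        target = next(
            (r for r in self.history if r["version"] == to_version), None
        )
        if target is None:
            raise LookupError(f"no version {to_version}")
        # Restore every snapshotted field, including the enabled/default flags.
        for key in FIELDS:
            setattr(self, key, copy.deepcopy(target["snapshot"][key]))
        # Past rows are untouched; the restored state becomes a new version.
        return self._append("revert", author)


def create_guardrail(author, name, enabled, is_default, rules):
    guardrail = Guardrail(name, enabled, is_default, rules)
    guardrail._append("create", author)
    return guardrail
```

Note that `revert` copies the flags along with the rules, which is why a snapshot taken while the guardrail was disabled comes back disabled.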

Because the binding lives on the gateway, a revert shifts every API key attached to this guardrail at once — and the workspace default, if this is it — on the next call. See attach to a key and the workspace default for how attachment resolves.

5. Roles and retention

Action	Route	Role
List / read versions, diff	`GET …/history`, `…/history/diff`, `…/history/:version`	Member
Revert to a version	`POST …/revert`	Developer+

All history routes are /api/guardrail/* and authenticate with your session / access token under X-Workspace-Id — never an sk-orca-... relay key. Reverting carries the same Developer+ gate as creating or updating a guardrail, since it changes live traffic.
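As a rough illustration of the gate in the table, the check can be expressed as a minimum-role lookup. The ordering of roles and the `ADMIN` tier are assumptions: the page only names Member and Developer+, so treat everything beyond those two as a placeholder.

```python
from enum import IntEnum


class Role(IntEnum):
    MEMBER = 1
    DEVELOPER = 2
    ADMIN = 3  # hypothetical stand-in for any role above Developer


# (method, route template) -> minimum role, mirroring the table above.
REQUIRED_ROLE = {
    ("GET", "/api/guardrail/:id/history"): Role.MEMBER,
    ("GET", "/api/guardrail/:id/history/:version"): Role.MEMBER,
    ("GET", "/api/guardrail/:id/history/diff"): Role.MEMBER,
    ("POST", "/api/guardrail/:id/revert"): Role.DEVELOPER,
}


def is_allowed(role, method, route):
    required = REQUIRED_ROLE.get((method, route))
    if required is None:
        return False  # routes outside the table are denied in this sketch
    return role >= required
```

A Member can therefore inspect and diff any version but cannot trigger the one call that changes live traffic.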

History retains the 50 most recent versions per guardrail. Older rows are pruned automatically as you keep editing, so a chatty edit loop never grows the trail without bound. The list endpoint returns up to 50 versions, newest first.
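The retention rule amounts to a capped append. The sketch below assumes rows are dicts with a `version` key and that version numbers keep climbing after pruning rather than being reused; both are illustrative assumptions.

```python
MAX_VERSIONS = 50  # retention cap stated above


def append_and_prune(rows, row, keep=MAX_VERSIONS):
    """Append a history row, then drop the oldest rows beyond the cap."""
    rows.append(row)
    excess = len(rows) - keep
    if excess > 0:
        del rows[:excess]  # rows are kept oldest-first, so trim the front
    return rows


def newest_first(rows, limit=MAX_VERSIONS):
    """What a list call would return: at most `limit` rows, newest first."""
    ordered = sorted(rows, key=lambda r: r["version"], reverse=True)
    return ordered[:limit]
```

After 60 edits the trail holds v11 through v60, so the earliest snapshots are no longer in the list.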

Pair versioning with flag-first tuning: ship a new rule as flag, watch the matches feed, and if it misbehaves, diff and revert in seconds instead of reconstructing the old policy by hand.

6. Where to go next

Test & eval before you ship

Prove a policy in the sandbox and against a corpus before it becomes a version you’d have to revert.

Tune false positives

The flag-then-promote loop that versioning makes safe.

Actions: block, mask, flag

What each rule does once a version is live.

Guardrails reference

The full engine — rule types, stages, presets, and the complete API.

Versioning here covers guardrail content policy. The firewall has its own change surface for tool policy; for how the two enforcement layers differ, see guardrails vs. firewall.
