Human-in-the-loop agent approval

Some tool calls are too consequential to allow blindly and too useful to ban outright: a production database write, a wire transfer, a *.delete on real data. For those you want a person in the loop: hold the call, let a human look, then proceed only on a yes. That is exactly what the pending_approval verdict does. This page covers the human-in-the-loop approval flow end to end: how a held call surfaces, how a reviewer resolves it from the console or a webhook, and how the agent re-submits the approved call. For where the verdict sits in the rule grammar, see Firewall Rules; for the policy model around it, see the Firewall overview.

1. What a held call looks like

When a rule resolves to pending_approval, the engine enqueues an approval record and the call does not reach the tool. The relay returns HTTP 400 with error.code firewall_approval_pending; the approval id the agent will poll on is carried in the human-readable error.message:

{
  "error": {
    "code": "firewall_approval_pending",
    "message": "tool \"db.write\" held for approval (…) — resolve approval 507f1f77bcf86cd799439011 and retry with header X-OrcaRouter-Firewall-Approval"
  }
}

The structured error.metadata (when present) carries the verdict’s reason detail — reason_code, factors, risk_score — not the approval id. Parse the id out of the message, or get it from the SDK helper below. The hold is immediate — there is no inline long-poll blocking your request. The agent gets the id back, the call is parked server-side in the pending state, and resolution happens out-of-band.
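If you are not using the SDK helper, pulling the id out of the message could look like the sketch below. It assumes the message keeps the wording shown in the sample ("resolve approval" followed by a 24-character hex id); the function name extract_approval_id is hypothetical, and the pattern would need adjusting if the message format differs.

```python
import re
from typing import Optional

# Assumption: the id is a 24-character hex string that follows the words
# "resolve approval", as in the sample error message above.
_APPROVAL_ID = re.compile(r"resolve approval ([0-9a-f]{24})\b")


def extract_approval_id(error_body: dict) -> Optional[str]:
    """Return the approval id from a firewall_approval_pending error, else None."""
    error = error_body.get("error") or {}
    if error.get("code") != "firewall_approval_pending":
        return None
    match = _APPROVAL_ID.search(error.get("message", ""))
    return match.group(1) if match else None


sample = {
    "error": {
        "code": "firewall_approval_pending",
        "message": 'tool "db.write" held for approval (…) — resolve approval '
                   "507f1f77bcf86cd799439011 and retry with header "
                   "X-OrcaRouter-Firewall-Approval",
    }
}
approval_id = extract_approval_id(sample)
```

Checking error.code first keeps the parser from picking ids out of unrelated errors.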

A held call is recorded as a firewall event with verdict pending_approval, so it is filterable in the events log right alongside deny events — you can always see what was held and, via the approval record, what was resolved.

2. One concrete example

Author a rule that holds any write to a production connection for a human:

{
  "label": "hold prod db writes",
  "tool_name_glob": "db.write",
  "verdict": "pending_approval",
  "args_match_json": "{\"clauses\":[{\"path\":\"$.connection\",\"op\":\"eq\",\"value\":\"prod\"}]}"
}
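Note that args_match_json in the rule above is a JSON document carried as a string, so hand-escaping the quotes is easy to get wrong. A minimal sketch of building the rule programmatically follows; build_hold_rule is a hypothetical helper, not part of any SDK, and the compact separators are chosen only to reproduce the string shown above.

```python
import json


def build_hold_rule(label: str, tool_name_glob: str, clauses: list) -> dict:
    """Build a rule dict whose args_match_json is a serialized clause list."""
    return {
        "label": label,
        "tool_name_glob": tool_name_glob,
        "verdict": "pending_approval",
        # Serialize the clauses once; json.dumps handles the quote escaping.
        "args_match_json": json.dumps({"clauses": clauses}, separators=(",", ":")),
    }


rule = build_hold_rule(
    "hold prod db writes",
    "db.write",
    [{"path": "$.connection", "op": "eq", "value": "prod"}],
)
print(json.dumps(rule, indent=2))
```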

Now the lifecycle:

Agent calls the tool

The agent issues db.write against prod. The rule matches, the engine holds the call, and the relay returns 400 firewall_approval_pending with the approval id embedded in error.message.

A human (or your system) reviews

A reviewer resolves the approval — in the console or via a signed webhook callback (see §3).

Agent polls until resolved

The agent polls the approval id until its state is no longer pending (see §4).

Agent re-submits with the approval header

On approved, the agent re-issues the exact same call once, carrying a single-use X-OrcaRouter-Firewall-Approval header. The engine claims the approval and lets that one call through.

3. Resolving an approval

There are two ways to turn a pending approval into approved or rejected. Both share a first-decision-wins guarantee — the first resolve to land is applied atomically, and any later resolve (or a duplicate) is an idempotent no-op returning 200.
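The first-decision-wins behavior can be pictured with a small in-memory model. This is only an illustration of the semantics described above, not the engine's implementation; the ApprovalRecord class and its fields are hypothetical.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class ApprovalRecord:
    """Toy model: the first resolve is applied, later ones change nothing."""
    state: str = "pending"
    reason: str = ""
    _lock: threading.Lock = field(default_factory=threading.Lock, repr=False)

    def resolve(self, decision: str, reason: str = "") -> str:
        if decision not in ("approved", "rejected"):
            raise ValueError("decision must be approved or rejected")
        with self._lock:  # stands in for the atomic update
            if self.state == "pending":
                self.state, self.reason = decision, reason
            # A late or duplicate resolve falls through as a no-op.
            return self.state


record = ApprovalRecord()
first = record.resolve("approved", "verified change ticket #4821")
late = record.resolve("rejected", "arrived second")
```

Here the late rejection does not overwrite the earlier approval: both calls report the state that the first decision set.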

Console — a reviewer clicks approve/reject (Developer+)

The Approvals tab lists pending holds oldest-first, each with the tool name and a “Held because…” line naming the policy and the rule clause that fired. (The raw call arguments are not stored on the approval record — only the tool name, provenance, and an args hash — so the reviewer decides from the tool plus the matched clause.) A reviewer resolves one with:

PATCH /api/workspace/firewall/approvals/:id

{ "decision": "approved", "reason": "verified change ticket #4821" }

decision must be approved or rejected. This route is UserAuth (the reviewer’s console session) and gated to Developer+ — your reviewer’s identity is the authorization, so no shared secret is involved. Resolutions are written to the workspace audit log.

Webhook — your own system decides, HMAC-signed

To wire approvals into an external system (a Slack approval, a ticketing workflow), configure an approval webhook secret for the workspace, then POST the decision back:

POST /api/v1/firewall/approvals/:id/callback

{ "decision": "approved", "reason": "auto-approved by change-control bot" }

The callback is authenticated by HMAC-SHA256: set the X-Orca-Signature: sha256=<hex> header to the HMAC of <approval_id>\n<raw_body> keyed with your workspace’s approval webhook secret. The id is part of the signed material, so a captured signature can’t be replayed against a different approval. Without a configured secret, callback-driven resolution is rejected — resolve via the console PATCH instead.
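Computing the signature described above takes only the standard library. The sketch below assumes the separator is a single newline byte and that the secret is used as UTF-8 bytes; sign_callback is a hypothetical helper name, and the secret value is a placeholder.

```python
import hashlib
import hmac
import json


def sign_callback(approval_id: str, raw_body: bytes, secret: bytes) -> str:
    """Return the X-Orca-Signature value for <approval_id>\\n<raw_body>."""
    signed_material = approval_id.encode("utf-8") + b"\n" + raw_body
    digest = hmac.new(secret, signed_material, hashlib.sha256).hexdigest()
    return f"sha256={digest}"


# Serialize the body once and send exactly these bytes: re-serializing
# after signing can change whitespace and invalidate the signature.
raw_body = json.dumps(
    {"decision": "approved", "reason": "auto-approved by change-control bot"}
).encode("utf-8")

secret = b"example-approval-webhook-secret"  # placeholder, not a real secret
headers = {
    "Content-Type": "application/json",
    "X-Orca-Signature": sign_callback("507f1f77bcf86cd799439011", raw_body, secret),
}
```

Because the id is part of the signed material, signing the same body for a different approval id yields a different header value.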

For unattended runs, configuring a rejection path through the approval webhook is the safe default: if nobody resolves a hold, the call simply stays parked and the agent keeps polling, so an explicit rejection is what ends the wait. A held call never silently becomes an allow.

4. Poll, then re-submit

The agent side is a poll loop followed by one re-submit. Poll the approval state with a firewall-gateway-scoped token:

GET /api/v1/firewall/approvals/:id

This route requires a token with the firewall-gateway scope (the same dedicated gateway key used for /evaluate and the MCP gateway); a regular relay key gets 403. It returns the approval doc; keep polling while state is pending and stop once it is approved or rejected. A cross-workspace or unknown id returns 404, so the response never discloses to another tenant that the approval exists.

Re-submit once the state is approved: re-issue the same tool call, carrying the approval id in a single-use header:

X-OrcaRouter-Firewall-Approval: 507f1f77bcf86cd799439011

The engine atomically claims the approval — single-use. The first re-submit carrying it is allowed through that one time; a replay of the same header finds the approval already consumed and is held again, not allowed. A rejected approval is never claimable, so the agent should treat rejection as a terminal deny and pick another path.
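The single-use claim can be modeled in a few lines. As with any client-side picture of server behavior, this is an illustrative sketch of the rules just described, not the engine's code; ClaimableApproval is a hypothetical name.

```python
import threading


class ClaimableApproval:
    """Toy model: an approved record can be claimed exactly once."""

    def __init__(self, state: str) -> None:
        self.state = state  # "pending", "approved" or "rejected"
        self.consumed = False
        self._lock = threading.Lock()  # stands in for the atomic claim

    def claim(self) -> bool:
        """Return True only for the first claim of an approved record."""
        with self._lock:
            if self.state != "approved" or self.consumed:
                return False  # the call would be held again, not allowed
            self.consumed = True
            return True


approved = ClaimableApproval("approved")
first_try = approved.claim()    # the re-submit goes through
replay = approved.claim()       # same header again: already consumed
rejected_try = ClaimableApproval("rejected").claim()  # never claimable
```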

The OrcaRouter MCP SDK’s HITL helper runs this poll-then-re-submit loop for you: when evaluate returns pending_approval, it polls GET /api/v1/firewall/approvals/:id and re-submits with the approval header on approval — you only author the rule and staff the reviewer.
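For agents that do not use the SDK helper, the same loop could be hand-rolled roughly as follows. The sketch keeps HTTP out of the picture: fetch_state and resubmit are hypothetical callables you would implement against GET /api/v1/firewall/approvals/:id and your tool call, and the polling interval and timeout are arbitrary assumptions rather than documented values.

```python
import time
from typing import Any, Callable, Optional

APPROVAL_HEADER = "X-OrcaRouter-Firewall-Approval"


def wait_for_decision(
    approval_id: str,
    fetch_state: Callable[[str], str],
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the approval leaves the pending state or the timeout hits."""
    deadline = time.monotonic() + timeout_s
    while True:
        state = fetch_state(approval_id)
        if state in ("approved", "rejected"):
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"approval {approval_id} is still pending")
        sleep(interval_s)


def run_held_call(
    approval_id: str,
    fetch_state: Callable[[str], str],
    resubmit: Callable[[dict], Any],
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[Any]:
    """Re-submit once on approval; return None on rejection (terminal deny)."""
    state = wait_for_decision(approval_id, fetch_state, interval_s, timeout_s, sleep)
    if state == "rejected":
        return None
    # Exactly one re-submit: the approval is single-use.
    return resubmit({APPROVAL_HEADER: approval_id})
```

Injecting fetch_state, resubmit, and sleep keeps the loop testable without a network and leaves the choice of HTTP client to the caller.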

5. States and roles at a glance

State	Meaning	Agent action
`pending`	Held, awaiting a decision	Keep polling
`approved`	Reviewer said yes	Re-submit once with the header
`rejected`	Reviewer said no	Treat as a deny

Action	Route	Auth · role
List the queue	`GET /api/workspace/firewall/approvals`	UserAuth · Developer+
Resolve	`PATCH /api/workspace/firewall/approvals/:id`	UserAuth · Developer+
Webhook callback	`POST /api/v1/firewall/approvals/:id/callback`	HMAC-signed
Poll state	`GET /api/v1/firewall/approvals/:id`	Gateway token

6. Where approvals fit

A pending_approval verdict is one of the firewall verdicts, so it composes with everything else in a policy. Two interactions are worth knowing:

Skill quarantine escalates to a hold. If a tool call is owned by a quarantined skill, any verdict short of a deny is escalated to pending_approval automatically. Quarantine and approvals are the same review gate from two directions.
Shadow mode flattens it. In shadow mode a pending_approval verdict is downgraded to audit and logged as [shadow] would …, so you can measure how often a hold would fire before it starts gating real traffic.

This is the right control for dangerous tool calls and excessive agency — the cases where a verdict of “ask a human” beats both allow and deny.

Where to go next

Verdicts

All six firewall verdicts and the default verdict.

Gateway keys

Mint the firewall-gateway token used to poll approvals.

Shadow mode

Measure a hold before it gates real traffic.

Rule reference

Author the rule that produces a pending_approval verdict.

