DROP on a ledger table, card numbers leaking
into a prompt — is measured in dollars and audit findings. This recipe
assembles the controls that make such an agent safe to run: tight
autonomy as the floor, human approval on the money-moving tools,
a per-run cost cap as the circuit breaker, and an installable
SOC 2 / PCI compliance pack that materializes the policy and the
signed evidence an auditor will ask for.
Everything here is configured in the console (Firewall → Posture /
Policies, Guardrails, Compliance). Those management routes use your
console session, not a relay key; only the /v1/* calls your agent
makes carry an sk-orca-… key. Policy edits require the Developer
role; compliance install / go-live / residency require workspace
Admin and a paid plan.
1. Why a secure finance AI agent needs more than guardrails
Content screening catches a card number in a prompt. It does not stop the agent from calling refund.issue ten thousand times, reaching an
internal 10.x host, or running a destructive migration. A
finance-grade posture has to govern both planes at once:
- The text plane: Guardrails screen request and response text, so PII is masked and secrets are blocked before the model ever sees them.
- The action plane: the Firewall governs every tool call, MCP dispatch, and outbound request, with a verdict of allow, audit, deny, sanitize, hold, or cap cost.
2. Floor: apply tight autonomy
Start from the strongest one-switch posture. In Firewall → Posture, apply the tight autonomy level
(Developer role). In a single transaction it sets both planes:
| Plane | What tight materializes |
|---|---|
| Firewall | Default-deny; deny destructive shell; deny SSRF egress (fetch-shaped tool names) |
| Guardrails | PII Shield + Secrets Blocker enforced on requests |
Applying it writes autonomy_* policy and guardrail rows: it’s a seed, not a black box. It has one-click undo from
an audit snapshot.
3. Approvals: hold the money-moving tools for a human (HITL)
Default-deny stops what you didn’t allow. The tools you do allow but that move money (refund.issue, payment.send, ledger.adjust) should
be neither auto-allowed nor auto-denied. Give them the pending_approval
verdict so a human signs off out-of-band.
In Firewall → Policies, add a rule above your default:
- Tool glob: refund.* (or payment.send, ledger.adjust, …)
- Verdict: pending_approval
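
Placing the rule "above your default" suggests that order matters. The sketch below models that idea under two assumptions that the text does not confirm: rules are evaluated top-down with the first match winning, and the tool glob behaves like a shell-style pattern. `Rule`, `evaluate`, and the `ledger.read` tool are hypothetical names used for illustration only.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Rule:
    tool_glob: str  # e.g. "refund.*"
    verdict: str    # e.g. "pending_approval", "allow", "deny"


def evaluate(rules, tool_name, default="deny"):
    """Return the verdict of the first rule whose glob matches tool_name.

    Assumption: first match wins, and unmatched tools fall through to
    the default-deny floor.
    """
    for rule in rules:
        if fnmatchcase(tool_name, rule.tool_glob):
            return rule.verdict
    return default


# Money-moving tools are held; a read-only tool (hypothetical) is allowed.
rules = [
    Rule("refund.*", "pending_approval"),
    Rule("payment.send", "pending_approval"),
    Rule("ledger.adjust", "pending_approval"),
    Rule("ledger.read", "allow"),
]
```

With these rules, `evaluate(rules, "refund.issue")` yields the hold verdict, while a tool nobody listed falls through to deny. Use Firewall → Test to confirm how the real matcher treats your globs.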
When a call matches the rule:
- The held call returns HTTP 400 firewall_approval_pending with an approval id; the call does not reach the tool.
- A reviewer resolves it, either from the console (Developer+) or via an HMAC-signed webhook callback to your own approval system at POST /api/v1/firewall/approvals/:id/callback.
- The agent polls GET /api/v1/firewall/approvals/:id, then re-submits the original call with a single-use X-OrcaRouter-Firewall-Approval header; the gateway lets it through that once.
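
The hold, poll, and re-submit loop above can be sketched on the agent side as follows. This is a minimal sketch, not the gateway's client: the response fields (`error`, `approval_id`, `status`), the status values `approved` / `denied`, and the use of the approval id as the header value are all assumptions. The transport is injected so the flow can be exercised without a network.

```python
import time
from typing import Callable, Dict, Optional, Tuple

# A transport takes (method, path, headers, json_body) and returns
# (http_status, json_body). Authentication is left to the transport.
Transport = Callable[[str, str, Dict[str, str], Optional[dict]], Tuple[int, dict]]

APPROVAL_HEADER = "X-OrcaRouter-Firewall-Approval"


def call_with_approval(send: Transport, tool_path: str, payload: dict,
                       poll_interval: float = 2.0, max_polls: int = 30):
    """Submit a tool call; if it is held, wait for a human and re-submit once."""
    status, body = send("POST", tool_path, {}, payload)
    held = status == 400 and body.get("error") == "firewall_approval_pending"
    if not held:
        return status, body  # allowed, denied, or failed for another reason

    approval_id = body["approval_id"]  # assumed field name
    for _ in range(max_polls):
        _, approval = send("GET", f"/api/v1/firewall/approvals/{approval_id}", {}, None)
        state = approval.get("status")  # assumed field and values
        if state == "approved":
            # Assumption: the single-use header carries the approval id.
            return send("POST", tool_path, {APPROVAL_HEADER: approval_id}, payload)
        if state == "denied":
            raise PermissionError(f"approval {approval_id} was denied")
        time.sleep(poll_interval)
    raise TimeoutError(f"approval {approval_id} still pending after {max_polls} polls")
```

Because the header is single-use, the sketch re-submits exactly once and does not retry the approved call on failure.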
4. Circuit breaker: cap the cost of a run
A finance agent stuck in a retry loop is both a correctness bug and a billing one. A cap_cost rule is the runaway-loop breaker: it denies a
tool call once the agent run’s accumulated spend crosses a per-rule cents
cap.
Add a rule with verdict cap_cost and a cap_cost_cents ceiling — e.g.
2000 (USD $20.00) — scoped to your agent’s tools. Once a run’s running
spend exceeds the cap, further calls in that run are denied; a fresh run
starts clean.
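
The per-run semantics can be modelled in a few lines. This is an illustrative model only, assuming the check is a strict "spend exceeds cap" comparison made before each call; `RunBudget` is a hypothetical name, not part of the product.

```python
from dataclasses import dataclass


@dataclass
class RunBudget:
    """Tracks one agent run's accumulated spend against a cents cap."""
    cap_cost_cents: int
    spent_cents: int = 0

    def check(self) -> str:
        # Deny once the run's running spend has exceeded the cap.
        return "deny" if self.spent_cents > self.cap_cost_cents else "allow"

    def record(self, cost_cents: int) -> None:
        self.spent_cents += cost_cents


def simulate(cap_cents, call_costs):
    """Return the verdict for each call in a single run."""
    run = RunBudget(cap_cents)
    verdicts = []
    for cost in call_costs:
        verdict = run.check()
        verdicts.append(verdict)
        if verdict == "allow":
            run.record(cost)
    return verdicts
```

With a 2000-cent cap and four calls costing 900 cents each, the third call is still allowed (spend is 1800 when it is checked) and the fourth is denied; a new `RunBudget` starts from zero, matching "a fresh run starts clean".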
cap_cost caps the agent run’s spend, not a single key’s lifetime
budget. For a hard ceiling on a key, set credit_limit_usd on the API
key itself (0 = unlimited) — the two compose: the key budget bounds
total spend, cap_cost bounds any one run.
5. Belt-and-braces on the text plane
tight already enforces PII Shield and Secrets Blocker. For a finance
agent, lean on the specifics:
Block card numbers and secrets from requests
The Secrets Blocker guardrail catches API keys and credentials in
the prompt before the model sees them. For card data, a pii rule
with credit_card set to the block action (via per-entity
entity_actions) rejects the request outright with HTTP 400
guardrail_blocked, and a block costs no quota (input blocks
fire before metering). See Guardrails §5.
Mask PII on the way in
The PII Shield preset is a single pii rule: mask, stage both.
Input-stage masking is live: an iban or ssn in the
request is rendered as [IBAN] / [SSN] before the model is called.
(Live output/streaming masking is on the roadmap; output block is
enforced on streaming and non-streaming today.)
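
To make the input-stage masking concrete, here is a toy model of the rendering step. The regular expressions are deliberately simple stand-ins chosen for this sketch; they are assumptions, not the detectors PII Shield actually uses.

```python
import re

# Illustrative patterns only: an unspaced IBAN and a dashed US SSN.
PATTERNS = {
    "IBAN": re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def mask(text: str) -> str:
    """Replace each matched entity with its bracketed label, e.g. [IBAN]."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


print(mask("Refund to GB82WEST12345698765432, SSN 078-05-1120"))
# prints: Refund to [IBAN], SSN [SSN]
```

The model receives only the placeholder, so the original value never enters the prompt. Use Guardrails → Test to see what the real rule renders for your own samples.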
Sanitize args, never trust results
A Firewall sanitize verdict redacts matched substrings from a tool
call’s arguments before forwarding; it never rewrites what a tool
returns. Keeping a secret out of a request entirely is the
Secrets Blocker guardrail’s job on the text plane.
6. The compliance pack: SOC 2 and PCI in one install
The controls above are the implementation. An auditor wants the evidence. The Compliance plane closes that loop: browse the framework catalog (free, any Member), then install a pack as workspace Admin on a paid plan. Installing a pack materializes guardrails and firewall policies that map to the framework’s controls, so the same install that gives you the audit artifact also stands up real enforcement.
The catalog includes soc2 (AICPA SOC 2 Trust Services Criteria), pci_dss (PCI DSS 4.0), glba
(Gramm-Leach-Bliley), and dora_eu (Digital Operational Resilience
Act), alongside privacy frameworks (gdpr, uk_gdpr, ccpa),
security/AI frameworks (iso_27001, iso_42001, nist_ai_rmf,
eu_ai_act, nist_800_53), and the owasp_llm (OWASP Top 10 for
LLM Applications) pack. Browse the live catalog for the full set.
The report an auditor can verify
| What | Detail |
|---|---|
| Signature | Ed25519 over a SHA-256 evidence hash — tamper-evident |
| Formats | CSV / JSON / PDF |
| Verify | Public — GET /api/public/compliance/pubkey, POST /api/public/compliance/verify |
| Share | A read-only auditor link: GET /api/public/compliance/share/:token |
The free plan includes one report; CSV/JSON export and additional reports
are paid. Generating a report and going live are server-gated to paid
plans — the catalog and readiness views stay free.
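
The signature in the table covers a SHA-256 evidence hash, so an auditor can check the hash half locally before trusting the signature. The sketch below shows only that step and assumes the hash is taken over the raw bytes of the exported report; the actual canonicalization is not specified here. Checking the Ed25519 signature itself is left to the public verify endpoint or an Ed25519 library.

```python
import hashlib
import hmac


def evidence_hash(report_bytes: bytes) -> str:
    """SHA-256 of the report bytes as lowercase hex (assumed input form)."""
    return hashlib.sha256(report_bytes).hexdigest()


def matches_claimed_hash(report_bytes: bytes, claimed_hex: str) -> bool:
    """True when the recomputed hash equals the hash the report claims."""
    recomputed = evidence_hash(report_bytes)
    # compare_digest gives a constant-time comparison of the two hex strings.
    return hmac.compare_digest(recomputed, claimed_hex.strip().lower())
```

Any single-byte change to the report produces a different hash, which is what makes the signed artifact tamper-evident.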
7. Data residency, retention, and erasure
A finance-grade posture has to answer “where is the evidence, and how long do you keep the logs.”
- Residency is the region of the compliance report artifact: us, eu, uk, ap, cn, or global, set via PUT /api/compliance/residency (Admin). Cross-region reads are withheld. (This pins the artifact, not where inference runs.)
- Retention: request logs default to 30 days and are server-clamped to a hard max of 180 days.
- Erasure: a self-service account deletion enters a 30-day grace window, then an irreversible PII scrub cascades through guardrail matches, request logs, and firewall events.
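
The retention rule above amounts to a default plus a clamp. A minimal sketch, assuming a lower bound of one day (the text states only the default and the maximum); `effective_retention_days` is a hypothetical helper name:

```python
from typing import Optional

DEFAULT_RETENTION_DAYS = 30   # from the text: request logs default to 30 days
MAX_RETENTION_DAYS = 180      # from the text: server-clamped hard max
MIN_RETENTION_DAYS = 1        # assumption: not stated in the text


def effective_retention_days(requested: Optional[int] = None) -> int:
    """Return the retention that would apply for a requested setting."""
    if requested is None:
        return DEFAULT_RETENTION_DAYS
    return max(MIN_RETENTION_DAYS, min(requested, MAX_RETENTION_DAYS))
```

A request for 365 days therefore resolves to 180, which matters if an audit policy assumes a year of logs: the gap has to be covered by exporting evidence elsewhere.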
8. Verify before you depend on it
Don’t ship a finance policy on faith. Both planes have a sandbox that persists nothing and dispatches nothing:
- Guardrails → Test: paste a sample, pick a stage, see the verdict and rendered (masked) text.
- Firewall → Test (Developer+): dry-run a sample tool call and see the verdict, the matched rule, and the reason.
retry_loop, and
never-before-seen tool paths — exactly the signals that precede a
financial incident.
Recap
- Secure Agents baseline: What tight materializes, and how to simulate before applying.
- Firewall rules: Argument predicates, cost caps, egress, and sequences in depth.
- SOC 2 evidence: Turn the materialized controls into a signed audit artifact.
- PII-safe logging: Keep card and account data out of your request logs.
- Enforcement modes: Observe → shadow → enforce, the safe rollout for money-moving tools.
- Dangerous tool calls: The threat a finance agent’s tool allow-list defends against.
