Runaway cost and denial-of-wallet

An agent doesn’t have to leak data to hurt you. It can simply spend: a retry loop hammering an expensive model, a prompt-injected instruction that fans out a thousand tool calls, or a leaked API key racking up inference until the bill arrives. This is denial of wallet: the attack is the cost itself. Unlike a classic denial of service, the gateway stays up and every request looks individually legitimate; the damage is the aggregate spend. OrcaRouter gives you three independent controls that all sit in front of the upstream model, so no single runaway path can run up an unbounded bill.

1. The denial-of-wallet AI threat

A denial-of-wallet incident usually traces to one of three shapes:

Runaway agent loop

An agent retries the same failing tool or re-plans in a tight loop, re-paying for tokens on every pass. No malice required — a bad stop condition is enough.

Injected fan-out

A prompt injection steers the agent into spamming a tool or issuing oversized requests, multiplying spend per turn.

Leaked or over-scoped key

A key ends up somewhere it shouldn’t — a committed .env, a shared notebook — and an attacker runs inference on your account until the spend is noticed.

The defense is the same in all three cases: a hard ceiling the attacker can’t talk their way past, enforced at the gateway, not in your agent code.

2. Per-run cost ceiling with `cap_cost`

The Firewall’s cap_cost verdict is a circuit-breaker for runaway loops. You author it as a rule with a per-run cents cap; the engine sums the agent run’s accumulated spend and, once the run crosses the cap, resolves the verdict to deny — every later tool call in that run is blocked. cap_cost is a pre-dispatch ceiling: it evaluates before the call reaches the tool, so it stops the next expensive call rather than refunding one already made. A typical catch-all cap on every tool:

```json
{
  "priority": 50,
  "label": "cap runaway spend at $5 per run",
  "tool_name_glob": "*",
  "verdict": "cap_cost",
  "cap_cost_cents": 500
}
```

Below the cap the call is allowed; above it, the run is denied with an HTTP 400 firewall_blocked — marked skip-retry, so the loop can’t hammer around the denial. The ceiling is per agent run and summed across your whole workspace policy, so one runaway conversation can’t bleed into another’s budget.
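To make the pre-dispatch behavior concrete, here is a minimal Python sketch that simulates a rule like the one above. It is not OrcaRouter's engine: the `CapCostRule` class and the `evaluate` and `simulate_run` helpers are hypothetical, and treating a run that has exactly reached the cap as over it is an assumption.

```python
from dataclasses import dataclass
from fnmatch import fnmatch


@dataclass
class CapCostRule:
    # Field names mirror the JSON rule; the class itself is hypothetical.
    tool_name_glob: str
    cap_cost_cents: int


def evaluate(rule: CapCostRule, tool_name: str, run_spend_cents: int) -> str:
    """Pre-dispatch check: decide before the tool call is sent."""
    if not fnmatch(tool_name, rule.tool_name_glob):
        return "no_match"
    # Assumption: a run that has reached the cap counts as over it.
    if run_spend_cents >= rule.cap_cost_cents:
        return "deny"
    return "allow"


def simulate_run(rule: CapCostRule, calls):
    """calls is a list of (tool_name, cost_cents); returns (verdicts, spend)."""
    spend = 0
    verdicts = []
    for tool_name, cost_cents in calls:
        verdict = evaluate(rule, tool_name, spend)
        verdicts.append(verdict)
        if verdict != "deny":
            spend += cost_cents  # only dispatched calls add to the run's spend
    return verdicts, spend


rule = CapCostRule(tool_name_glob="*", cap_cost_cents=500)
verdicts, spend = simulate_run(rule, [("search", 200)] * 5)
print(verdicts, spend)
```

In this simulation the third call is still allowed because the run had spent only 400 cents when it was checked, so the run ends at 600 cents. That is the practical meaning of a pre-dispatch ceiling: it blocks the next call and does not undo the one that crossed the line, so size the cap with the cost of one call as slack.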

cap_cost reads running spend from your request logs. Keep request-log capture on for the workspace so the running-spend rollup has rows to sum — otherwise the prior-spend estimate is conservatively 0 and the cap can’t see what a run already cost.

See the Firewall rules reference for the full matching language and where cap_cost sits among the other verdicts.

3. Hard budget per key with `credit_limit_usd`

cap_cost bounds a single run. To bound a key — every run it ever issues — set credit_limit_usd on the API key. It’s a hard USD ceiling on that key’s lifetime spend: the gateway converts it into the key’s remaining quota, and once the key has spent its allowance, further relay calls are rejected for insufficient credit. 0 means unlimited. Pair it with the key’s other scopes so a leaked key is bounded on every axis at once:

Setting	What it does
`credit_limit_usd`	Hard USD spend ceiling for the key (0 = unlimited).
`expired_time`	Auto-expiry timestamp (-1 = never). A short-lived key bounds the blast-radius window.
`allow_ips`	Pins the key to known source IPs, so a leaked key is useless off-network.
`model_limits`	Restricts the key to specific models, so it can’t reach the priciest ones at all.

Give each agent its own narrowly-scoped key with a credit_limit_usd it should never legitimately exceed. The limit is the budget, not a guess at attacker behavior — even a fully-compromised key stops at the ceiling.
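The way these four settings compose can be sketched as a single admission check. This is an illustrative model only, not the gateway's implementation: the `KeyPolicy` class, the `check_request` function, and the rejection strings are hypothetical, `expired_time` is assumed to be a Unix timestamp in seconds, and empty `allow_ips` or `model_limits` lists are assumed to mean "no restriction".

```python
import time
from dataclasses import dataclass, field
from ipaddress import ip_address, ip_network


@dataclass
class KeyPolicy:
    credit_limit_usd: float = 0.0   # 0 = unlimited
    expired_time: int = -1          # assumed Unix seconds; -1 = never
    allow_ips: list = field(default_factory=list)     # assumed: empty = any IP
    model_limits: list = field(default_factory=list)  # assumed: empty = any model
    spent_usd: float = 0.0          # lifetime spend tracked for the key


def check_request(key: KeyPolicy, source_ip: str, model: str, now=None):
    """Return None if the request may proceed, else a rejection reason."""
    now = time.time() if now is None else now
    if key.expired_time != -1 and now >= key.expired_time:
        return "key_expired"
    if key.allow_ips and not any(
        ip_address(source_ip) in ip_network(net) for net in key.allow_ips
    ):
        return "ip_not_allowed"
    if key.model_limits and model not in key.model_limits:
        return "model_not_allowed"
    if key.credit_limit_usd > 0 and key.spent_usd >= key.credit_limit_usd:
        return "insufficient_credit"
    return None


agent_key = KeyPolicy(
    credit_limit_usd=20.0,
    allow_ips=["10.0.0.0/8"],
    model_limits=["small-model"],
    spent_usd=20.0,
)
print(check_request(agent_key, "10.1.2.3", "small-model"))
print(check_request(agent_key, "198.51.100.9", "small-model"))
```

Each check is independent, so an attacker holding a leaked key has to get past every one of them: the right network, an allowed model, an unexpired key, and remaining credit.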

Configure all of this from the console key editor (or the token API) while signed in to your session: these are key settings, not relay calls. Only the /v1/* inference requests use the sk-orca-... key itself. An edited limit takes effect on the key’s next request, with no redeploy.

4. Catch the spike you didn’t predict: cost anomalies

A static cap stops spend you anticipated. The Firewall’s anomaly detection catches the spend you didn’t. It learns each workspace’s normal tool-use shape against an hour-of-week baseline (a 14-day rolling average) and surfaces deviations on a Member-readable feed:

Anomaly	What it flags
`burn_spike`	Cost for a tool far above its learned baseline cost — the denial-of-wallet signal.
`rate_spike`	Call volume far above baseline — fan-out and floods.
`retry_loop`	The same tool with the same arguments repeating in a tight window — the classic runaway loop.

So “this tool burned 40× its usual cost this hour” stands out even when each individual call was allowed by policy. You can snooze an anomaly for up to 7 days while you investigate.
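For intuition, the following Python sketch shows one way an hour-of-week baseline and a repeat detector could work. It is an assumption-laden illustration, not the Firewall's detector: the function names, the 10× spike factor, the 60-second window, and the five-repeat threshold are all hypothetical values chosen for the example.

```python
import json
from collections import defaultdict
from datetime import datetime, timedelta


def hour_of_week(ts: datetime) -> int:
    """Slot index from 0 to 167, where Monday 00:00 is slot 0."""
    return ts.weekday() * 24 + ts.hour


def baseline_cost(history, now: datetime, window_days: int = 14):
    """Mean hourly cost for now's hour-of-week slot over the trailing window.

    history is a list of (hour_start, cost_cents) pairs.
    """
    slot = hour_of_week(now)
    start = now - timedelta(days=window_days)
    samples = [
        cost for ts, cost in history
        if start <= ts < now and hour_of_week(ts) == slot
    ]
    return sum(samples) / len(samples) if samples else None


def is_burn_spike(history, now, current_cost, factor: float = 10.0) -> bool:
    base = baseline_cost(history, now)
    if not base:
        return False  # no usable baseline yet; this sketch stays quiet
    return current_cost > factor * base


def is_retry_loop(calls, window_seconds: int = 60, min_repeats: int = 5) -> bool:
    """calls is a time-sorted list of (timestamp_seconds, tool_name, args_dict)."""
    recent = defaultdict(list)
    for ts, tool, args in calls:
        times = recent[(tool, json.dumps(args, sort_keys=True))]
        times.append(ts)
        while ts - times[0] > window_seconds:
            times.pop(0)  # forget repeats that fell out of the window
        if len(times) >= min_repeats:
            return True
    return False


now = datetime(2024, 1, 15, 9)  # a Monday, 09:00
history = [
    (datetime(2024, 1, 1, 9), 120),
    (datetime(2024, 1, 8, 9), 100),
    (datetime(2024, 1, 8, 10), 5000),  # different slot, ignored
]
print(is_burn_spike(history, now, current_cost=4400))
print(is_retry_loop([(t, "search", {"q": "x"}) for t in range(5)]))
```

With a 14-day window, each hour-of-week slot sees only two samples, which is why the comparison is against the same hour on the same weekday: a Monday-morning batch job is measured against previous Monday mornings, not against a quiet Sunday night.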

Anomaly detection is your early warning; cap_cost and credit_limit_usd are the hard stops. Watch the feed to discover where your real spend lives, then write a cap around it.

5. Putting it together

Layer the three so a runaway never reaches the bill:

Control	Scope	When it fires
`cap_cost` rule	One agent run	Run’s accumulated spend crosses the cents cap
`credit_limit_usd`	One key, lifetime	Key’s total spend hits its USD ceiling
`burn_spike` / `retry_loop`	Workspace, learned	Spend or repeat pattern deviates from baseline

A practical baseline: a per-run cap_cost on *, a credit_limit_usd on every agent key, and a habit of checking the anomaly feed. Roll a new cap_cost policy out in shadow mode first — it logs [shadow] would deny without blocking — so you can size the cap against real traffic before it bites.
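Sizing the cap against real traffic can be as simple as taking a high percentile of observed per-run spend and adding headroom. The sketch below assumes you have already exported per-run costs in cents from your own logs. The helper names, the 90th percentile, the 1.5× headroom, and the sample data are all hypothetical choices for illustration.

```python
import math


def suggest_cap_cents(run_costs_cents, percentile: float = 99.0,
                      headroom: float = 1.5) -> int:
    """Nearest-rank percentile of observed run costs, times a headroom factor."""
    if not run_costs_cents:
        raise ValueError("need at least one observed run")
    ordered = sorted(run_costs_cents)
    rank = max(1, math.ceil(percentile * len(ordered) / 100))
    return math.ceil(ordered[rank - 1] * headroom)


def would_deny_count(run_costs_cents, cap_cents: int) -> int:
    """Approximate how many observed runs would have reached the cap."""
    return sum(1 for cost in run_costs_cents if cost >= cap_cents)


# Hypothetical per-run spend in cents, with one runaway at the end.
observed = [20, 35, 40, 55, 60, 80, 90, 120, 150, 2400]
cap = suggest_cap_cents(observed, percentile=90)
print(cap, would_deny_count(observed, cap))
```

On this sample the suggested cap is 225 cents and only the 2400-cent runaway would have been denied. Comparing that count with the number of shadow-mode would-deny entries you actually see is a quick sanity check before enforcing the rule.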

cap_cost and the anomaly feed bound tool calls and runs that cross the gateway. A tool an agent executes entirely inside its own process never reaches the engine. Route model-mediated and MCP tool calls through the gateway — and give every key a credit_limit_usd — so the ceiling holds regardless of how the agent loops.

6. Related threats

Denial of wallet rarely arrives alone: the loop that burns your budget is often driven by something upstream.

Prompt injection — injected instructions are a common trigger for fan-out and tool spam.
Excessive agency — an agent with too much latitude has more ways to spend.
Dangerous tool calls — the same firewall rule plane bounds what a tool may do, not just how much it costs.
The threat model — where runaway cost fits in the full agentic attack surface.

Firewall overview

Verdicts, anomaly detection, autonomy levels, and observability.

Scoped keys & policies

How key limits, guardrails, and firewall policies compose per key.
