A compromised agent doesn’t stop on its own. A prompt injection that tricks it into a retry loop, or a leaked key in a CI log, will keep calling models until something says stop. On OrcaRouter that “something” is two fields on the key itself: a spend cap and an expiry. Set them once in the key editor and the gateway enforces both on every request — no agent-code change, no redeploy. This page is the focused reference for those two limits. For the full key field list, see the token object; for the identity model around them, see scoped keys overview.

1. The API key spend limit: credit_limit_usd

credit_limit_usd is the lifetime spend ceiling for a key, expressed in plain USD. You type a dollar figure in the key editor; OrcaRouter converts it into the key’s starting quota and meters every call against it.

Bounded

credit_limit_usd: 25 mints a key with $25 of spend. Each call debits its cost; once the remaining balance hits zero the key stops authorizing and every further request is rejected.

Unlimited

credit_limit_usd: 0 is the sentinel for no cap — the key draws on your workspace balance with no per-key ceiling. Convenient, but the worst blast radius if it leaks.
0 does not mean “zero dollars” — it means unlimited. A key you intended to lock down to a tiny budget must carry a positive number. To express “this key may spend nothing,” disable or delete it, don’t set the cap to 0.
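The sentinel rule above is easy to get backwards, so here is a minimal Python sketch of how a script might interpret the field before touching the key editor. The helper name describe_cap is hypothetical and not part of any OrcaRouter SDK, and treating negative values as invalid is this sketch's own assumption; the text only defines 0 and positive numbers.

```python
from typing import Optional

# Per the text above: 0 means "no cap", not "zero dollars".
UNLIMITED_SENTINEL = 0


def describe_cap(credit_limit_usd: float) -> Optional[float]:
    """Return the cap in USD, or None when the key is uncapped.

    Hypothetical helper for illustration only.
    """
    if credit_limit_usd < 0:
        # Assumption of this sketch: negative caps are treated as invalid.
        raise ValueError("credit_limit_usd must be 0 (unlimited) or positive")
    if credit_limit_usd == UNLIMITED_SENTINEL:
        return None  # unlimited: the key draws on the workspace balance
    return float(credit_limit_usd)
```

A caller that gets None back for a key it meant to lock down has found exactly the mistake described above: the key needs a positive cap, or it should be disabled.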

2. How the cap is metered: remain_quota & used_quota

The dollar cap you enter is the human-facing surface. Under it, the gateway tracks two running counters on the key:
Field          Meaning
remain_quota   Spend left before the key stops authorizing.
used_quota     Spend consumed so far over the key’s lifetime.
Setting a positive credit_limit_usd seeds remain_quota from that dollar figure; every billed call moves cost from remain_quota into used_quota. A key with an unlimited cap carries unlimited_quota instead, and the balance check is skipped entirely.
A guardrail or firewall block costs nothing against the cap when it fires before the model runs — an input-stage guardrail_blocked and an inbound firewall_blocked both happen pre-metering, so remain_quota is untouched. An output-stage guardrail block refunds the request. See guardrails and firewall.
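The bookkeeping described in this section can be modeled in a few lines. The sketch below is a toy model, not the gateway's implementation: the class KeyQuota and its methods are hypothetical, quota is tracked directly in USD (the gateway's internal unit is not stated here), and it assumes used_quota keeps accumulating on unlimited keys.

```python
from dataclasses import dataclass


@dataclass
class KeyQuota:
    """Toy model of the per-key counters (hypothetical, USD units assumed)."""

    remain_quota: float = 0.0
    used_quota: float = 0.0
    unlimited_quota: bool = False

    @classmethod
    def from_cap(cls, credit_limit_usd: float) -> "KeyQuota":
        # 0 is the sentinel for "no cap"; a positive cap seeds remain_quota.
        if credit_limit_usd == 0:
            return cls(unlimited_quota=True)
        return cls(remain_quota=float(credit_limit_usd))

    def can_authorize(self) -> bool:
        # Unlimited keys skip the balance check entirely.
        return self.unlimited_quota or self.remain_quota > 0

    def bill(self, cost_usd: float) -> None:
        # A billed call moves cost from remain_quota into used_quota.
        self.used_quota += cost_usd
        if not self.unlimited_quota:
            self.remain_quota -= cost_usd


quota = KeyQuota.from_cap(25)
quota.bill(10.0)  # remain_quota is now 15.0, used_quota is 10.0
quota.bill(15.0)  # balance reaches zero; can_authorize() turns False
```

In this model a pre-metering block (an input-stage guardrail or inbound firewall block) simply never reaches bill(), which is why remain_quota stays untouched.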

3. Auto-expiry: expired_time

expired_time is an absolute cut-off — a Unix epoch timestamp (seconds) after which the key stops authorizing, no matter how much budget remains.
  • A future timestamp expires the key at that instant. The gateway compares it against the current time on every request and rejects the call once it has passed.
  • -1 is the sentinel for never expires.
The two limits are independent and both must pass. A key with budget left but a passed expired_time is dead; a key inside its validity window with remain_quota at zero is dead. Whichever bound trips first wins. The editor rejects an expiry set in the past, so you can’t mint a born-expired key by accident.
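As a rough illustration of "both must pass", the check can be written as a conjunction of the two bounds. This is a sketch under assumptions: the function names are hypothetical, and whether the gateway treats a request arriving at exactly expired_time as expired is not specified here, so the boundary comparison is a guess.

```python
import time
from typing import Optional

NEVER_EXPIRES = -1  # sentinel from the text above


def is_expired(expired_time: int, now: Optional[float] = None) -> bool:
    """True once the Unix-seconds cut-off has been reached."""
    if expired_time == NEVER_EXPIRES:
        return False
    current = time.time() if now is None else now
    # Assumption: the exact cut-off instant already counts as expired.
    return current >= expired_time


def key_authorizes(
    remain_quota: float,
    unlimited_quota: bool,
    expired_time: int,
    now: Optional[float] = None,
) -> bool:
    """Both limits are independent; a request passes only if both pass."""
    has_budget = unlimited_quota or remain_quota > 0
    return has_budget and not is_expired(expired_time, now)
```

Passing now explicitly keeps the check deterministic in tests; in normal use it falls back to the current clock.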
For short-lived keys minted per CI run or per ephemeral agent, see expiring keys.

4. One concrete capped, expiring key

A nightly job that reconciles invoices with one cheap model, runs for a two-week pilot, and should never cost more than a few dollars a night needs almost no agency. Configure its key in the console key editor (/console/token, Developer+):
Step 1: Set the spend cap

credit_limit_usd: 40 is the pilot’s whole budget. A runaway retry loop exhausts the key, not your workspace balance.

Step 2: Set the expiry

expired_time: the Unix timestamp for the end of the pilot window. The key auto-expires and can’t be reused after the pilot ships.

Step 3: Pair with the other scopes

Add model_limits so it can’t escalate to a frontier model, and allow_ips so a leaked key is useless off the scheduler’s host.
If this agent is hijacked on day three, the damage is bounded to whatever is left of its $40, and the whole key is gone in eleven days regardless. The rest of the workspace is untouched.
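The expiry field takes epoch seconds rather than a date, so it helps to compute the value instead of typing it by hand. The sketch below derives the end of a two-week window; the start date is arbitrary, pilot_expiry is a hypothetical helper, and the dictionary only mirrors the two editor fields discussed here. It is not a documented API payload.

```python
from datetime import datetime, timedelta, timezone


def pilot_expiry(start: datetime, days: int = 14) -> int:
    """Unix epoch seconds for the end of a pilot window starting at `start`."""
    return int((start + timedelta(days=days)).timestamp())


# Example start date, chosen arbitrarily; timezone-aware to avoid local-time drift.
start = datetime(2025, 1, 6, tzinfo=timezone.utc)

# Values to enter in the key editor (illustrative, not an API request body).
key_settings = {
    "credit_limit_usd": 40,               # the pilot's whole budget
    "expired_time": pilot_expiry(start),  # two weeks after the start date
    # model_limits and allow_ips are named in the text, but their value
    # formats are not, so they are left out of this sketch.
}
```

Using a timezone-aware datetime matters: a naive datetime would be interpreted in the machine's local time and could shift the cut-off by several hours.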
Both limits live on the key itself, one in USD and one in time; neither is workspace-wide policy. To cap the spend of a single agent run (rather than a key’s lifetime), the Firewall’s cap_cost verdict is the per-run circuit-breaker (see firewall rules). The two compose: the key cap bounds the lifetime, cap_cost bounds a single run.

5. Who can set these

Setting credit_limit_usd and expired_time is part of creating or editing a key, which requires the Developer role or above. Any workspace member can read a key’s masked record; only Developer+ can change its limits. Keys are masked on display — plaintext is shown once at creation (see key masking).

6. Bounded by default

A key with credit_limit_usd: 0 and expired_time: -1 has no spend cap and never expires — maximum agency, worst blast radius. Make that the deliberate exception, not the default.
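One way to keep unbounded keys the deliberate exception is to audit for them. The following sketch assumes you already have key records as dictionaries carrying credit_limit_usd and expired_time; the name field and the function missing_bounds are hypothetical, and how you obtain the records is out of scope here.

```python
from typing import Dict, Iterable, List, Mapping


def missing_bounds(keys: Iterable[Mapping]) -> Dict[str, List[str]]:
    """Map each key name to the bounds it lacks (keys with both bounds are omitted)."""
    report: Dict[str, List[str]] = {}
    for key in keys:
        gaps = []
        if key["credit_limit_usd"] == 0:  # 0 is the "no cap" sentinel
            gaps.append("no spend cap")
        if key["expired_time"] == -1:     # -1 is the "never expires" sentinel
            gaps.append("never expires")
        if gaps:
            report[key["name"]] = gaps
    return report


# Illustrative records; names and values are made up for the example.
sample = [
    {"name": "nightly-reconcile", "credit_limit_usd": 40, "expired_time": 1737331200},
    {"name": "legacy-admin", "credit_limit_usd": 0, "expired_time": -1},
]
report = missing_bounds(sample)
```

Here only legacy-admin appears in the report, flagged on both counts, which makes it the key to justify or tighten first.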

Unlimited vs bounded

When an uncapped, non-expiring key is actually the right call — and when it isn’t.

Least-agency checklist

Run every production key through the same hardening pass before it ships.

The token object

Every field on a key, including the quota counters.

Bind policies

Attach a guardrail and a firewall policy to the same key.

Excessive agency

The threat spend caps and expiry are built to contain.
A spend cap and an expiry are the cheapest insurance on a key: two numbers that turn an open-ended credential into one that fails safe — empty or expired — instead of running until your bill notices.