Quota, credit cap & expiry on an API key

A compromised agent doesn’t stop on its own. A prompt injection that tricks it into a retry loop, or a leaked key in a CI log, will keep calling models until something says stop. On OrcaRouter that “something” is two fields on the key itself: a spend cap and an expiry. Set them once in the key editor and the gateway enforces both on every request — no agent-code change, no redeploy. This page is the focused reference for those two limits. For the full key field list, see the token object; for the identity model around them, see scoped keys overview.

1. The API key spend limit: `credit_limit_usd`

credit_limit_usd is the lifetime spend ceiling for a key, expressed in plain USD. You type a dollar figure in the key editor; OrcaRouter converts it into the key’s starting quota and meters every call against it.

Bounded

credit_limit_usd: 25 mints a key with $25 of spend. Each call debits its cost; once the remaining balance hits zero the key stops authorizing and every further request is rejected.

Unlimited

credit_limit_usd: 0 is the sentinel for no cap — the key draws on your workspace balance with no per-key ceiling. Convenient, but the worst blast radius if it leaks.

0 does not mean “zero dollars” — it means unlimited. A key you intended to lock down to a tiny budget must carry a positive number. To express “this key may spend nothing,” disable or delete it, don’t set the cap to 0.
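The sentinel rule above is easy to get wrong in client-side tooling, so here is a minimal sketch of how a script might interpret the field before submitting it. The helper name `describe_cap` and the rejection of negative values are assumptions for illustration; the source only documents `0` (unlimited) and positive values (bounded).

```python
UNLIMITED_SENTINEL = 0  # credit_limit_usd: 0 means "no cap", not "zero dollars"


def describe_cap(credit_limit_usd: float) -> str:
    """Return a human-readable reading of a credit_limit_usd value.

    Hypothetical helper: rejecting negatives is an assumption, since the
    documented values are 0 (unlimited) and positive dollar amounts.
    """
    if credit_limit_usd < 0:
        raise ValueError("credit_limit_usd must be 0 (unlimited) or positive")
    if credit_limit_usd == UNLIMITED_SENTINEL:
        return "unlimited"
    return f"capped at ${credit_limit_usd:.2f}"
```

A pre-flight check like this makes the intent explicit: a script that meant "spend nothing" and passed `0` would see "unlimited" in its own output before the key is ever minted.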

2. How the cap is metered: `remain_quota` & `used_quota`

The dollar cap you enter is the human-facing surface. Under it, the gateway tracks two running counters on the key:

Field	Meaning
`remain_quota`	Spend left before the key stops authorizing.
`used_quota`	Spend consumed so far over the key’s lifetime.

Setting a positive credit_limit_usd seeds remain_quota from that dollar figure; every billed call moves cost from remain_quota into used_quota. A key with an unlimited cap carries unlimited_quota instead, and the balance check is skipped entirely.
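The bookkeeping described above can be sketched as a small model. This is not the gateway's implementation: the class and function names are hypothetical, quota is kept in plain USD for readability (the source does not state the internal unit), and both the handling of a final call that costs more than the remaining balance and the tracking of `used_quota` on unlimited keys are assumptions.

```python
from dataclasses import dataclass


@dataclass
class KeyQuota:
    remain_quota: float            # spend left, in USD for this sketch
    used_quota: float = 0.0        # lifetime spend so far
    unlimited_quota: bool = False  # True when credit_limit_usd was 0


def seed_quota(credit_limit_usd: float) -> KeyQuota:
    """Seed the counters from the dollar cap entered in the key editor."""
    if credit_limit_usd == 0:
        return KeyQuota(remain_quota=0.0, unlimited_quota=True)
    return KeyQuota(remain_quota=credit_limit_usd)


def bill_call(key: KeyQuota, cost_usd: float) -> bool:
    """Move cost from remain_quota into used_quota; False means rejected."""
    if not key.unlimited_quota:
        if key.remain_quota <= 0:
            return False  # balance exhausted, the key stops authorizing
        # Assumption: the check happens before the call, so the last
        # admitted call may push the balance to or below zero.
        key.remain_quota -= cost_usd
    key.used_quota += cost_usd
    return True
```

For a bounded key the two counters always sum to the original cap, which is a handy invariant to assert in any reconciliation script built on exported key records.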

A guardrail or firewall block costs nothing against the cap when it fires before the model runs — an input-stage guardrail_blocked and an inbound firewall_blocked both happen pre-metering, so remain_quota is untouched. An output-stage guardrail block refunds the request. See guardrails and firewall.

3. Auto-expiry: `expired_time`

expired_time is an absolute cut-off — a Unix epoch timestamp (seconds) after which the key stops authorizing, no matter how much budget remains.

A future timestamp expires the key at that instant. The gateway compares it against the current time on every request and rejects the call once it has passed.
A value of -1 is the sentinel for a key that never expires.

The two limits are independent and both must pass. A key with budget left but a passed expired_time is dead; a key inside its validity window with remain_quota at zero is dead. Whichever bound trips first wins. The editor rejects an expiry set in the past, so you can’t mint a born-expired key by accident.
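The "both must pass" rule can be sketched as a single check. The function names and the reason strings are hypothetical, the order of the two checks is an assumption, and so is the boundary behaviour (this sketch treats the key as expired at exactly `expired_time`).

```python
import time
from typing import Optional

NEVER_EXPIRES = -1  # expired_time sentinel from the text above


def is_expired(expired_time: int, now: Optional[int] = None) -> bool:
    """Compare a Unix-seconds cut-off against the current time."""
    if expired_time == NEVER_EXPIRES:
        return False
    current = int(time.time()) if now is None else now
    return current >= expired_time  # boundary handling is an assumption


def authorize(remain_quota: float, unlimited_quota: bool,
              expired_time: int, now: Optional[int] = None) -> str:
    """Return "ok" only when both the expiry and the budget check pass."""
    if is_expired(expired_time, now):
        return "expired"
    if not unlimited_quota and remain_quota <= 0:
        return "quota_exhausted"
    return "ok"
```

Passing `now` explicitly keeps the check deterministic in tests; in normal use it falls back to the system clock.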

For short-lived keys minted per CI run or per ephemeral agent, see expiring keys.

4. One concrete capped, expiring key

Take a nightly job that reconciles invoices with one cheap model. It runs for a two-week pilot, should never cost more than a few dollars a night, and needs almost no agency. Configure its key in the console key editor (/console/token, Developer+):

Set the spend cap

credit_limit_usd: 40 — the pilot’s whole budget. A runaway retry loop exhausts the key, not your workspace balance.

Set the expiry

expired_time: the Unix timestamp for the end of the pilot window. The key auto-expires and can’t be reused after the pilot ships.

Pair with the other scopes

Add model_limits so it can’t escalate to a frontier model, and allow_ips so a leaked key is useless off the scheduler’s host.

If this agent is hijacked on day three, the damage is bounded to whatever is left of its $40, and the whole key is gone in eleven days regardless. The rest of the workspace is untouched.
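Since `expired_time` is a Unix timestamp in seconds, the end of the pilot window has to be computed before it is typed into the editor. The sketch below derives it and collects the four settings from this example in one place. The function name, the dictionary shape, the list form of `model_limits` and `allow_ips`, the model name, and the IP address (a documentation-range placeholder) are all assumptions for illustration, not the console's actual input format.

```python
from datetime import datetime, timedelta, timezone


def pilot_key_settings(start: datetime, pilot_days: int = 14,
                       budget_usd: float = 40) -> dict:
    """Collect the settings for a capped, expiring pilot key."""
    if start.tzinfo is None:
        # A naive datetime would be read in local time and shift the cut-off.
        raise ValueError("start must be timezone-aware")
    end = start + timedelta(days=pilot_days)
    return {
        "credit_limit_usd": budget_usd,            # the pilot's whole budget
        "expired_time": int(end.timestamp()),      # Unix epoch, seconds
        "model_limits": ["example-cheap-model"],   # hypothetical model name
        "allow_ips": ["203.0.113.10"],             # placeholder scheduler IP
    }


settings = pilot_key_settings(datetime.now(timezone.utc))
print(settings["expired_time"])
```

Requiring a timezone-aware start avoids the most common mistake with epoch values, which is computing the cut-off from local wall-clock time.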

Both fields are limits on the key itself (a dollar amount and a timestamp), not workspace-wide policy. To cap the spend of a single agent run rather than a key’s lifetime, use the Firewall’s cap_cost verdict as the per-run circuit-breaker; see firewall rules. The two compose: the key cap bounds the lifetime, cap_cost bounds a single run.

5. Who can set these

Setting credit_limit_usd and expired_time is part of creating or editing a key, which requires the Developer role or above. Any workspace member can read a key’s masked record; only Developer+ can change its limits. Keys are masked on display — plaintext is shown once at creation (see key masking).

6. Bounded by default

A key with credit_limit_usd: 0 and expired_time: -1 has no spend cap and never expires — maximum agency, worst blast radius. Make that the deliberate exception, not the default.
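One way to keep unbounded keys the deliberate exception is to audit key records for the two sentinels. The sketch below assumes you already have the records as dictionaries; the `name` field and the function name are hypothetical, and only `credit_limit_usd` and `expired_time` come from this page.

```python
from typing import Dict, List


def find_unbounded(keys: List[dict]) -> Dict[str, List[str]]:
    """Map each key name to the bounds it is missing, skipping bounded keys."""
    findings: Dict[str, List[str]] = {}
    for key in keys:
        missing = []
        if key.get("credit_limit_usd") == 0:   # 0 is the "no cap" sentinel
            missing.append("no spend cap")
        if key.get("expired_time") == -1:      # -1 is the "never expires" sentinel
            missing.append("never expires")
        if missing:
            findings[key["name"]] = missing
    return findings
```

Reporting the two gaps separately is useful because a key with a cap but no expiry carries a different risk than one with neither.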

Unlimited vs bounded

When an uncapped, non-expiring key is actually the right call — and when it isn’t.

Least-agency checklist

Run every production key through the same hardening pass before it ships.

7. Related

The token object

Every field on a key, including the quota counters.

Bind policies

Attach a guardrail and a firewall policy to the same key.

Excessive agency

The threat spend caps and expiry are built to contain.

A spend cap and an expiry are the cheapest insurance on a key: two numbers that turn an open-ended credential into one that fails safe — empty or expired — instead of running until your bill notices.
