Block secret and credential leakage

A prompt that carries an AKIA... key, a pasted .env, an agent that echoes its own sk-... token — any of it can ship a live credential to OpenAI, Anthropic, or Google in the clear, where it lands in their logs and yours. Secrets Blocker stops that at the gateway: a one-click guardrail preset that scans the request for credential shapes and rejects the call with HTTP 400 before a single byte leaves your gateway. This is a focused landing for the secret-leakage use case. For the full guardrail engine — every rule type, field, and route — see the Guardrails reference.

1. Prevent api key leak llm flows in one preset

The whole point of prevent api key leak llm plumbing is to catch the credential before the upstream call, not after it’s already in a provider’s request log. The Secrets Blocker preset does exactly that. It is a small guardrail of input-stage block rules, each a regex for a well-known credential shape:

AWS access key

AKIA followed by 16 uppercase-alphanumeric characters — the canonical AWS access-key-id shape.

OpenAI-style secret key

An sk- prefix followed by a long token body — the shape used by OpenAI and several look-alike provider keys.

GitHub personal access token

A ghp_ prefix followed by a 36-character body.

When any rule matches, the request is blocked — the gateway never forwards it. The policy lives in the gateway, not your application, so your app keeps calling /v1/chat/completions exactly as before, with no SDK change and no redeploy.

Input stage, before metering. Secrets Blocker screens what you send. A match rejects the call before the model is invoked, so the credential never reaches the provider and a blocked request costs no quota. To also catch a secret a model emits back to the client, pair it with an output-block preset — see §5.

2. Apply the preset in the console

Every step here is a console action on the hosted gateway under your own session. Creating and editing guardrails requires Developer+ in the workspace. Only the final /v1/* call uses an sk-orca-... relay key.

Open the template

In the console, open Guardrails, click the New guardrail split-button, and pick Secrets & API-Key Blocker from the Secrets template category. It seeds the input-stage block rules.

Name and save

Give it a name (≤ 64 chars), e.g. secrets-blocker, and save. A preset is a seed, not a lock — add or edit rules freely afterward (see §4).

Test it

Open the Test tab, paste a sample credential at the input stage, and run the policy locally — no upstream call, no quota (see §3).

Attach a key

Edit an API key and pick secrets-blocker from the Guardrail dropdown (sets guardrail_id on the key), or mark it the workspace default. See Attach to a key and Account default.

3. Test before you attach

Prove the rule fires before any key points at it. Open the Test tab inside the editor, paste a dummy credential, pick the input stage, and run:

Here is my key: AKIAIOSFODNN7EXAMPLE

The sandbox evaluates the current policy locally — nothing is sent upstream, nothing is metered — and returns the block verdict naming the rule that fired. For an A/B grid against a corpus of leaked-secret and benign samples, the Eval harness lives one tab over.

4. Extend the coverage

Secrets Blocker covers the three highest-traffic shapes. The Secrets category ships sibling presets you can apply alongside it, and you can author your own regex rule for any token your stack issues:

Private Keys & Cloud Tokens

A companion Secrets preset that blocks PEM private keys, Slack and Stripe tokens, Google API keys, and JWTs on the request.

Crypto Wallet Block

Blocks BTC and ETH-style wallet addresses on the request when they should never reach the provider.

To match an internal token format, add a regex rule — RE2 patterns, linear-time, no backreferences — at the input stage with action block. Bad patterns are rejected at save time, so a guardrail you can save always compiles.

Rather than block, want to redact a leaked secret and let the request through sanitized? Use a pii rule with a mask action — the built-in detector set includes aws_access_key, api_key_openai, and jwt, each rendering to a typed tag like [AWS_ACCESS_KEY]. See Actions for block vs. mask.

5. Also catch secrets in the response

Secrets Blocker screens the request. A separate Secrets preset, Code Secret in Output, screens the model’s response for private keys and AWS/OpenAI-style tokens and blocks the call if one leaks back. Output block is enforced both ways: on a non-streaming response the answer is screened before it returns, and on a streaming response a scanner cuts the stream before any blocked content reaches the client. An output-stage block refunds the pre-consumed quota. See Output-stage rules and Streaming coverage.

6. What a block looks like

A blocked request returns HTTP 400 with error code guardrail_blocked and a message naming the guardrail and the rule that fired:

{
  "error": {
    "code": "guardrail_blocked",
    "message": "request blocked by guardrail \"secrets-blocker\": regex(...)"
  }
}

The request costs no quota — an input-stage block fires before metering — and is marked skip-retry, since re-running the same prompt against another channel would just block again. See the guardrail_blocked error.

7. See what fired

Every rule that fires records a match — rule type, action, stage, and a detail string — surfaced in the workspace Matches feed. The matched substring itself (the credential) is recorded only when Log raw content is on, which is off by default.

For a secrets control, leaving Log raw content off is usually the point: capturing the matched substring would re-write the leaked credential straight into your own telemetry. Keep it off unless you have a narrow triage need, and rotate any credential that was caught — a blocked request means the secret was exposed in a prompt, not that it’s safe. See Matches feed and Logging & privacy.

8. Where to go next

Regex detectors

Author your own credential patterns with RE2 regex rules.

Actions

Choose block, mask, flag, annotate, or spotlight per rule — and block, mask, flag, or annotate per entity.

PII Shield

Mask emails, SSNs, and cards to typed tags before the model sees them.

Tune false positives

Mark false positives and tighten detectors from the Matches feed.

Secrets Blocker keeps credentials out of the content you send. To stop an agent from leaking a secret through a tool call — exfiltrating to an attacker-controlled host — use the Firewall and read the data-exfiltration threat and secret-leakage threat. For the complete guardrail engine, see the Guardrails reference.

​1. Prevent api key leak llm flows in one preset

​2. Apply the preset in the console

​3. Test before you attach

​4. Extend the coverage

Private Keys & Cloud Tokens

Crypto Wallet Block

​5. Also catch secrets in the response

​6. What a block looks like

​7. See what fired

​8. Where to go next

Regex detectors

Actions

PII Shield

Tune false positives

1. Prevent api key leak llm flows in one preset

2. Apply the preset in the console

3. Test before you attach

4. Extend the coverage

5. Also catch secrets in the response

6. What a block looks like

7. See what fired

8. Where to go next