Code-security guardrails

When your application sends code into a model — to review it, complete it, or run it through an agent — you want the model warned about the risky parts and the workspace stopped from leaking secrets in the same pass. A code security guardrail does exactly that: it runs your code-security rules over the request before the upstream model sees a single token. This is a focused landing page. For the full guardrail engine — rule types, stages, resolution, the test sandbox — see the Guardrails reference and the guardrails overview.

1. What a code security guardrail actually does

OrcaRouter ships a code_security preset family you apply from the template picker. Each one is an ordinary guardrail rule — workspace-scoped, ordered, attachable to any key — tuned for code:

.env / Secret-File Block

Blocks .env-style secret assignments (DATABASE_URL=, AWS_SECRET_ACCESS_KEY=, API_TOKEN=…) and pasted multi-line config dumps before they reach the provider. Keys on assignment syntax, not the value.
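
To make "keys on assignment syntax, not the value" concrete, here is a minimal Python sketch. The pattern and the names `ENV_ASSIGNMENT` and `find_env_assignments` are illustrative assumptions, not the preset's actual regex.

```python
import re

# Hypothetical pattern: an UPPER_SNAKE_CASE name, "=", and a non-empty value
# at the start of a line. It never looks at what the value contains.
ENV_ASSIGNMENT = re.compile(
    r"^[ \t]*(?:export[ \t]+)?([A-Z][A-Z0-9_]*)=\S+", re.MULTILINE
)

def find_env_assignments(text):
    """Return the variable names of .env-style assignments found in text."""
    return ENV_ASSIGNMENT.findall(text)

sample = (
    "DATABASE_URL=postgres://user:pw@db/app\n"
    "AWS_SECRET_ACCESS_KEY=abc123\n"
    "print('hello')\n"
)
names = find_env_assignments(sample)
```

Because a pattern like this matches on shape alone, it would also fire on a harmless line such as `MAX_RETRIES=3`, which is the trade-off of keying on syntax rather than on the secret's value.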

License Compliance (copyleft)

Flags requests carrying strong-copyleft headers — GPL / AGPL / LGPL / SSPL SPDX tags or full license names — so a reviewer can confirm the code is safe to mix into a permissive codebase. Flag-only.
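
As a rough illustration of header matching, the sketch below looks for SPDX identifiers in the GPL, AGPL, LGPL, and SSPL families, plus a few full license names. The regex, the case-insensitive matching, and the function name `copyleft_hits` are assumptions for illustration; the shipped preset may match differently.

```python
import re

# Hypothetical pattern: group 1 captures an SPDX identifier, group 2 a full name.
COPYLEFT = re.compile(
    r"SPDX-License-Identifier:\s*"
    r"((?:A|L)?GPL-\d+(?:\.\d+)?(?:-only|-or-later|\+)?|SSPL-\d+(?:\.\d+)?)"
    r"|(GNU (?:Affero |Lesser )?General Public License|Server Side Public License)",
    re.IGNORECASE,
)

def copyleft_hits(text):
    """Return each copyleft marker found, as an SPDX tag or a license name."""
    return [m.group(1) or m.group(2) for m in COPYLEFT.finditer(text)]

hits = copyleft_hits("// SPDX-License-Identifier: AGPL-3.0-or-later\nint main() {}")
```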

GPL/AGPL Provenance (output)

Output-stage flag on model suggestions that carry copyleft provenance signatures, a sign that the model may have regurgitated copyleft training data into generated code.

Insecure-API Advisory

Annotates the prompt with a security advisory when it references a high-risk sink — eval( / exec( / os.system( / subprocess.run( / pickle.loads( / child_process.exec(. Non-blocking.
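
The sink list above can be turned into a detector along these lines. The sink strings come from the preset description; the regex construction, including the lookbehind, is an assumption made for this sketch.

```python
import re

SINKS = [
    "eval(", "exec(", "os.system(", "subprocess.run(",
    "pickle.loads(", "child_process.exec(",
]

# Assumed refinement: (?<![\w.]) stops "eval(" from matching inside
# "literal_eval(" and stops "exec(" from double-counting "child_process.exec(".
SINK_RE = re.compile(
    r"(?<![\w.])(" + "|".join(re.escape(s) for s in SINKS) + ")"
)

def find_sinks(text):
    """Return the distinct high-risk sinks referenced in text, sorted."""
    return sorted(set(SINK_RE.findall(text)))

found = find_sinks("result = eval(user_supplied_expr)")
```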

The first three reuse actions you already know: block and flag. The Insecure-API Advisory uses annotate. Instead of rejecting or redacting the request, it augments it with a note the model reads before it answers. The same primitive powers CVE/SBOM decoration (see §3).

The code_security presets are deterministic — pure regex, no network call, safe on the hot path. The networked scanners (CVE lookup, SBOM, SAST) are separate external connections, not presets. See §3.

2. Annotate — warn the model without changing the traffic

The actions you configure on a guardrail are block (reject the call, HTTP 400), mask (redact the match), and flag (log only). Code security adds a fourth behavior under the hood: annotate, which neither blocks nor masks. When an annotate rule matches, the gateway records a short note and the relay injects it upstream as a system advisory — so the model is told, e.g., “this request references a high-risk API (code eval, shell execution, or unsafe deserialization); prefer safer alternatives” — before it answers. The user’s text is never rejected and never rewritten.
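
One way to picture the injection step is a function that prepends a system message to an OpenAI-style chat payload and leaves the user's message alone. This is a sketch under that assumption; the exact form in which the relay injects the advisory is not specified here, and `annotate` is a hypothetical name.

```python
from copy import deepcopy

def annotate(payload, note):
    """Return a copy of the payload with `note` prepended as a system message."""
    out = deepcopy(payload)  # never mutate the caller's request
    out["messages"] = [{"role": "system", "content": note}] + out["messages"]
    return out

request = {
    "model": "openai/gpt-4o-mini",
    "messages": [
        {"role": "user",
         "content": "Refactor this: result = eval(user_supplied_expr)"}
    ],
}
upstream = annotate(
    request,
    "This request references a high-risk API; prefer safer alternatives.",
)
```

The original `request` still holds a single user message, which mirrors the guarantee that the user's text is neither rejected nor rewritten.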

One concrete example

Apply the Insecure-API Advisory preset to a guardrail and attach it to a key. Then send code that calls a dangerous sink:

curl https://api.orcarouter.ai/v1/chat/completions \
  -H "Authorization: Bearer sk-orca-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o-mini",
    "messages": [
      {"role": "user",
       "content": "Refactor this: result = eval(user_supplied_expr)"}
    ]
  }'

The user's content and the chosen model are unchanged, but the gateway prepends a security advisory the model reads first. The completion comes back steered toward safer alternatives, such as a restricted expression parser in place of eval, plus input validation, with no code change in your application and no second round-trip.

Annotate composes with the other actions. A single guardrail can mask a secret and annotate the same request — the text is redacted and a note is added in one pass.
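
A single pass that both masks and annotates might look like the sketch below. The secret shape, the placeholder text, and the function name `apply_rules` are hypothetical; only the behavior (redact the match, add a note, forward the rest) follows the description above.

```python
import re

SECRET_RE = re.compile(r"sk-[A-Za-z0-9]{8,}")  # assumed secret shape
SINK_RE = re.compile(r"\beval\(")              # one sink, for brevity

def apply_rules(text):
    """Mask secrets, then collect advisory notes; return (text, notes)."""
    notes = []
    masked = SECRET_RE.sub("[REDACTED]", text)  # mask: rewrite the match
    if SINK_RE.search(masked):                  # annotate: add a note only
        notes.append("high-risk API referenced; prefer safer alternatives")
    return masked, notes

masked_text, notes = apply_rules("key = 'sk-abcdef123456'; eval(expr)")
```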

3. CVE and SBOM decoration via external scanners

The advisory primitive generalizes. Connect a code-security scanner as an external vendor and its findings ride the same annotate path:

Dependency CVE lookup (OSV)

Extracts imports and manifest pins from the request text and cross-references them against the public OSV vulnerability database. A hit decorates the prompt with, e.g., “requests@2.0.0 has CVE-2014-1830 (HIGH). Fixed in 2.20.0.” — so the model is told about a known vulnerability in a package it was asked to use. Free and unauthenticated, so there’s no API-key field. Defaults to annotate; you can set it to flag or block instead.
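
For orientation, the public OSV API accepts a POST to `https://api.osv.dev/v1/query` with a package name, ecosystem, and version. The sketch below pulls `name==version` pins out of request text and builds that query without sending it. The extraction regex and helper names are assumptions; the gateway's own extraction covers imports and other manifest formats and may differ.

```python
import json
import re
import urllib.request

OSV_QUERY_URL = "https://api.osv.dev/v1/query"

# Assumed extraction rule: requirements.txt-style "name==version" pins only.
PIN_RE = re.compile(
    r"^[ \t]*([A-Za-z0-9][A-Za-z0-9._-]*)==([0-9][0-9A-Za-z.]*)", re.MULTILINE
)

def extract_pins(text):
    """Return (package, version) pairs found in the text."""
    return PIN_RE.findall(text)

def build_osv_request(name, version, ecosystem="PyPI"):
    """Build, but do not send, an OSV query for one pinned package."""
    body = {"version": version, "package": {"name": name, "ecosystem": ecosystem}}
    return urllib.request.Request(
        OSV_QUERY_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

pins = extract_pins("requests==2.0.0\nflask==2.3.2\n")
req = build_osv_request(*pins[0])
# urllib.request.urlopen(req) would send it; OSV lists matches under "vulns".
```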

SBOM and SAST scanners

Connect an SBOM (software bill-of-materials) or SAST (static-analysis) scanner the same way you connect any external vendor — a base URL plus credentials, stored encrypted and masked on read. Each finding carries a stable identity, so a finding you’ve already triaged doesn’t re-fire on every request.
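
A stable identity can be pictured as a hash over the fields that define a finding, so the same finding produces the same ID on every request. Which fields the gateway actually uses is not documented here; the four chosen below, and the `triaged` set, are assumptions for illustration.

```python
import hashlib

def finding_id(scanner, rule_id, package, version):
    """Derive a deterministic ID from the fields that define a finding."""
    key = "\x1f".join((scanner, rule_id, package, version))  # unit separator
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]

triaged = set()  # IDs a reviewer has already handled

def should_fire(fid):
    """A finding fires only if it has not been triaged yet."""
    return fid not in triaged

fid = finding_id("osv", "CVE-2014-1830", "requests", "2.0.0")
first = should_fire(fid)   # not yet triaged
triaged.add(fid)
second = should_fire(fid)  # suppressed after triage
```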

External scanners follow the same fail-open default as every advanced rule: a scanner error or timeout is recorded as telemetry and the request continues. Set fail_open to false on the rule to fail closed for policies where a missed scan is unacceptable.
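
The difference between failing open and failing closed comes down to what happens in the error path. A minimal sketch, assuming a scanner is any callable that returns a list of findings or raises; the result shape and the names `run_scanner` and `broken_scanner` are hypothetical:

```python
def run_scanner(scan, text, fail_open=True):
    """Run a scanner; on error, record it and block only if fail_open is False."""
    try:
        return {"findings": scan(text), "error": None, "blocked": False}
    except Exception as exc:
        # The error is kept as telemetry in both modes.
        return {"findings": [], "error": repr(exc), "blocked": not fail_open}

def broken_scanner(_text):
    raise TimeoutError("scanner timed out")

open_result = run_scanner(broken_scanner, "import requests")
closed_result = run_scanner(broken_scanner, "import requests", fail_open=False)
```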

4. Pairing with secrets and license rules

A code-security guardrail rarely rides alone. The common shape is one guardrail with a few rules:

Goal	Rule
Stop pasted credentials	`.env` / Secret-File Block (block)
Catch inline secret values	Secrets Blocker (block)
Gate copyleft code	License Compliance (flag)
Steer dangerous sinks	Insecure-API Advisory (annotate)

Add them all to one named guardrail, attach it to your coding agent's key, and every request is screened: block on the unambiguous violations, annotate the judgment calls, flag the rest for review.
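
The sketch below shows how an ordered rule list of this kind could be evaluated. The patterns are simplified stand-ins for the presets, the value-based Secrets Blocker is omitted, and stopping at the first block is an assumption about ordering rather than documented engine behavior.

```python
import re
from dataclasses import dataclass

@dataclass
class Rule:
    name: str
    pattern: str
    action: str  # "block", "flag", or "annotate"

# Simplified, hypothetical patterns standing in for the real presets.
POLICY = [
    Rule(".env / Secret-File Block", r"(?m)^[A-Z][A-Z0-9_]*=\S+", "block"),
    Rule("License Compliance", r"SPDX-License-Identifier:\s*A?GPL", "flag"),
    Rule("Insecure-API Advisory", r"\beval\(", "annotate"),
]

def evaluate(text, policy=POLICY):
    """Apply rules in order; stop at the first block (assumed behavior)."""
    result = {"blocked_by": None, "flags": [], "notes": []}
    for rule in policy:
        if not re.search(rule.pattern, text):
            continue
        if rule.action == "block":
            result["blocked_by"] = rule.name
            break
        key = "flags" if rule.action == "flag" else "notes"
        result[key].append(rule.name)
    return result

blocked = evaluate("API_TOKEN=abc123")
advised = evaluate("# SPDX-License-Identifier: GPL-3.0-only\nx = eval(y)")
```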

A blocked request returns HTTP 400 guardrail_blocked and costs no quota — an input-stage block fires before metering. It is also marked skip-retry, so re-running the same prompt against another channel just blocks again. See the guardrail-blocked error.
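
On the client side, the practical consequence is that a guardrail block should not enter your retry loop. A sketch of that decision follows, assuming the error code appears under `error.code` in the JSON body; that location and the list of retryable statuses are assumptions, so check the guardrail-blocked error page for the real shape.

```python
RETRYABLE_STATUSES = (429, 500, 502, 503, 504)  # assumed retry policy

def should_retry(status, body):
    """Decide whether a failed call is worth retrying."""
    code = (body.get("error") or {}).get("code")  # assumed body layout
    if status == 400 and code == "guardrail_blocked":
        return False  # the same prompt would just block again
    return status in RETRYABLE_STATUSES

retry_blocked = should_retry(400, {"error": {"code": "guardrail_blocked"}})
retry_overload = should_retry(503, {})
```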

5. Configure it (console + roles)

Everything here is configured in the console, not via the relay key. Management routes (/api/guardrail/*) authenticate with your session / access token, not the sk- relay key. Reads — listing guardrails and the Matches feed — are open to every workspace member. Writes (create / edit / delete) and the test sandbox require the Developer role or above: the sandbox can fire paid model calls and outbound vendor requests, so it’s gated like a write.

Create the guardrail

In the console, open Guardrails → New guardrail. The split-button drops you into the template library — pick a Code security preset as your starting point.

Edit freely

A preset is a seed, not a lock. Tune the regex, add a Secrets Blocker rule, change an action. Use the Test tab to prove a rule fires the way you expect against sample text before you attach it to a key.

Attach it to a key

Set the guardrail on an API key (guardrail_id), or mark it as the workspace default. The binding lives on the key in the gateway, so an edit to the guardrail takes effect for every attached key on the next call.

Findings land in the workspace Matches feed (rule type, action, stage, detail). The matched substring is recorded only when Log raw content is on — off by default, the privacy-conservative posture. See logging & privacy.
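
In code terms, the privacy default means the matched text is simply absent from the record unless logging of raw content is switched on. The field names below echo the columns listed above, but the record shape and the `log_raw_content` parameter name are assumptions for this sketch.

```python
def match_record(rule_type, action, stage, detail, matched, log_raw_content=False):
    """Build a Matches-feed entry; include the matched text only when opted in."""
    record = {
        "rule_type": rule_type,
        "action": action,
        "stage": stage,
        "detail": detail,
    }
    if log_raw_content:
        record["matched"] = matched
    return record

default_entry = match_record(
    "regex", "annotate", "input", "Insecure-API Advisory", "eval("
)
verbose_entry = match_record(
    "regex", "annotate", "input", "Insecure-API Advisory", "eval(",
    log_raw_content=True,
)
```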

6. Where to go next

Block secrets — the companion rule that catches credential values in request args.
Actions — block, mask, flag, annotate, and spotlight in depth.
Compliance logger — keep an immutable record of every code-security finding.
Testing & eval — prove your policy catches known-bad code before you ship it.
Guardrails reference — the full engine.
Securing AI agents — where code-security rails fit in the zero-trust control stack.
