1. What a code security guardrail actually does
OrcaRouter ships acode_security preset family you apply from the
template picker. Each one is an ordinary guardrail
rule — workspace-scoped, ordered, attachable to any key — tuned for code:
.env / Secret-File Block
Blocks
.env-style secret assignments (DATABASE_URL=,
AWS_SECRET_ACCESS_KEY=, API_TOKEN=…) and pasted multi-line config
dumps before they reach the provider. Keys on assignment syntax,
not the value.License Compliance (copyleft)
Flags requests carrying strong-copyleft headers — GPL / AGPL / LGPL /
SSPL SPDX tags or full license names — so a reviewer can confirm the
code is safe to mix into a permissive codebase. Flag-only.
GPL/AGPL Provenance (output)
Output-stage flag on model suggestions that carry copyleft
provenance signatures — a marker the model may have regurgitated
copyleft training data into generated code.
Insecure-API Advisory
Annotates the prompt with a security advisory when it references a
high-risk sink —
eval( / exec( / os.system( / subprocess.run(
/ pickle.loads( / child_process.exec(. Non-blocking.The
code_security presets are deterministic — pure regex, no network
call, safe on the hot path. The networked scanners (CVE lookup, SBOM,
SAST) are separate external connections, not presets. See §3.2. Annotate — warn the model without changing the traffic
The actions you configure on a guardrail are block (reject the call, HTTP 400), mask (redact the match), and flag (log only). Code security adds a fourth behavior under the hood: annotate, which neither blocks nor masks. When an annotate rule matches, the gateway records a short note and the relay injects it upstream as a system advisory — so the model is told, e.g., “this request references a high-risk API (code eval, shell execution, or unsafe deserialization); prefer safer alternatives” — before it answers. The user’s text is never rejected and never rewritten.One concrete example
Apply the Insecure-API Advisory preset to a guardrail and attach it to a key. Then send code that calls a dangerous sink:3. CVE and SBOM decoration via external scanners
The advisory primitive generalizes. Connect a code-security scanner as an external vendor and its findings ride the same annotate path:Dependency CVE lookup (OSV)
Dependency CVE lookup (OSV)
Extracts imports and manifest pins from the request text and
cross-references them against the public OSV vulnerability database.
A hit decorates the prompt with, e.g., “requests@2.0.0 has
CVE-2014-1830 (HIGH). Fixed in 2.20.0.” — so the model is told about
a known vulnerability in a package it was asked to use. Free and
unauthenticated, so there’s no API-key field. Defaults to annotate;
you can set it to flag or block instead.
SBOM and SAST scanners
SBOM and SAST scanners
Connect an SBOM (software bill-of-materials) or SAST (static-analysis)
scanner the same way you connect any external vendor — a base URL plus
credentials, stored encrypted and masked on read. Each finding carries
a stable identity, so a finding you’ve already triaged doesn’t re-fire
on every request.
fail_open to false on the rule to fail closed for
policies where a missed scan is unacceptable.
4. Pairing with secrets and license rules
A code-security guardrail rarely rides alone. The common shape is one guardrail with a few rules:| Goal | Rule |
|---|---|
| Stop pasted credentials | .env / Secret-File Block (block) |
| Catch inline secret values | Secrets Blocker (block) |
| Gate copyleft code | License Compliance (flag) |
| Steer dangerous sinks | Insecure-API Advisory (annotate) |
5. Configure it (console + roles)
Everything here is configured in the console, not via the relay key. Management routes (/api/guardrail/*) authenticate with your session /
access token, not the sk- relay key. Reads — listing guardrails and the
Matches feed — are open to every workspace member. Writes (create /
edit / delete) and the test sandbox require the Developer role or
above: the sandbox can fire paid model calls and outbound vendor requests,
so it’s gated like a write.
Create the guardrail
In the console, open Guardrails → New guardrail. The split-button
drops you into the template library — pick a Code security preset
as your starting point.
Edit freely
A preset is a seed, not a lock. Tune the regex, add a Secrets Blocker
rule, change an action. Use the Test tab to prove a rule fires the
way you expect against sample text before you attach it to a key.
Findings land in the workspace Matches feed (rule type, action, stage,
detail). The matched substring is recorded only when Log raw content
is on — off by default, the privacy-conservative posture. See
logging & privacy.
6. Where to go next
- Block secrets — the companion rule that catches credential values in request args.
- Actions — block, mask, flag, annotate, and spotlight in depth.
- Compliance logger — keep an immutable record of every code-security finding.
- Testing & eval — prove your policy catches known-bad code before you ship it.
- Guardrails reference — the full engine.
- Securing AI agents — where code-security rails fit in the zero-trust control stack.
