Supply-chain risk in agent tooling

Every MCP server you register, every skill an agent installs, and every host a tool reaches is a dependency you didn’t write. An agent’s supply chain is dynamic: it grows at runtime, often without a human in the loop, so the classic “review the lockfile at build time” model doesn’t hold. A community skill can be hijacked after you trust it; a remote MCP server can quietly add a tool; a fetch tool can be steered to an attacker-controlled host. OrcaRouter’s answer is to govern the supply chain where it acts (at the gateway, on first use) instead of trying to vet every dependency at install time. This page is the use-case landing page for AI supply chain security; the Firewall and Skills references carry the full mechanics.

1. AI supply chain security for agents, at the gateway

The choke point is the relay path. Whether a capability was hand-registered, auto-installed by the agent, or pulled from a community registry, its first tool call crosses api.orcarouter.ai — and that’s where the Firewall evaluates it. Four controls compose into a single posture:

MCP gateway, per-call eval

Every tools/call is evaluated against your policy before dispatch — the manifest is never the source of truth.

Skill risk-bands & quarantine

Installed capabilities are scanned, scored, and held for review until a human approves them.

Encrypted MCP credentials

Server auth secrets are encrypted at rest and injected at dispatch — never exposed to the model, the agent, or call arguments.

Egress allow-lists

Pin where tool calls may send data, so a compromised dependency can’t exfiltrate to a host you never approved.

Detection is at the gateway, on first use — not in your package manager or filesystem. That’s deliberate: it’s the one path that sees every agent and every tool call regardless of how the capability got there.

2. The threat: a dependency that grows after you trust it

| Vector | What happens |
| --- | --- |
| Rug-pull | A registered MCP server adds a tool (`shell.exec`, a new `fetch`) you never approved. |
| Skill creep | An installed skill uses tools or hosts its manifest never declared. |
| Credential theft | A compromised server’s tool implementation reads its own auth secret to call home. |
| Egress exfiltration | A retrieve→send chain ships your data to an attacker-controlled host. |

The common root cause: “I trust this server” is treated as permanent, and the agent keeps calling new or modified tools with no further review.

3. One concrete example — registering and pinning an MCP server

You register a third-party MCP server from the console (Settings → Firewall → MCP servers; writes need Developer+). The server’s auth secret is stored encrypted — you supply it once, the gateway injects it at dispatch, and it’s masked on every read after that. An MCP server record carries:

| Field | Values |
| --- | --- |
| `auth_mode` | `none`, `bearer`, `oauth`, `basic` |
| `status` | `ok`, `degraded`, `down` (set by the health probe) |
| `credentials` | encrypted at rest, never returned in plaintext |

After registering, probe it from the console to enumerate its current tools. The probe is a workspace-session (/api/workspace/firewall/*) operation that needs Developer+, not a relay key — register, probe, and rule-authoring all happen on the management plane:

```shell
# Console / management plane: workspace session, Developer+.
# (The relay sk-orca-... key is for /v1/* traffic only.)
curl -X POST https://api.orcarouter.ai/api/workspace/firewall/mcp_servers/<id>/probe \
  -H "Authorization: Bearer <workspace-session-token>"
```

The probe persists the server’s reachability and snapshots a baseline hash of its advertised tool set (trust-on-first-use). Then scope a Firewall rule with tool_name_glob: <server>.* to pending_approval until you’ve seen a clean call history — every call from that server is held for a human before it runs. Once you trust it, relax the rule to audit or allow. From that point on the MCP gateway evaluates every tools/call on the mcp surface before dispatch — so if a rug-pull later adds an undeclared tool, your policy, not the server’s manifest, decides whether it runs.
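The scoping step above can be pictured as glob matching on the tool name. The following is a minimal sketch, not OrcaRouter’s rule engine: the `Rule` class, the `evaluate` function, first-match-wins ordering, the default of `deny`, and the server name `acme` are all assumptions made for illustration. Only the `tool_name_glob` field and the verdict names come from this page.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Rule:
    tool_name_glob: str  # field name taken from this page
    verdict: str         # e.g. "pending_approval", "audit", "allow"


def evaluate(tool_name: str, rules: list[Rule], default: str = "deny") -> str:
    """Return the verdict of the first rule whose glob matches the tool name.

    First-match-wins and the "deny" default are assumptions of this sketch.
    """
    for rule in rules:
        if fnmatchcase(tool_name, rule.tool_name_glob):
            return rule.verdict
    return default


# Hold every tool of the hypothetical "acme" server for human approval.
rules = [Rule(tool_name_glob="acme.*", verdict="pending_approval")]

print(evaluate("acme.search", rules))      # matched by acme.*
print(evaluate("acme.shell.exec", rules))  # a later-added tool is matched too
print(evaluate("other.fetch", rules))      # no rule matches, falls to default
```

Because the glob is scoped to the server prefix rather than to a list of known tools, a tool added after registration falls under the same rule without any manifest update.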

Re-probe after any upstream version bump (POST /api/workspace/firewall/mcp_servers/:id/probe, Developer+). If the advertised tool set drifts from the approved baseline, the server’s schema_status flips to changed and dispatch fails closed until an admin re-baselines (approve_schema) or quarantines it — the rug-pull can’t go live silently.
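To make the trust-on-first-use baseline concrete, here is a small sketch of hashing an advertised tool set and detecting drift on a later probe. The page does not say which hash or canonical form the gateway uses, so SHA-256 over sorted JSON is an assumption, as are the function names and the `unchanged` value (only `changed` is named on this page).

```python
import hashlib
import json


def tool_set_hash(tools: list[dict]) -> str:
    """Hash the advertised tool set in a canonical, order-independent form."""
    canonical = json.dumps(
        sorted(tools, key=lambda tool: tool["name"]),
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def schema_status(baseline_hash: str, probed_tools: list[dict]) -> str:
    """Compare a fresh probe against the approved baseline hash."""
    if tool_set_hash(probed_tools) == baseline_hash:
        return "unchanged"  # hypothetical label for the no-drift case
    return "changed"


# First probe: snapshot the baseline.
approved = [{"name": "search", "description": "Search the index"}]
baseline = tool_set_hash(approved)

# Later probe: the server now advertises an extra tool.
reprobed = approved + [{"name": "shell.exec", "description": "Run a command"}]

print(schema_status(baseline, approved))
print(schema_status(baseline, reprobed))
```

Hashing the whole tool definition, not just the name, means an edited description or schema also counts as drift in this sketch.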

4. Skill risk-bands & quarantine

Every installable capability — whether you registered it or the gateway auto-detected it at runtime — is run through the skill scanner. Findings roll up to a risk band and an enforcement mode:

Risk bands

low · medium · high · critical. The band is derived from deterministic scanner passes over the manifest and declared scopes (undeclared tool use, network egress outside approved scopes, unsafe filesystem writes, injection-shaped manifest text).

Enforcement modes

allow (your policy rules decide), quarantine (any non-deny verdict escalates to pending_approval — a human approves each call), block (force deny on all of this skill’s tools regardless of rules). A high-band skill quarantines automatically; critical blocks.

Why auto-detected = always quarantined

A capability an agent self-installs, or a tool a rug-pull adds, is held in pending_approval regardless of its scan score until a human reviews it. A server operator can’t quietly add a tool and have your agents start using it.

The enforcement mode only ever ratchets tighter — approving a skill never relaxes a block a fresh scan set.
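The band-to-mode mapping and the ratchet can be sketched in a few lines. This is an illustration under stated assumptions, not the scanner’s implementation: the page only states that high quarantines and critical blocks, so mapping low and medium to `allow` is assumed, and auto-detection is modeled as a floor of `quarantine`. All function names are hypothetical.

```python
MODE_ORDER = ["allow", "quarantine", "block"]  # loosest to tightest

# high and critical come from this page; low and medium are assumed.
BAND_TO_MODE = {
    "low": "allow",
    "medium": "allow",
    "high": "quarantine",
    "critical": "block",
}


def tighter(a: str, b: str) -> str:
    """Return whichever enforcement mode is stricter."""
    return max(a, b, key=MODE_ORDER.index)


def mode_for(band: str, auto_detected: bool) -> str:
    """Derive the mode from the band; auto-detected skills never start below quarantine."""
    mode = BAND_TO_MODE[band]
    return tighter(mode, "quarantine") if auto_detected else mode


def ratchet(current_mode: str, scanned_mode: str) -> str:
    """A fresh scan can tighten the mode but never relax it."""
    return tighter(current_mode, scanned_mode)


def final_verdict(mode: str, rule_verdict: str) -> str:
    """Apply the enforcement mode on top of the policy rule's verdict."""
    if mode == "block":
        return "deny"
    if mode == "quarantine" and rule_verdict != "deny":
        return "pending_approval"
    return rule_verdict


print(mode_for("low", auto_detected=True))
print(final_verdict("quarantine", "allow"))
print(ratchet("block", "allow"))
```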

5. Egress allow-lists — contain the “call home”

The most damaging supply-chain outcome is a compromised dependency that exfiltrates. The Firewall’s egress surface evaluates the outbound destination (host / IP / CIDR) a tool reports, so you can pin where data is allowed to go. You author an egress rule yourself: a host/CIDR allow-list with a cidr_match predicate denies everything off-list. Combine it with a sequence rule that breaks the retrieve→egress chain, and a poisoned tool that tries to ship a retrieved document to an unknown host is denied at the gateway.
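As a rough picture of what a host/CIDR allow-list decides, the sketch below checks a reported destination against approved hosts and networks and denies everything else. It uses Python’s standard `ipaddress` module; the function name, the example host, and the example network (a documentation range) are placeholders, and the real rule syntax lives in the Firewall: Rules reference.

```python
import ipaddress

# Placeholder allow-list: one hostname and one documentation-range network.
ALLOWED_HOSTS = {"api.example.com"}
ALLOWED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]


def egress_verdict(destination: str) -> str:
    """Allow a destination only if it is on the allow-list; deny otherwise."""
    try:
        address = ipaddress.ip_address(destination)
    except ValueError:
        # Not an IP literal, so treat it as a hostname.
        return "allow" if destination.lower() in ALLOWED_HOSTS else "deny"
    on_list = any(address in network for network in ALLOWED_NETWORKS)
    return "allow" if on_list else "deny"


print(egress_verdict("203.0.113.7"))       # inside the approved network
print(egress_verdict("api.example.com"))   # approved hostname
print(egress_verdict("evil.example.net"))  # off-list host
```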

The tight autonomy level ships an SSRF preset, but it denies fetch-shaped tool names (http_fetch, web_search, fetch_url, request) — it is not a CIDR/cloud-metadata denylist. If you need RFC-1918 / metadata / specific-CIDR egress blocking, author the egress host/CIDR deny rule yourself. See Firewall: Rules for the cidr_match operator and egress scoping.
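The difference between a name-based preset and a CIDR deny rule is easiest to see side by side. In this sketch the tool names are the ones listed above, while the exact-match comparison, the function names, the tool name `download_report`, and the choice of networks (the RFC 1918 ranges plus the link-local range that contains the common cloud metadata address) are assumptions for illustration.

```python
import ipaddress

# Tool names this page lists as covered by the SSRF preset.
FETCH_SHAPED_NAMES = {"http_fetch", "web_search", "fetch_url", "request"}

# A deny list you would author yourself.
DENIED_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in (
        "10.0.0.0/8",      # RFC 1918
        "172.16.0.0/12",   # RFC 1918
        "192.168.0.0/16",  # RFC 1918
        "169.254.0.0/16",  # link-local, includes 169.254.169.254
    )
]


def name_preset_denies(tool_name: str) -> bool:
    """Name-based check: only looks at what the tool is called."""
    return tool_name in FETCH_SHAPED_NAMES


def cidr_rule_denies(destination_ip: str) -> bool:
    """Destination-based check: looks at where the call is going."""
    address = ipaddress.ip_address(destination_ip)
    return any(address in network for network in DENIED_NETWORKS)


# A hypothetical tool with an innocuous name targeting the metadata address.
print(name_preset_denies("download_report"))
print(cidr_rule_denies("169.254.169.254"))
```

A tool whose name is not fetch-shaped passes the name check regardless of its destination, which is why the destination rule has to be authored separately.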

6. Encrypted credentials — a compromised server can’t read your keys

Server auth secrets are encrypted at rest and injected by the gateway at dispatch time. They never reach the model, the agent, or the tool-call arguments — so a compromised or malicious server can’t exfiltrate your API keys by reading its own credential blob. The console always returns the secret masked — even to an Admin. The decrypted value is handed out on exactly one path: a request bearing a firewall-gateway-scoped token (a dedicated token type an Admin explicitly mints for the gateway/proxy), so an ordinary leaked relay key can’t enumerate your MCP credentials.
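The separation between the masked console view and dispatch-time injection can be sketched as follows. This is a simplified model, not the gateway’s code: encryption at rest is not shown (the `secret` field stands in for the decrypted value), and the class name, function names, mask string, and the scope label `firewall-gateway` are hypothetical.

```python
from dataclasses import dataclass, field

MASK = "********"


@dataclass
class McpServer:
    name: str
    auth_mode: str                   # "none", "bearer", "oauth", or "basic"
    secret: str = field(repr=False)  # stand-in for the decrypted secret

    def console_view(self) -> dict:
        """What a console read returns: the secret is always masked."""
        return {"name": self.name, "auth_mode": self.auth_mode, "credentials": MASK}


def read_secret(server: McpServer, token_scope: str) -> str:
    """Release the secret only to a gateway-scoped token (hypothetical scope label)."""
    if token_scope != "firewall-gateway":
        raise PermissionError("secret is only released to a gateway-scoped token")
    return server.secret


def dispatch_headers(server: McpServer, token_scope: str) -> dict:
    """Build upstream auth headers at dispatch time; tool arguments never carry the secret."""
    if server.auth_mode != "bearer":
        return {}
    return {"Authorization": f"Bearer {read_secret(server, token_scope)}"}


server = McpServer(name="acme", auth_mode="bearer", secret="s3cr3t")
print(server.console_view())
print(dispatch_headers(server, "firewall-gateway"))
```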

7. Rolling it up for an audit

Supply-chain governance is also an audit artifact. OrcaRouter maps to the OWASP Top 10 for LLM Applications — including the LLM05 Supply Chain control — as part of the compliance engine, alongside frameworks like soc2, iso_27001, iso_42001, nist_ai_rmf, and the eu_ai_act. Installing a compliance pack (POST /api/compliance/packs/:key/install, workspace Admin, paid plan) materializes the matching guardrails and firewall policies and starts in an observe-first posture. Compliance reports include an AI-supply-chain evidence section — the upstream providers your workspace actually routed to, plus a privileged-access and key-hygiene review — and are Ed25519-signed and publicly verifiable. Browsing the catalog and readiness is free to every Member; see Compliance for the full lifecycle.

MCP governance is two complementary layers: per-call firewall evaluation on the mcp surface (enforcement on what a dependency does), plus a tool-schema integrity baseline (a trust-on-first-use hash of the advertised tool set, re-checked on every probe; drift flips the server’s schema_status to changed and fails dispatch closed until an admin re-baselines or quarantines it). Together with skill risk-bands and quarantine, that gives you both enforcement on what a dependency does and a verifiable record of what it declared.

8. A supply-chain baseline

Before you trust a new MCP server or skill

Register it, probe its tool set, and scope a <server>.* rule to pending_approval or audit. Read the scan findings — any undeclared-tool or external-egress finding is a reason to keep it quarantined. Verify who controls the endpoint URL.

In steady state

Keep an egress allow-list pinned for any agent with fetch/search/export tools. Watch the Discovered tools view for capabilities that appeared without a rule, and the anomaly feed for novel tool-to-tool paths.

After a suspected rug-pull

Disable the server (PUT .../mcp_servers, "enabled": false) — its credentials are never decrypted while disabled. Re-probe to surface new tools, rescan the skill, and review the pending_approval queue rather than bulk-approving.
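When reviewing a re-probe after a suspected rug-pull, it helps to see exactly which tools appeared or disappeared rather than only that something changed. A minimal sketch, assuming you have the approved and re-probed tool names as plain sets; the function name and the example tool names are illustrative.

```python
def tool_drift(baseline_names: set[str], probed_names: set[str]) -> dict[str, list[str]]:
    """List tools added or removed since the approved baseline."""
    return {
        "added": sorted(probed_names - baseline_names),
        "removed": sorted(baseline_names - probed_names),
    }


baseline_names = {"search", "summarize"}
probed_names = {"search", "shell.exec"}

drift = tool_drift(baseline_names, probed_names)
print(drift["added"])    # candidates for review before any approval
print(drift["removed"])  # tools the server no longer advertises
```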

9. Related threats & concepts

MCP tool poisoning & rug-pulls — the deep dive on malicious and hijacked MCP servers.
Data exfiltration — egress rules that restrict where tool calls may send data.
Dangerous tool calls — blocking destructive actions regardless of where the tool came from.
Secret leakage — keeping credentials out of prompts, arguments, and logs.
Securing AI agents and the control stack — how these controls fit the broader posture.

Firewall: MCP Servers

Register MCP servers behind the gateway, probe their tools, and apply a per-call verdict before any call reaches the real server.

Firewall: Skills

Scan and risk-score every installable capability. Quarantine or block risky skills before their tools run.