Trust checklist for third-party MCP servers

A third-party MCP server is an unreviewed bundle of tools, a live credential, and fresh network reach. The moment an agent dials one directly, nobody is watching the call, and “the server changed its tools after you approved it” is a real attack, not a hypothetical. Before you point an agent at a server someone else operates, you want a repeatable pre-flight. This page is that pre-flight: a short, ordered checklist to vet MCP server connections on OrcaRouter using controls that already exist: per-call evaluation, default-deny allow-listing, egress limits, encrypted credentials, and skill quarantine. Each step links to the focused how-to for depth. Run it once per new server; re-run the drift-sensitive steps whenever the server changes.

Every configuration step here is done from the console (or the REST API with your session/access token) and is role-gated. Only the firewall gateway routes and /v1/* relay calls carry an sk-orca-...-style key.

1. The checklist to vet MCP server connections

Work top to bottom. The first three steps are mandatory for any server you don’t operate yourself; the rest harden it.

1. Probe before you trust

Discover the real tool list and reachability before writing a single rule.

2. Default-deny, then allow-list

Permit only the tools you reviewed; everything else is denied.

3. Encrypt the credential

Store auth so it’s encrypted at rest, masked on read, never seen by the model.

4. Lock egress

Constrain where the server’s tools may reach on the network.

5. Quarantine self-installed skills

Hold anything the agent installs on its own until a human reviews it.

6. Shadow first, then watch

Roll out in audit-only, then read events and anomalies before enforcing.

2. Probe before you trust

You cannot review tools you’ve never seen, and a server’s advertised tool list is the thing most likely to change under you. Register the server, then probe it — the gateway runs an MCP initialize + tools/list against the endpoint and returns the real tools with their input schemas, plus a reachability status of ok, degraded, or down.

# Console route, called with your session/access token (UserAuth). Developer+.
curl -X POST \
  https://api.orcarouter.ai/api/workspace/firewall/mcp_servers/42/probe \
  -H "Authorization: Bearer <your-access-token>"

Read every tool name and what its arguments accept. A server advertising a shell.exec or an http_fetch you didn’t expect is a finding, not a detail — that’s the whole point of probing first.

Re-probe whenever a server changes hands or you suspect drift. A new tool appearing in the list — the “rug pull” — is exactly what you’re watching for. See Rug-pull defense.
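One way to make re-probing repeatable is to save the tool list you reviewed and diff it against each new probe. The sketch below is illustrative only: it assumes the probe result can be reduced to a `{"tools": [{"name": ..., "inputSchema": ...}]}` shape (the field names follow MCP's tools/list, but the exact OrcaRouter response body is an assumption), and `diff_probes` is a hypothetical helper, not part of any SDK.

```python
from typing import Any, Dict, List


def index_tools(probe: Dict[str, Any]) -> Dict[str, Any]:
    """Map tool name -> input schema for a probe result (assumed shape)."""
    return {tool["name"]: tool.get("inputSchema") for tool in probe.get("tools", [])}


def diff_probes(approved: Dict[str, Any], latest: Dict[str, Any]) -> Dict[str, List[str]]:
    """Compare the tool set you reviewed against the latest probe."""
    before, after = index_tools(approved), index_tools(latest)
    shared = before.keys() & after.keys()
    return {
        "added": sorted(after.keys() - before.keys()),    # new tools: review before allowing
        "removed": sorted(before.keys() - after.keys()),  # tools that disappeared
        "changed": sorted(name for name in shared if before[name] != after[name]),
    }


# Hypothetical snapshots: the reviewed list and a later probe.
approved = {"tools": [{"name": "create_issue"}, {"name": "get_issue"}]}
latest = {"tools": [{"name": "create_issue"}, {"name": "get_issue"}, {"name": "delete_repo"}]}

drift = diff_probes(approved, latest)
print(drift)
```

Anything under `added` or `changed` is a reason to review the tool again before it goes on an allow-list.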

The full registration and probe reference lives in Firewall: MCP servers; the end-to-end walkthrough is Connect an MCP server.

3. Default-deny, then allow-list the tools you reviewed

An allow-list is the difference between “the server can do six things” and “the server can do whatever its operator decides tomorrow.” Set the policy’s default_verdict to deny, then add a rule per tool you reviewed and trust. Because the gateway namespaces every tool <server>.<tool>, you can scope rules to one server without touching the others.

// Policy on the mcp surface: deny by default, allow only what you reviewed.
// tool_name_glob supports a full-segment wildcard: "github.*" (prefix),
// "*.exec" (suffix), or "*.shell.*" (infix). Mid-segment globs like
// "github.get_*" fall back to an exact match and won't expand.
{
  "default_verdict": "deny",
  "rules": [
    { "tool_name_glob": "github.create_issue", "verdict": "allow" },
    { "tool_name_glob": "github.get_issue",    "verdict": "allow" }
  ]
}

Now github.create_issue runs, github.get_issue runs, and a freshly introduced github.delete_repo is denied until you’ve reviewed and permitted it. A denied tools/call returns to the model as a tool error (firewall deny: …) — the agent adapts instead of crashing. See Allow-list MCP tools for the full recipe, and Firewall rules for the matching DSL.
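To reason about which names a rule will catch, it can help to model the matching locally. The following is a minimal sketch of the glob behavior described in the policy comments (full-segment prefix, suffix, and infix wildcards, with anything else treated as an exact match). It is not the gateway's implementation: `glob_matches` and `evaluate` are hypothetical helpers, and first-match-wins ordering is an assumption.

```python
from typing import Any, Dict


def glob_matches(glob: str, tool: str) -> bool:
    """Approximate the documented tool_name_glob behavior (illustrative only)."""
    infix = glob[1:-1]  # "*.shell.*" -> ".shell."
    if glob.startswith("*.") and glob.endswith(".*") and len(infix) > 2 and "*" not in infix:
        return infix in tool
    prefix = glob[:-1]  # "github.*" -> "github."
    if glob.endswith(".*") and len(prefix) > 1 and "*" not in prefix:
        return tool.startswith(prefix)
    suffix = glob[1:]  # "*.exec" -> ".exec"
    if glob.startswith("*.") and len(suffix) > 1 and "*" not in suffix:
        return tool.endswith(suffix)
    # Mid-segment globs such as "github.get_*" are compared literally.
    return tool == glob


def evaluate(policy: Dict[str, Any], tool: str) -> str:
    """Return the verdict of the first matching rule, else the default (assumed ordering)."""
    for rule in policy["rules"]:
        if glob_matches(rule["tool_name_glob"], tool):
            return rule["verdict"]
    return policy["default_verdict"]


policy = {
    "default_verdict": "deny",
    "rules": [
        {"tool_name_glob": "github.create_issue", "verdict": "allow"},
        {"tool_name_glob": "github.get_issue", "verdict": "allow"},
    ],
}

for name in ("github.create_issue", "github.get_issue", "github.delete_repo"):
    print(name, "->", evaluate(policy, name))
```

Listing tools by exact name, as the policy above does, keeps a newly introduced tool denied; a `github.*` rule would allow it the moment it appears.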

4. Encrypt the credential — never hand-roll auth

A third-party server almost always needs a credential, and a credential is the thing you least want sitting in plaintext or reaching the model. Register the server’s auth through OrcaRouter so it’s encrypted at rest, masked on read, and injected only at dispatch time. auth_mode is one of none, bearer, oauth, or basic:

# Console route, UserAuth, Developer+.
curl https://api.orcarouter.ai/api/workspace/firewall/mcp_servers \
  -H "Authorization: Bearer <your-access-token>" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "github",
    "endpoint": "https://api.githubcopilot.com/mcp",
    "auth_mode": "bearer",
    "auth_json": "{\"token\":\"ghp_x\"}"
  }'

The credential is encrypted and masked the moment it’s stored — it never reaches the model or the client, and on read you only ever see the mask. On an update, echo the mask back to keep the stored value; send fresh auth_json only when you’re rotating. See Authenticate and Credential rotation.
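Note that auth_json in the request above is a JSON document encoded as a string inside the outer JSON body, which is easy to get wrong when escaping by hand. A small sketch of building the same body programmatically, assuming only the four fields shown in the curl example (`build_registration` is a hypothetical helper, and `ghp_x` is the placeholder token from that example):

```python
import json
from typing import Dict


def build_registration(name: str, endpoint: str, token: str) -> Dict[str, str]:
    """Build a bearer-auth registration body; auth_json is itself a JSON string."""
    return {
        "name": name,
        "endpoint": endpoint,
        "auth_mode": "bearer",
        "auth_json": json.dumps({"token": token}),  # inner document, serialized once
    }


payload = build_registration("github", "https://api.githubcopilot.com/mcp", "ghp_x")
body = json.dumps(payload)  # outer document; the inner quotes are escaped for you
print(body)
```

Serializing twice with `json.dumps` produces the escaped form without manual backslashes. In practice the token would come from a secret store or environment variable rather than a literal in the script.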

5. Lock egress: where can its tools reach?

Per-call verdicts decide which tool runs; egress decides where it may reach. A tool that “returns data” and a tool that “exfiltrates your secrets to an attacker’s host” can be the same tool with a different argument — egress control is what tells them apart. The gateway already validates every remote endpoint and its resolved dial IP against an SSRF policy on every hop, refusing intranet ranges and the cloud-metadata address and re-checking the IP to defeat DNS rebinding. On top of that, author your own egress deny rule for the hosts and CIDRs this server should never touch:

// An egress-stage rule scopes its verdict to the outbound destination.
// egress_json carries host/CIDR allow + deny lists.
{
  "stage": "egress",
  "verdict": "deny",
  "egress_json": "{\"deny\":[\"10.0.0.0/8\"]}"
}

There’s no preset that ships CIDR rules for you — you author the host/CIDR deny list yourself, scoped to what this server legitimately needs. See Egress limits and Data exfiltration.
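Before saving a deny list, you can sanity-check it locally against the addresses you expect to be blocked or allowed. This sketch uses Python's standard `ipaddress` module and assumes the `{"deny": [...]}` shape from the example rule. It only models CIDR entries, skips host entries, and says nothing about how the gateway itself evaluates the rule; `denied_by_egress` is a hypothetical helper.

```python
import ipaddress
import json
from typing import Any, Dict


def denied_by_egress(rule: Dict[str, Any], dial_ip: str) -> bool:
    """Return True if dial_ip falls inside any CIDR on the rule's deny list."""
    lists = json.loads(rule["egress_json"])
    addr = ipaddress.ip_address(dial_ip)
    for entry in lists.get("deny", []):
        try:
            network = ipaddress.ip_network(entry)
        except ValueError:
            continue  # not a CIDR (for example a hostname); out of scope for this sketch
        if addr in network:
            return True
    return False


rule = {
    "stage": "egress",
    "verdict": "deny",
    "egress_json": "{\"deny\":[\"10.0.0.0/8\"]}",
}

for ip in ("10.1.2.3", "93.184.216.34"):
    print(ip, "denied" if denied_by_egress(rule, ip) else "not on deny list")
```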

6. Quarantine what the agent installs on its own

The server you registered is one risk; the skills, BYO MCP servers, and plugins an agent self-installs afterward are another. OrcaRouter scans every installable capability, assigns it a risk band, and derives an enforcement mode (allow, quarantine, or block) that rides on top of every rule verdict. Anything auto-detected on first use is quarantined until a human reviews it: a capability nobody approved doesn’t get a free pass just because it scanned benign. A quarantined capability escalates anything short of a deny to pending_approval, so its tools run only after you’ve looked.

Don’t try to register every skill by hand. Pre-approve the ones you trust and let the rest be auto-detected and quarantined — then review from real data. The mode ratchets tighter on a re-scan, never looser. See Firewall: skills and MCP tool poisoning.
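The way an enforcement mode layers on top of a rule verdict, and the way it only tightens on re-scan, can be sketched in a few lines. This is a mental model under stated assumptions, not OrcaRouter's code: the quarantine escalation and the ratchet follow the description above, while mapping `block` to a deny is an assumption, and both function names are hypothetical.

```python
STRICTNESS = {"allow": 0, "quarantine": 1, "block": 2}


def effective_verdict(rule_verdict: str, mode: str) -> str:
    """Overlay a capability's enforcement mode on the verdict a rule produced."""
    if mode == "block":
        return "deny"  # assumption: a blocked capability never runs
    if mode == "quarantine" and rule_verdict != "deny":
        return "pending_approval"  # anything short of a deny waits for a human
    return rule_verdict


def ratchet(current_mode: str, rescanned_mode: str) -> str:
    """Keep whichever mode is stricter, so a re-scan can tighten but never loosen."""
    return max(current_mode, rescanned_mode, key=STRICTNESS.__getitem__)


print(effective_verdict("allow", "quarantine"))
print(effective_verdict("deny", "quarantine"))
print(ratchet("quarantine", "allow"))
```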

7. Shadow first, then watch the trail

Don’t flip a brand-new server straight to enforcing. Put the policy in shadow mode — enforcing verdicts are downgraded to audit and logged as [shadow] would … — so you can see what would have been blocked before it actually is. When the audit trail looks right, drop shadow mode and enforce. After it’s live, the controls keep watching:

Firewall events

Every governed call records its verdict, surface, and matched rule. Read them to confirm the allow-list and egress rules behave as intended. See Audit MCP events.

Anomaly feed

Rate and cost spikes against a learned baseline, plus retry loops and novel tool paths, surface as anomalies — readable by any Member.

Discovered tools

Turn on observe mode to log calls a policy doesn’t yet cover as gaps, so you tighten from what an agent actually does, not from guesses.
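The shadow-mode behavior described at the start of this section amounts to a small transformation on each verdict. The sketch below illustrates the idea only: which verdicts count as enforcing (`deny` and `pending_approval` here) and the exact wording after "[shadow] would" are assumptions, and `apply_shadow` is a hypothetical name.

```python
from typing import Optional, Tuple

# Assumption: these are the verdicts that would actually stop or pause a call.
ENFORCING = {"deny", "pending_approval"}


def apply_shadow(verdict: str, shadow: bool) -> Tuple[str, Optional[str]]:
    """In shadow mode, downgrade enforcing verdicts to audit and keep a note of the original."""
    if shadow and verdict in ENFORCING:
        return "audit", "[shadow] would " + verdict
    return verdict, None


print(apply_shadow("deny", shadow=True))
print(apply_shadow("deny", shadow=False))
print(apply_shadow("allow", shadow=True))
```

Reading the recorded notes tells you what enforcement would have changed before you drop shadow mode.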

8. The fast path: pick an autonomy level

If you’d rather not hand-build steps 3–5 for a server you don’t fully trust, apply an autonomy level and edit from there. The levels write real, editable policy and guardrail rows — they’re a starting point, not a black box:

| Level | What it sets |
| --- | --- |
| `permissive` | Observe mode on: logs everything, enforces nothing. |
| `balanced` | Default-audit policy that denies destructive shell, plus the PII Shield guardrail in flag-only mode. |
| `tight` | Default-deny policy denying destructive shell and fetch-shaped tools (`http_fetch`/`web_search`/`fetch_url`/`request`, the SSRF vector), plus the PII Shield and Secrets Blocker guardrails enforced. Secrets in arguments are caught by the Secrets Blocker guardrail on the request, not by a tool-arg rule. |

For a third-party server you’re still vetting, start at tight, probe, then relax specific tools into an allow-list. The one-click undo restores the pre-apply snapshot.

Reads of settings, policies, discovered tools, anomalies, registered MCP servers, and skills are open to any Member; event, run, and aggregate reads require Developer+, and every write requires Developer+. Revealing a token’s plaintext key is also Developer+.

9. Where to go next

Connect an MCP server

Allow-list MCP tools

Default-deny a server and permit only reviewed tools.

Rug-pull defense

Catch a server or skill that changes after you approved it.

MCP security overview

The full map of the MCP governance surface.

New to the model? Read Guardrails vs. firewall for where MCP governance fits, then Excessive agency and Dangerous tool calls for the threats this checklist closes.