Detect multi-step attacks across an agent run

A single tool call can look perfectly innocent. Read one CRM record: allowed. Call an export tool: allowed. Hit an external host: allowed. The attack is the shape of the run: fifty reads, then an export, then egress to a host you’ve never seen, at 3am on a Sunday. Per-call verdicts judge each call in isolation and never see that shape.

This page covers the two firewall mechanisms that watch a run over time instead of one call at a time: sequence rules (an ordered chain you author) and behavioral anomaly detection (deviation from your workspace’s learned normal). Together they detect attack chains that no single allow/deny rule can catch.

Everything here is configured in the console (Security → Firewall), whose management routes use your session / access token — not a relay sk-orca-… key. Your agent’s /v1/* calls don’t change.

1. Why per-call rules miss the chain

The firewall’s tool globs and argument clauses are stateless and deterministic by design — they decide one call, fast, on the hot path. That’s exactly what you want for “block shell.exec rm -rf.” It’s exactly wrong for a slow-burn exfiltration where every individual call is legal. Two complementary tools fill the gap:

Sequence rules

A rule you author that matches an ordered chain of calls within a time window — “bulk read → export → egress.” You name the pattern.

Anomaly detection

The firewall learns each workspace’s normal tool-use shape and flags deviations — retry loops, never-before-seen tool paths, and volume/cost spikes. No rule to author.

2. Sequence rules: name the attack chain

A sequence rule lives inside a firewall policy like any other rule, but instead of a single tool_name_glob it carries an ordered list of steps. Each step is a tool glob with an optional min_count and an optional egress: true; the steps must occur in order (interleaving with unrelated calls is fine) and the whole chain must complete within window_seconds.

{
  "label": "bulk-read-then-exfil",
  "verdict": "audit",
  "sequence": {
    "window_seconds": 600,
    "steps": [
      { "match": "crm.*",   "min_count": 50 },
      { "match": "*.export" },
      { "match": "*", "egress": true }
    ]
  }
}

This fires when an agent makes 50 or more crm.* calls, then calls any *.export tool, then makes any egress call, all inside ten minutes (window_seconds: 600). Each call on its own would pass; the pattern is the signal.
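To make the ordering semantics concrete, here is a minimal Python sketch of how such a chain could be tested against a principal's recent events. It is an illustration, not the firewall's implementation: the `Event` shape, the `chain_occurred` helper, the use of `fnmatch` for globs, and a default `min_count` of 1 are all assumptions.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Event:
    """Hypothetical event record; field names are assumptions."""
    ts: float            # seconds
    tool: str
    egress: bool = False


def step_matches(step: dict, event: Event) -> bool:
    """A step matches on its tool glob and, if set, on the egress flag."""
    if not fnmatchcase(event.tool, step["match"]):
        return False
    return event.egress or not step.get("egress", False)


def chain_occurred(sequence: dict, events: list[Event]) -> bool:
    """True if the steps occur in order inside the window ending at the
    latest event. Events that match no pending step are simply skipped,
    so interleaving with unrelated calls is fine."""
    if not events:
        return False
    window = sequence.get("window_seconds", 0)   # 0 means no time bound
    latest = events[-1].ts
    steps = sequence["steps"]
    step_index, seen = 0, 0
    for event in events:
        if window and latest - event.ts > window:
            continue                              # too old for this window
        step = steps[step_index]
        if step_matches(step, event):
            seen += 1
            if seen >= step.get("min_count", 1):  # assumed default of 1
                step_index, seen = step_index + 1, 0
                if step_index == len(steps):
                    return True
    return False


SEQUENCE = {
    "window_seconds": 600,
    "steps": [
        {"match": "crm.*", "min_count": 50},
        {"match": "*.export"},
        {"match": "*", "egress": True},
    ],
}
```

Fifty `crm.read` events followed by `report.export` and an egress call satisfy `SEQUENCE`; forty-nine reads, or the same calls in a different order, do not.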

A sequence is evaluated on the call that completes it. The inline rule loop skips chain rules (one call can’t satisfy a multi-step chain); the match runs when a call could be a chain’s final step, at which point the firewall pulls that principal’s recent events and tests the chain. The verdict you set on the rule is what then happens to the completing call: audit records it and lets it through, pending_approval holds it for human review, and deny blocks it. So a chain can stop its final call in real time — pick the verdict to match. Use audit when you only want to detect and alert; use pending_approval or deny (or pair with a per-call deny / egress rule) when you need a hard stop.
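The evaluation order described above can be sketched as follows. This is a simplified model under stated assumptions: `decide`, `could_be_final_step`, and the action labels are hypothetical names, and `chain_matched` stands in for whatever history lookup the firewall performs.

```python
from fnmatch import fnmatchcase
from typing import Callable

# What each verdict does to the completing call (labels are illustrative).
ACTIONS = {
    "audit": "record_and_allow",
    "pending_approval": "hold_for_review",
    "deny": "block",
}


def could_be_final_step(rule: dict, tool: str, egress: bool) -> bool:
    """Cheap pre-check: does this call match the chain's last step?"""
    last = rule["sequence"]["steps"][-1]
    if last.get("egress", False) and not egress:
        return False
    return fnmatchcase(tool, last["match"])


def decide(rule: dict, tool: str, egress: bool,
           chain_matched: Callable[[], bool]) -> str:
    """Apply a sequence rule to one call.

    chain_matched is only invoked when the call could complete the chain,
    so the history lookup stays off the path of ordinary calls.
    """
    if not could_be_final_step(rule, tool, egress):
        return "skip"
    if not chain_matched():
        return "no_match"
    return ACTIONS[rule["verdict"]]
```

Passing the history check as a callable keeps the expensive part lazy: a call that cannot be the final step never triggers it.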

The full sequence field syntax — window_seconds: 0 for no time bound, min_count defaults, step ordering semantics — is in the rule schema. Author sequence rules in the console rule editor; saving is a Developer+ action.

3. Anomaly detection: deviation from learned normal

Where sequence rules ask “did this specific pattern happen,” anomaly detection asks “is anything about this run abnormal for this workspace.” It needs no rule — the firewall builds a baseline from your own traffic and scores live activity against it. Four kinds surface:

rate_spike — a volume flood

Per-(tool, key) call volume scored against the learned baseline for this hour-of-week. A row surfaces when the count clears an absolute floor and runs high relative to baseline, or when its z-score crosses the statistical threshold. So “100 db.query calls at 3am Sunday” stands out even though a Tuesday-2pm burst of the same size wouldn’t.
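As a rough illustration of the rate_spike condition, the sketch below scores one hour's count against past counts for the same hour-of-week. The function name and the threshold values (`floor`, `ratio`, `z_threshold`) are assumptions chosen for the example; the firewall's real thresholds are not stated here.

```python
from statistics import mean, pstdev
from typing import Optional


def rate_spike(count: int, history: list[int], floor: int = 50,
               ratio: float = 3.0, z_threshold: float = 3.0
               ) -> tuple[bool, Optional[float]]:
    """Return (flagged, z_score) for one (tool, key) bucket.

    history holds past counts for the same hour-of-week. With no history
    there is no baseline, so only the absolute floor applies.
    """
    if not history:
        return count >= floor, None
    baseline = mean(history)
    spread = pstdev(history)
    z = (count - baseline) / spread if spread > 0 else None
    # Condition 1: clears the floor AND runs high relative to baseline.
    high_vs_baseline = count >= floor and count >= ratio * baseline
    # Condition 2: the z-score alone crosses the statistical threshold.
    high_z = z is not None and z >= z_threshold
    return high_vs_baseline or high_z, z
```

With these example thresholds, 100 calls against a quiet history such as `[2, 3, 1, 2]` is flagged, while 100 calls against `[90, 110, 95, 105]` is not.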

burn_spike — a cost spike

The same idea applied to spend: a tool burning multiples of its learned baseline cost for this hour-of-week. This is the early warning for denial-of-wallet attacks; pair it with a cap_cost rule to enforce a hard ceiling.

retry_loop — hammering a failing tool

A (conversation, tool, arguments) group that repeats many times in a tight window — an agent stuck calling the same failing tool with the same arguments over and over, rather than slow legitimate polling.
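The retry_loop grouping can be sketched as below. The call record shape, the `find_retry_loops` name, and the `min_repeats` and `window_seconds` defaults are assumptions for illustration only.

```python
import json
from collections import defaultdict


def find_retry_loops(calls: list[dict], min_repeats: int = 5,
                     window_seconds: float = 60.0) -> list[tuple]:
    """Return (conversation, tool, arguments) keys that repeat at least
    min_repeats times inside any window of window_seconds."""
    groups = defaultdict(list)
    for call in calls:
        # Serialise arguments with sorted keys so identical arguments
        # always produce the same grouping key.
        args_key = json.dumps(call["arguments"], sort_keys=True)
        groups[(call["conversation"], call["tool"], args_key)].append(call["ts"])

    loops = []
    for key, stamps in groups.items():
        stamps.sort()
        start = 0
        for end, ts in enumerate(stamps):
            # Shrink the window from the left until it spans <= window_seconds.
            while ts - stamps[start] > window_seconds:
                start += 1
            if end - start + 1 >= min_repeats:
                loops.append(key)
                break
    return loops
```

The sliding window is what separates a tight loop from slow polling: six identical calls one second apart are flagged, six identical calls five minutes apart are not.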

novel_path — an unseen tool-to-tool transition

A tool_a → tool_b transition this workspace has never made before. The first time an agent goes from read_file straight to http_fetch, that edge lights up even if both tools are individually allowed.
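A novel_path check reduces to set membership over consecutive tool pairs. The sketch below assumes the workspace's history is available as a set of `(tool_a, tool_b)` edges; `novel_transitions` and the tool names other than `read_file` and `http_fetch` are hypothetical.

```python
def novel_transitions(seen_edges: set[tuple[str, str]],
                      run: list[str]) -> list[tuple[str, str]]:
    """Return each tool_a -> tool_b edge in the run that the workspace
    has never made before, in first-seen order and without duplicates."""
    novel = []
    for edge in zip(run, run[1:]):   # consecutive pairs of tool names
        if edge not in seen_edges and edge not in novel:
            novel.append(edge)
    return novel


# Edges this hypothetical workspace has made before.
SEEN = {("read_file", "summarize"), ("summarize", "send_reply")}
```

Calling `novel_transitions(SEEN, ["read_file", "http_fetch"])` returns the single unseen edge, even though both tools may be individually allowed.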

The hour-of-week baseline

The baseline is a 14-day rolling average bucketed by hour of week (weekday × 24 + hour), so Tuesday-14:00 is compared against past Tuesday-14:00 history specifically — not a flat all-time mean that would wash out your real daily and weekly rhythm. A brand-new workspace with no learned norm yet still catches an obvious flood via an absolute floor, so you’re protected from day one.
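The bucketing can be sketched in a few lines. This assumes UTC timestamps and Python's weekday numbering (Monday is 0); which day the firewall treats as weekday 0, and which timezone it uses, are not stated here, and `hour_of_week` and `baseline` are illustrative names.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def hour_of_week(ts: datetime) -> int:
    """weekday * 24 + hour, giving a bucket in the range 0..167."""
    return ts.weekday() * 24 + ts.hour


def baseline(hourly_counts: dict[datetime, int], now: datetime,
             days: int = 14) -> dict[int, float]:
    """Average the counts from the last `days` days per hour-of-week bucket.

    hourly_counts maps the start of each hour to the calls seen in it.
    """
    cutoff = now - timedelta(days=days)
    buckets = defaultdict(list)
    for hour_start, count in hourly_counts.items():
        if cutoff <= hour_start < now:
            buckets[hour_of_week(hour_start)].append(count)
    return {bucket: sum(v) / len(v) for bucket, v in buckets.items()}


# Tuesday 14:00 UTC falls in bucket 1 * 24 + 14 under this numbering.
TUESDAY_2PM = datetime(2024, 1, 2, 14, tzinfo=timezone.utc)
```

Over a 14-day window each bucket sees two samples, so Tuesday-14:00 is averaged only with the previous Tuesday-14:00, never with a quiet Sunday night.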

The feed reports tool names, redacted key ids, counts, and a z-score — never raw key material. Each anomaly carries a suggested remediation (rate_limit, review, or block_tool) so the next step is one click, not a guess.

4. One concrete walkthrough

Suppose a compromised prompt drives one of your agents into a tight failure loop, then probes an export path it’s never touched. Here’s what you see — no rule authored in advance:

The agent misbehaves

Injected instructions push the agent to retry a failing db.query with identical arguments, then call report.export followed by an outbound fetch — a path this workspace has never run.

Open the anomaly feed

In Security → Firewall → Anomalies, the run surfaces a retry_loop on db.query and a novel_path on the report.export → http_fetch edge. Reading the feed is a Member action — anyone on the team can triage.

Confirm in the events trace

Click through to the events log and the run analytics to see the exact call sequence, correlated to the agent run and conversation. The anomaly feed is Member-readable, but the events log and run trace carry tool-call provenance and are Developer+.

Convert the finding into a rule

Now that you’ve seen the chain, encode it: a deny on the dangerous export, an egress allow-list on the fetch, or a sequence rule that audits the whole pattern next time. Anomaly detection finds the unknown; a rule pins the known.

If the feed is noisy while you tune — a legitimate batch job that genuinely spikes every Sunday, say — snooze the anomaly feed for up to 7 days while you investigate. Snoozing is a Developer+ action; the window is server-clamped so detection always comes back on its own.
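Server-side clamping of the snooze window amounts to capping the requested end time. A minimal sketch, assuming UTC datetimes and a hypothetical `clamp_snooze` helper (how the server treats an `until` in the past is not covered here):

```python
from datetime import datetime, timedelta, timezone

MAX_SNOOZE = timedelta(days=7)


def clamp_snooze(requested_until: datetime, now: datetime) -> datetime:
    """Cap a requested snooze end time at seven days from now."""
    return min(requested_until, now + MAX_SNOOZE)
```

A request to snooze for 30 days is shortened to 7, while a 2-day request passes through unchanged.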

5. Sequence rules vs. anomaly detection

They solve adjacent problems — pick the one that matches what you know:

	Sequence rule	Anomaly detection
You author	The exact chain	Nothing — it learns
Catches	A known multi-step pattern	The unknown / abnormal
Acts	Applies the rule’s verdict to the completing call (`audit` / `pending_approval` / `deny`)	Surfaces on the feed

A mature workspace runs both: anomaly detection is the radar that surfaces chains you didn’t anticipate — surfacing only, never blocking; sequence rules are how you codify the ones you have, so they’re labelled, tracked, and (with a pending_approval or deny verdict) able to gate the completing call. For a hard stop on a single call regardless of any chain, reach for a per-call verdict.

6. RBAC & the routes behind the feed

The anomaly feed and sequence rules sit under the workspace firewall management routes — your session / access token, never a relay key:

Method & path	Role	Purpose
`GET /api/workspace/firewall/anomalies`	Member	Read the anomaly feed (`?window=`).
`POST /api/workspace/firewall/anomalies/snooze`	Developer+	Snooze the feed (`{until}`, clamped to 7 days).
`POST /api/workspace/firewall/rules`	Developer+	Create a sequence (or any) rule under a policy.
`POST /api/workspace/firewall/test`	Developer+	Dry-run a policy against a sample call before depending on it.

Reads of the feed are open to every Member so the whole team can triage; authoring rules and snoozing the feed are Developer+ writes, consistent with the rest of the firewall RBAC model.
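The role requirements in the table can be modelled as a minimum-role lookup. This is a sketch for reasoning about access, not the server's authorization code: the `is_allowed` helper is hypothetical, and "Admin" is a placeholder for any role above Developer, since those role names are not listed on this page.

```python
# Higher rank means more privilege. "Admin" is a placeholder name.
ROLE_RANK = {"Member": 0, "Developer": 1, "Admin": 2}

# Minimum role per route, taken from the table above.
MIN_ROLE = {
    ("GET", "/api/workspace/firewall/anomalies"): "Member",
    ("POST", "/api/workspace/firewall/anomalies/snooze"): "Developer",
    ("POST", "/api/workspace/firewall/rules"): "Developer",
    ("POST", "/api/workspace/firewall/test"): "Developer",
}


def is_allowed(role: str, method: str, path: str) -> bool:
    """True if the role meets the route's minimum. Unknown routes and
    unknown roles are denied."""
    required = MIN_ROLE.get((method, path))
    if required is None or role not in ROLE_RANK:
        return False
    return ROLE_RANK[role] >= ROLE_RANK[required]
```

Under this model a Member can read the feed but cannot snooze it or create rules.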

Where to go next

Rule schema

The full sequence field — steps, min_count, window_seconds, and every other rule field.

Events log

Where matched sequences and anomalies land — filter by run, surface, and verdict.

Cap cost

Turn a burn_spike signal into a hard per-run spend ceiling.

Egress control

Stop the final exfiltration step of a chain at the network boundary.

For the attacker playbooks these mechanisms counter, see chained attacks, data exfiltration, and excessive agency. For the deep firewall reference, see Firewall rules.
