Everything here is configured in the console (Security → Firewall),
whose management routes use your session / access token — not a relay
sk-orca-… key. Your agent’s /v1/* calls don’t change.
1. Why per-call rules miss the chain
The firewall’s tool globs and argument clauses are stateless and deterministic by design: they decide one call, fast, on the hot path. That’s exactly what you want for “block shell.exec rm -rf.” It’s exactly
wrong for a slow-burn exfiltration where every individual call is legal.
Two complementary tools fill the gap:
Sequence rules
A rule you author that matches an ordered chain of calls within a
time window — “bulk read → export → egress.” You name the pattern.
Anomaly detection
The firewall learns each workspace’s normal tool-use shape and
flags deviations — retry loops, never-before-seen tool paths, and
volume/cost spikes. No rule to author.
2. Sequence rules: name the attack chain
A sequence rule lives inside a firewall policy
like any other rule, but instead of a single tool_name_glob it carries an
ordered list of steps. Each step is a tool glob with an optional
min_count and an optional egress: true; the steps must occur in
order (interleaving with unrelated calls is fine) and the whole chain
must complete within window_seconds.
For example, a sequence rule can flag an agent that bulk-reads crm.* records, then calls any
*.export tool, then makes any egress call, all inside ten minutes. Each
call on its own would pass; the pattern is the signal.
The full sequence field syntax — window_seconds: 0 for no time bound,
min_count defaults, step ordering semantics — is in the
rule schema. Author sequence rules in the
console rule editor; saving is a Developer+ action.
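To make the matching semantics concrete, here is a minimal Python sketch of an ordered-chain matcher: steps must occur in order, unrelated calls may interleave, and the chain must complete within window_seconds (0 meaning no time bound). It is not the firewall's implementation. The `Step` and `Call` classes, the `completing_call` helper, the per-call `egress` flag, and a `min_count` default of 1 are assumptions made for illustration; the authoritative field syntax is in the rule schema.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Optional


@dataclass
class Step:
    tool_glob: str        # glob over the tool name, e.g. "crm.*"
    min_count: int = 1    # assumed default; check the rule schema
    egress: bool = False  # when True, only egress calls satisfy the step


@dataclass
class Call:
    ts: float             # seconds since some epoch
    tool: str
    egress: bool = False  # hypothetical flag: the call crossed the network boundary


def _matches(call: Call, step: Step) -> bool:
    return fnmatchcase(call.tool, step.tool_glob) and (call.egress or not step.egress)


def completing_call(calls: list[Call], steps: list[Step],
                    window_seconds: float) -> Optional[Call]:
    """Return the call that completes the chain, or None if it never completes."""
    calls = sorted(calls, key=lambda c: c.ts)
    for i, first in enumerate(calls):
        if not _matches(first, steps[0]):
            continue  # a chain can only start on a call matching the first step
        idx, seen = 0, 0
        for call in calls[i:]:
            if window_seconds and call.ts - first.ts > window_seconds:
                break  # too slow from this starting point; try a later start
            if _matches(call, steps[idx]):
                seen += 1
                if seen >= steps[idx].min_count:
                    idx, seen = idx + 1, 0
                    if idx == len(steps):
                        return call
            # calls that do not match the current step are simply skipped
    return None


# Bulk read, then export, then egress, inside ten minutes.
chain = [Step("crm.*", min_count=3), Step("*.export"), Step("*", egress=True)]
trace = [
    Call(0, "crm.get"), Call(10, "crm.get"), Call(20, "calendar.list"),
    Call(30, "crm.get"), Call(100, "report.export"),
    Call(200, "http_fetch", egress=True),
]
hit = completing_call(trace, chain, window_seconds=600)
```

In this trace the chain completes on the outbound http_fetch call, which is the call a rule verdict would apply to. Stretching the same trace past the window returns None, while window_seconds=0 matches regardless of timing.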
3. Anomaly detection: deviation from learned normal
Where sequence rules ask “did this specific pattern happen,” anomaly detection asks “is anything about this run abnormal for this workspace.” It needs no rule: the firewall builds a baseline from your own traffic and scores live activity against it. Four kinds surface:
rate_spike — a volume flood
Per-(tool, key) call volume scored against the learned baseline for
this hour-of-week. A row surfaces when the count clears an absolute
floor and runs high relative to baseline, or when its z-score crosses
the statistical threshold. So “100 db.query calls at 3am Sunday”
stands out even though a Tuesday-2pm burst of the same size wouldn’t.
burn_spike — a cost spike
The same idea applied to spend: a tool burning multiples of its learned
baseline cost for this hour-of-week. This is the denial-of-wallet early
warning; pair it with a cap_cost rule to enforce a hard ceiling.
retry_loop — hammering a failing tool
A (conversation, tool, arguments) group that repeats many times in a
tight window: an agent stuck calling the same failing tool with the
same arguments over and over, rather than slow legitimate polling.
novel_path — an unseen tool-to-tool transition
A tool_a → tool_b transition this workspace has never made before.
The first time an agent goes from read_file straight to http_fetch,
that edge lights up even if both tools are individually allowed.
The hour-of-week baseline
The baseline is a 14-day rolling average bucketed by hour of week (weekday × 24 + hour), so Tuesday-14:00 is compared against past
Tuesday-14:00 history specifically — not a flat all-time mean that would
wash out your real daily and weekly rhythm. A brand-new workspace with no
learned norm yet still catches an obvious flood via an absolute floor, so
you’re protected from day one.
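The signals above can be sketched in a few lines of Python. This is an illustration of the ideas, not the firewall's scoring code: the threshold constants, the function names, the input shapes, and the choice of Monday as weekday 0 are all assumptions, and the real floor, ratio, and z-score values are not stated in this guide.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean, pstdev

# Illustrative thresholds only (assumptions, not the firewall's real values).
ABS_FLOOR = 50      # minimum value before a ratio-based spike can fire
RATIO = 3.0         # how far above baseline counts as "running high"
Z_THRESHOLD = 3.0   # statistical threshold on the z-score


def hour_of_week(ts: datetime) -> int:
    """Bucket index 0..167 as weekday * 24 + hour (assumes Monday is weekday 0)."""
    return ts.weekday() * 24 + ts.hour


def is_spike(value: float, history: list[float], floor: float = ABS_FLOOR) -> bool:
    """Score a count (rate_spike) or a cost (burn_spike) against one bucket's history."""
    base = mean(history) if history else 0.0
    if value >= floor and value > RATIO * base:
        return True  # also fires for a brand-new workspace with no history
    spread = pstdev(history) if len(history) > 1 else 0.0
    return spread > 0 and (value - base) / spread >= Z_THRESHOLD


def retry_loops(calls, min_repeats: int = 5, window_seconds: float = 60):
    """calls: (ts_seconds, conversation, tool, arguments) tuples sorted by time.

    Returns each (conversation, tool, arguments) group that repeats at least
    min_repeats times inside some window_seconds span.
    """
    groups = defaultdict(list)
    for ts, conversation, tool, arguments in calls:
        groups[(conversation, tool, arguments)].append(ts)
    flagged = []
    for key, stamps in groups.items():
        for i in range(len(stamps) - min_repeats + 1):
            if stamps[i + min_repeats - 1] - stamps[i] <= window_seconds:
                flagged.append(key)
                break
    return flagged


def novel_edges(tools_in_order: list[str], seen: set[tuple[str, str]]):
    """Return tool_a -> tool_b transitions that are not in the seen set."""
    return [edge for edge in zip(tools_in_order, tools_in_order[1:]) if edge not in seen]
```

With these illustrative thresholds, `is_spike(100, [4, 6])` fires for a quiet Sunday-3am bucket while `is_spike(100, [90, 110])` does not for a busy Tuesday-2pm bucket, which is the point of comparing each hour of the week against its own history.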
4. One concrete walkthrough
Suppose a compromised prompt drives one of your agents into a tight failure loop, then probes an export path it’s never touched. Here’s what you see, with no rule authored in advance:
The agent misbehaves
Injected instructions push the agent to retry a failing db.query with
identical arguments, then call report.export followed by an outbound
fetch, a path this workspace has never run.
Open the anomaly feed
In Security → Firewall → Anomalies, the run surfaces a retry_loop
on db.query and a novel_path on the report.export → http_fetch
edge. Reading the feed is a Member action, so anyone on the team can
triage.
Confirm in the events trace
Click through to the events log and the
run analytics to see the exact call
sequence, correlated to the agent run and conversation. The anomaly feed
is Member-readable, but the events log and run trace carry tool-call
provenance and are Developer+.
Convert the finding into a rule
Now that you’ve seen the chain, encode it: a deny on the dangerous
export, an egress allow-list on the fetch, or a sequence rule that
audits the whole pattern next time. Anomaly detection finds the
unknown; a rule pins the known.
5. Sequence rules vs. anomaly detection
They solve adjacent problems. Pick the one that matches what you know:

| | Sequence rule | Anomaly detection |
|---|---|---|
| You author | The exact chain | Nothing; it learns |
| Catches | A known multi-step pattern | The unknown / abnormal |
| Acts | Applies the rule’s verdict to the completing call (audit / pending_approval / deny) | Surfaces on the feed |
Of the two, only a sequence rule (with a pending_approval or deny verdict) is able to gate the
completing call. For a hard stop on a single call regardless of any chain,
reach for a per-call verdict.
6. RBAC & the routes behind the feed
The anomaly feed and sequence rules sit under the workspace firewall management routes, which use your session / access token, never a relay key:

| Method & path | Role | Purpose |
|---|---|---|
| GET /api/workspace/firewall/anomalies | Member | Read the anomaly feed (?window=). |
| POST /api/workspace/firewall/anomalies/snooze | Developer+ | Snooze the feed ({until}, clamped to 7 days). |
| POST /api/workspace/firewall/rules | Developer+ | Create a sequence (or any) rule under a policy. |
| POST /api/workspace/firewall/test | Developer+ | Dry-run a policy against a sample call before depending on it. |
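As a rough illustration of calling the two anomaly routes, the sketch below builds the requests with Python's standard library without sending them. Only the paths, the ?window= parameter, the {until} body field, and the 7-day clamp come from the table above. The host name, the Bearer authorization scheme, the "24h" window value, and the ISO 8601 format for until are assumptions; check the console's API reference before relying on any of them.

```python
import json
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://console.example.com"  # placeholder host (assumption)
MAX_SNOOZE = timedelta(days=7)            # the documented server-side clamp


def _headers(token: str) -> dict:
    # Assumes the session / access token is sent as a Bearer token.
    return {"Authorization": f"Bearer {token}", "Content-Type": "application/json"}


def anomalies_request(token: str, window: str) -> Request:
    """Build GET /api/workspace/firewall/anomalies?window=... (Member role)."""
    query = urlencode({"window": window})
    url = f"{BASE_URL}/api/workspace/firewall/anomalies?{query}"
    return Request(url, headers=_headers(token), method="GET")


def snooze_request(token: str, until: datetime, now: datetime) -> Request:
    """Build POST /api/workspace/firewall/anomalies/snooze (Developer+ role)."""
    until = min(until, now + MAX_SNOOZE)  # mirror the 7-day clamp client-side
    body = json.dumps({"until": until.isoformat()}).encode()
    url = f"{BASE_URL}/api/workspace/firewall/anomalies/snooze"
    return Request(url, data=body, headers=_headers(token), method="POST")


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
feed = anomalies_request("SESSION_TOKEN", "24h")
snooze = snooze_request("SESSION_TOKEN", now + timedelta(days=30), now)
```

Passing either object to `urllib.request.urlopen` would send it. Clamping on the client is optional, since the route already clamps, but it keeps the value you display in sync with what the server will apply.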
Where to go next
Rule schema
The full sequence field: steps, min_count, window_seconds, and
every other rule field.
Events log
Where matched sequences and anomalies land — filter by run, surface, and
verdict.
Cap cost
Turn a burn_spike signal into a hard per-run spend ceiling.
Egress control
Stop the final exfiltration step of a chain at the network boundary.
