The built-in pii detector covers the common entities: email, phone,
credit card, SSN, IBAN, JWT, cloud keys. But your sensitive data is
yours: employee IDs, internal case numbers, customer account references,
a partner’s order format. A custom PII entity lets you teach the same
masking rule to recognize those shapes, so the gateway redacts them before
the model — or any downstream tool — ever sees them.
This page covers the one thing custom entities add to the
PII detection rule: your own detectors.
For the full engine — every rule type, stage, and route — see the
Guardrails reference.
Every step here is a console action on the hosted gateway
(api.orcarouter.ai). You author the guardrail under your own session;
only the final /v1/* call uses an sk-orca-... relay key. Creating and
editing guardrails requires Developer+ in the workspace.
1. When you need a custom PII detector LLM guardrail
The built-in entity set is closed and shared across the engine, the validator, and the rule builder. It is the right tool for standard identifiers. Reach for a custom entity when the data you want to redact has a predictable shape that no built-in covers:
Internal identifiers
Employee IDs (EMP482915), case numbers, ticket refs, internal SKUs: anything with a fixed prefix and digit run.
Account & order numbers
Customer account references or a partner’s order format that should
never reach a third-party model verbatim.
Checksummed numbers
Card-like or membership numbers that pass a Luhn check — add the
checksum to cut false positives on look-alike digit runs.
Domain-specific codes
Policy numbers, claim IDs, device serials — any token your industry
treats as sensitive but the generic detectors don’t know.
A custom entity runs inside the same pii rule. It detects matches and applies the rule’s action (mask, block, or flag) exactly like a built-in entity does.
2. Anatomy of a custom entity
A custom entity is three small fields plus an optional mask tag. You add them in the pii rule editor under Custom entities:
| Field | Required | What it does |
|---|---|---|
| name | yes | Stable ID, e.g. employee_id. Lowercase ASCII / digits / _, must start with a letter. Flows into the Matches feed and audit logs. |
| pattern | yes | A Go RE2 regex (linear-time, no backreferences). Must compile. |
| checksum | no | luhn validates each match with the Luhn algorithm. Only "" (none) or "luhn" are accepted. |
| mask_with | no | Verbatim replacement on a mask action. Defaults to [<UPPERCASE_NAME>]. |
The optional Luhn checksum
Many “number-shaped” identifiers (payment cards, some membership and account numbers) carry a Luhn (mod-10) check digit. A bare regex like \d{16} matches any 16-digit run, including phone numbers, timestamps,
and order totals. Setting checksum: "luhn" makes the detector fire
only when the matched digits also pass the Luhn check, so look-alike
runs pass through unmasked and your false-positive rate stays low. Leave it
empty for non-checksummed tokens like an employee ID.
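To make the effect of the checksum concrete, here is a minimal Python sketch of a regex scan followed by a Luhn filter. It is an illustration only: Python's re module stands in for the gateway's Go RE2 engine, and the helper names luhn_valid and find_matches are hypothetical, not part of the product.

```python
import re
from typing import List


def luhn_valid(candidate: str) -> bool:
    """Return True if the digits in candidate pass the Luhn (mod-10) check."""
    digits = [int(ch) for ch in candidate if ch.isdigit()]
    if not digits:
        return False
    total = 0
    # Walk right to left, doubling every second digit (the one just left of
    # the check digit, then every other one). Doubled values above 9 are
    # reduced by 9, which equals summing their two digits.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_matches(pattern: str, text: str, checksum: str = "") -> List[str]:
    """Return regex matches, keeping only Luhn-valid ones when checksum == 'luhn'."""
    hits = [m.group(0) for m in re.finditer(pattern, text)]
    if checksum == "luhn":
        hits = [h for h in hits if luhn_valid(h)]
    return hits


text = "member 4111111111111111, order ref 1234567890123456"
print(find_matches(r"\b\d{16}\b", text))          # both 16-digit runs
print(find_matches(r"\b\d{16}\b", text, "luhn"))  # only the Luhn-valid run
```

In this sketch the second 16-digit run fails the mod-10 check, so with the checksum enabled it is left alone.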
Your own mask tag
On a mask action, a built-in email renders as [EMAIL]. A custom entity
renders as [<UPPERCASE_NAME>] by default — an employee_id match
becomes [EMPLOYEE_ID]. Set mask_with to override that verbatim
(e.g. <id> or ***) when you want a specific replacement token in the
text the model receives. See
Masking formats for the rendering
rules across entity types.
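The default-versus-override rule can be sketched in a few lines of Python. This mirrors the behaviour described above rather than the gateway's own code; mask_tag is a hypothetical helper, and treating an empty mask_with as "use the default" is an assumption.

```python
def mask_tag(name: str, mask_with: str = "") -> str:
    """Return mask_with verbatim if set, else the default [<UPPERCASE_NAME>] tag."""
    return mask_with if mask_with else "[" + name.upper() + "]"


print(mask_tag("employee_id"))          # [EMPLOYEE_ID]
print(mask_tag("employee_id", "<id>"))  # <id>
print(mask_tag("employee_id", "***"))   # ***
```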
3. One concrete example
Suppose your prompts carry employee IDs in the form EMP followed by six
digits, and you want them masked at the input stage so the upstream
model never sees a real ID. Add a single custom entity to a pii rule:
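A minimal sketch of such an entity, written as a Python dict so it can be exercised locally. The four keys are the fields from section 2. The exact shape the console stores is not shown here, so treat the dict layout as an assumption. The pattern (with its \b word boundaries) is one possible way to express "EMP followed by six digits", and mask_text is a hypothetical local simulation that uses Python's re in place of Go RE2.

```python
import re

# Hypothetical custom entity; keys follow the field table in section 2.
employee_id = {
    "name": "employee_id",
    "pattern": r"\bEMP\d{6}\b",  # EMP followed by exactly six digits
    "checksum": "",              # no checksum for this kind of token
    "mask_with": "",             # empty: fall back to [EMPLOYEE_ID]
}


def mask_text(text: str, entity: dict) -> str:
    """Simulate a mask action for one custom entity on a piece of text."""
    tag = entity["mask_with"] or "[" + entity["name"].upper() + "]"
    # A function replacement inserts the tag verbatim, with no escape handling.
    return re.sub(entity["pattern"], lambda _match: tag, text)


prompt = "Reset the VPN token for EMP482915 and cc EMP100234."
print(mask_text(prompt, employee_id))
# Reset the VPN token for [EMPLOYEE_ID] and cc [EMPLOYEE_ID].
```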
Then call /v1/chat/completions exactly as before: the gateway masks the
request before forwarding, with no SDK change.
Masking runs at both stages: an input rule redacts the request before
the model sees it, and an output rule redacts the model’s reply —
including streaming responses, where the scanner rewrites matches in-band.
Block actions are enforced on both stages too. To gate model responses,
see Output-stage rules.
A checksummed example
For a card-like membership number, set checksum to "luhn" so 16-digit runs that aren’t valid numbers don’t match.
4. Limits and validation
The rule builder validates every custom entity on save, so a bad detector never reaches the hot path.
Up to 25 custom entities per rule
Each custom entity is a regex scan over the full text, so the per-rule
cap is 25. The cap keeps the hot path linear; compiled patterns are
cached across requests. Need more shapes? Split them across multiple
pii rules in the same guardrail.
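If you generate rules from a longer list of shapes, the split can be as simple as chunking by 25. The sketch below assumes a hypothetical rule layout with "type" and "custom_entities" keys; only the cap of 25 comes from this page.

```python
MAX_CUSTOM_ENTITIES = 25  # per-rule cap stated above


def split_into_rules(entities: list) -> list:
    """Group custom entities into pii rules of at most 25 entities each."""
    return [
        {"type": "pii", "custom_entities": entities[i:i + MAX_CUSTOM_ENTITIES]}
        for i in range(0, len(entities), MAX_CUSTOM_ENTITIES)
    ]


# 60 placeholder entities end up in three rules.
entities = [{"name": f"shape_{i}", "pattern": r"\d+"} for i in range(60)]
rules = split_into_rules(entities)
print([len(rule["custom_entities"]) for rule in rules])  # [25, 25, 10]
```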
The pattern must compile
pattern is a Go RE2 regex — linear-time, no backreferences. The
validator rejects a pattern that doesn’t compile, with the offending
entity named in the error.
checksum is a closed set
Only "" (no check) and "luhn" are accepted. Anything else ("sha256", "mod10", even "LUHN") is rejected on save.
Names are unique and well-formed
A name must start with a letter and use only lowercase ASCII letters, digits, and underscores. Two custom entities in one rule can’t share a name.
5. Per-entity action overrides
A custom entity participates in the same entity_actions map as a
built-in entity. One pii rule can mask most things but block on
a high-sensitivity custom detector — reference the entity by its name:
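As a rough illustration, the override could look like the Python dict below. Only entity_actions and the custom-entity fields come from this page; the other keys ("type", "action", "entities", "custom_entities"), the built-in identifiers, the claim_id entity, and the action_for helper are assumptions made for the sketch.

```python
# Hypothetical pii rule: mask by default, block on one custom entity.
rule = {
    "type": "pii",
    "action": "mask",                # rule-level default action
    "entities": ["email", "phone"],  # built-in entities enabled on the rule
    "custom_entities": [
        {"name": "employee_id", "pattern": r"\bEMP\d{6}\b"},
        {"name": "claim_id", "pattern": r"\bCLM-\d{8}\b"},
    ],
    "entity_actions": {"claim_id": "block"},  # keyed by the entity's name
}


def action_for(rule: dict, entity_name: str) -> str:
    """Resolve an entity's action: its override if present, else the rule default."""
    return rule["entity_actions"].get(entity_name, rule["action"])


print(action_for(rule, "email"))        # mask
print(action_for(rule, "employee_id"))  # mask
print(action_for(rule, "claim_id"))     # block
```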
entity_actions must reference a built-in entity enabled on the
rule or a custom entity’s name; values must be block, mask, flag,
or annotate. The validator rejects anything else.
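The save-time checks from sections 4 and 5 can be approximated locally before you paste a rule into the console. The sketch below is not the gateway's validator: the rule layout ("entities", "custom_entities") and the error strings are hypothetical, and Python's re.compile is only a stand-in for Go RE2, so it will accept some patterns (backreferences, for example) that RE2 rejects.

```python
import re

NAME_RE = re.compile(r"[a-z][a-z0-9_]*")  # starts with a letter, then a-z, 0-9, _
ALLOWED_CHECKSUMS = {"", "luhn"}
ALLOWED_ACTIONS = {"block", "mask", "flag", "annotate"}
MAX_CUSTOM_ENTITIES = 25


def validate_rule(rule: dict) -> list:
    """Return a list of error strings; an empty list means every check passed."""
    errors = []
    customs = rule.get("custom_entities", [])
    if len(customs) > MAX_CUSTOM_ENTITIES:
        errors.append("more than 25 custom entities in one rule")
    seen = set()
    for ent in customs:
        name = ent.get("name", "")
        if not NAME_RE.fullmatch(name):
            errors.append(f"{name!r}: malformed name")
        if name in seen:
            errors.append(f"{name!r}: duplicate name")
        seen.add(name)
        pattern = ent.get("pattern", "")
        if not pattern:
            errors.append(f"{name!r}: pattern is required")
        else:
            try:
                re.compile(pattern)  # approximation of the RE2 compile check
            except re.error as exc:
                errors.append(f"{name!r}: pattern does not compile ({exc})")
        if ent.get("checksum", "") not in ALLOWED_CHECKSUMS:
            errors.append(f"{name!r}: checksum must be '' or 'luhn'")
    # entity_actions may target enabled built-ins or custom entity names.
    known = set(rule.get("entities", [])) | seen
    for target, action in rule.get("entity_actions", {}).items():
        if target not in known:
            errors.append(f"entity_actions: unknown entity {target!r}")
        if action not in ALLOWED_ACTIONS:
            errors.append(f"entity_actions: invalid action {action!r}")
    return errors


good = {
    "entities": ["email"],
    "custom_entities": [{"name": "employee_id", "pattern": r"\bEMP\d{6}\b"}],
    "entity_actions": {"employee_id": "block"},
}
bad = {
    "custom_entities": [{"name": "Employee", "pattern": "(", "checksum": "LUHN"}],
    "entity_actions": {"ghost": "redact"},
}
print(validate_rule(good))  # []
for message in validate_rule(bad):
    print(message)
```

Returning a list of messages instead of raising on the first failure lets you fix every problem in one pass.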
6. Where to go next
PII Shield
The single masking rule custom entities layer onto — the built-in
detector set and the typed mask tags.
Masking formats
How each entity renders on a mask action, and how mask_with overrides it.
Regex detectors
When a plain regex rule fits better than a typed PII entity.
Tune false positives
Use the Matches feed and the checksum to dial in precision.
