Custom PII entities - OrcaRouter

The built-in pii detector covers the common entities — email, phone, credit card, SSN, IBAN, JWT, cloud keys. But your sensitive data is yours: employee IDs, internal case numbers, customer account references, a partner’s order format. A custom PII entity lets you teach the same masking rule to recognize those shapes, so the gateway redacts them before the model — or any downstream tool — ever sees them. This page covers the one thing custom entities add to the PII detection rule: your own detectors. For the full engine — every rule type, stage, and route — see the Guardrails reference.

Every step here is a console action on the hosted gateway (api.orcarouter.ai). You author the guardrail under your own session; only the final /v1/* call uses an sk-orca-... relay key. Creating and editing guardrails requires the Developer role or higher in the workspace.

1. When you need a custom PII detector LLM guardrail

The built-in entity set is closed and shared across the engine, the validator, and the rule builder. It is the right tool for standard identifiers. Reach for a custom entity when the data you want to redact has a predictable shape that no built-in covers:

Internal identifiers

Employee IDs (EMP482915), case numbers, ticket refs, internal SKUs — anything with a fixed prefix and digit run.

Account & order numbers

Customer account references or a partner’s order format that should never reach a third-party model verbatim.

Checksummed numbers

Card-like or membership numbers that pass a Luhn check — add the checksum to cut false positives on look-alike digit runs.

Domain-specific codes

Policy numbers, claim IDs, device serials — any token your industry treats as sensitive but the generic detectors don’t know.

A custom entity layers on top of the built-in set inside one pii rule. It detects matches and applies the rule’s action — mask, block, or flag — exactly like a built-in entity does.

2. Anatomy of a custom entity

A custom entity has two required fields, name and pattern, plus an optional checksum and an optional mask tag. You add them in the pii rule editor under Custom entities:

Field	Required	What it does
`name`	yes	Stable ID, e.g. `employee_id`. Lowercase ASCII / digits / `_`, must start with a letter. Flows into the Matches feed and audit logs.
`pattern`	yes	A Go RE2 regex (linear-time, no backreferences). Must compile.
`checksum`	no	`luhn` validates each match with the Luhn algorithm. Only `""` (none) or `"luhn"` are accepted.
`mask_with`	no	Verbatim replacement on a mask action. Defaults to `[<UPPERCASE_NAME>]`.

The name follows the same key convention as the rest of the gateway — lowercase, starts with a letter, no spaces or hyphens. Pick a clear one (case_number, partner_order_id); it is what you’ll see in the Matches feed when the rule fires.

The optional Luhn checksum

Many “number-shaped” identifiers, such as payment cards and some membership and account numbers, carry a Luhn (mod-10) check digit. A bare regex like \d{16} matches any 16-digit run, including phone numbers, timestamps, and order totals. Setting checksum: "luhn" makes the detector fire only when the matched digits also pass the Luhn check, so look-alike runs that fail it are left unmasked and your false-positive rate drops: a random digit run passes Luhn only about one time in ten. Leave it empty for non-checksummed tokens like an employee ID.
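To make the check concrete, here is a minimal Python sketch of the Luhn algorithm itself. It is an illustration of the standard mod-10 computation, not the gateway's implementation; the function name luhn_valid is hypothetical.

```python
def luhn_valid(digits: str) -> bool:
    """Return True if a string of ASCII digits passes the Luhn (mod-10) check."""
    if not digits.isascii() or not digits.isdigit():
        return False
    total = 0
    # Walk right to left, doubling every second digit.
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:
            value *= 2
            if value > 9:
                value -= 9  # same as adding the two digits of the product
        total += value
    return total % 10 == 0


print(luhn_valid("4111111111111111"))  # widely used test card number: True
print(luhn_valid("1234567812345678"))  # arbitrary 16-digit run: False
```

Both inputs match \d{16}, but only the first would be reported by a detector with the checksum enabled.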

Your own mask tag

On a mask action, a built-in email renders as [EMAIL]. A custom entity renders as [<UPPERCASE_NAME>] by default — an employee_id match becomes [EMPLOYEE_ID]. Set mask_with to override that verbatim (e.g. <id> or ***) when you want a specific replacement token in the text the model receives. See Masking formats for the rendering rules across entity types.
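The default-versus-override rule can be sketched in a few lines of Python. The helper name mask_tag is hypothetical, and treating an empty mask_with the same as an absent one is an assumption of this sketch rather than documented behavior.

```python
def mask_tag(entity: dict) -> str:
    """Replacement text for a match: mask_with if set, else [<UPPERCASE_NAME>]."""
    override = entity.get("mask_with")
    if override:  # assumption: an empty string falls back to the default tag
        return override
    return f"[{entity['name'].upper()}]"


print(mask_tag({"name": "employee_id"}))                       # [EMPLOYEE_ID]
print(mask_tag({"name": "employee_id", "mask_with": "<id>"}))  # <id>
```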

3. One concrete example

Suppose your prompts carry employee IDs in the form EMP followed by six digits, and you want them masked at the input stage so the upstream model never sees a real ID. Add a single custom entity to a pii rule:

{
  "type": "pii",
  "stage": "input",
  "action": "mask",
  "entities": ["email"],
  "custom_entities": [
    {
      "name": "employee_id",
      "pattern": "EMP\\d{6}",
      "mask_with": "[EMPLOYEE_ID]"
    }
  ]
}

That rule masks both standard emails and your employee IDs in the same pass. Test it in the Test tab before attaching a key:

Forward EMP482915's note to jane@acme.com

Forward [EMPLOYEE_ID]'s note to [EMAIL]

Nothing is sent upstream and nothing is metered. Then attach the guardrail to a key (see Attach to a key) and call /v1/chat/completions exactly as before — the gateway masks the request before forwarding, with no SDK change.
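If you want to sanity-check a pattern locally before opening the Test tab, the rule above can be approximated in Python. This is a rough sketch under stated assumptions: Python's re module stands in for Go RE2 (the two agree on simple patterns like this one but are not identical), the EMAIL regex is a placeholder because the gateway's built-in email detector is not documented here, and apply_mask is a hypothetical helper, not a gateway API.

```python
import re

# Placeholder for the built-in email detector (assumption, not the gateway's pattern).
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

RULE = {
    "type": "pii",
    "stage": "input",
    "action": "mask",
    "entities": ["email"],
    "custom_entities": [
        {"name": "employee_id", "pattern": "EMP\\d{6}", "mask_with": "[EMPLOYEE_ID]"}
    ],
}


def apply_mask(rule: dict, text: str) -> str:
    """Mask custom entities first, then the (stand-in) built-in email entity."""
    for entity in rule.get("custom_entities", []):
        pattern = re.compile(entity["pattern"], re.ASCII)  # ASCII digits only, like RE2's \d
        tag = entity.get("mask_with") or f"[{entity['name'].upper()}]"
        # A function replacement inserts the tag verbatim, with no escape processing.
        text = pattern.sub(lambda _match, tag=tag: tag, text)
    if "email" in rule.get("entities", []):
        text = EMAIL.sub("[EMAIL]", text)
    return text


print(apply_mask(RULE, "Forward EMP482915's note to jane@acme.com"))
# Forward [EMPLOYEE_ID]'s note to [EMAIL]
```

The Test tab remains the authoritative check, since it runs the real engine.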

Masking runs at both stages: an input rule redacts the request before the model sees it, and an output rule redacts the model’s reply — including streaming responses, where the scanner rewrites matches in-band. Block actions are enforced on both stages too. To gate model responses, see Output-stage rules.

A checksummed example

For a card-like membership number, add the Luhn check so 16-digit runs that aren’t valid numbers don’t match:

{
  "name": "member_card",
  "pattern": "\\d{16}",
  "checksum": "luhn",
  "mask_with": "[MEMBER_CARD]"
}
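The effect of pairing the pattern with the checksum can be sketched in Python as follows. As before, Python's re stands in for Go RE2, the helper names are hypothetical, and the sketch assumes the pattern matches digits only; how the gateway treats non-digit characters inside a checksummed match is not covered here.

```python
import re

MEMBER_CARD = {
    "name": "member_card",
    "pattern": "\\d{16}",
    "checksum": "luhn",
    "mask_with": "[MEMBER_CARD]",
}


def luhn_valid(digits: str) -> bool:
    """Luhn (mod-10) check over a string that contains only digits."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:  # every second digit from the right is doubled
            value = value * 2 - 9 if value > 4 else value * 2
        total += value
    return total % 10 == 0


def mask_checksummed(entity: dict, text: str) -> str:
    """Replace only the matches that also pass the entity's checksum."""
    pattern = re.compile(entity["pattern"], re.ASCII)

    def replace(match: re.Match) -> str:
        if entity.get("checksum") == "luhn" and not luhn_valid(match.group()):
            return match.group()  # look-alike run: leave the text untouched
        return entity["mask_with"]

    return pattern.sub(replace, text)


print(mask_checksummed(MEMBER_CARD, "Card 4111111111111111, order 1234567812345678"))
# Card [MEMBER_CARD], order 1234567812345678
```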

4. Limits and validation

The rule builder validates every custom entity on save — a bad detector never reaches the hot path.

Up to 25 custom entities per rule

Each custom entity is a regex scan over the full text, so the per-rule cap is 25. The cap keeps the hot path linear; compiled patterns are cached across requests. Need more shapes? Split them across multiple pii rules in the same guardrail.

The pattern must compile

pattern is a Go RE2 regex — linear-time, no backreferences. The validator rejects a pattern that doesn’t compile, with the offending entity named in the error.

checksum is a closed set

Only "" (no check) and "luhn" are accepted. Anything else — "sha256", "mod10", even "LUHN" — is rejected on save.

Names are unique and well-formed

A name must start with a letter and use only lowercase ASCII, digits, and underscores. Two custom entities in one rule can’t share a name.
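A pre-flight check along the lines of these four rules can be sketched in Python, for example to lint entity definitions kept in version control. This is not the rule builder's validator: the function and constant names are hypothetical, and Python's re.compile is only a stand-in for RE2, so it accepts some constructs (such as backreferences) that the gateway would reject.

```python
import re

MAX_CUSTOM_ENTITIES = 25
NAME_RE = re.compile(r"[a-z][a-z0-9_]*")
ALLOWED_CHECKSUMS = {"", "luhn"}


def validate_custom_entities(entities: list) -> list:
    """Return a list of human-readable errors; an empty list means the set looks valid."""
    errors = []
    if len(entities) > MAX_CUSTOM_ENTITIES:
        errors.append(f"too many custom entities: {len(entities)} > {MAX_CUSTOM_ENTITIES}")
    seen = set()
    for entity in entities:
        name = entity.get("name", "")
        if not NAME_RE.fullmatch(name):
            errors.append(f"{name!r}: invalid name")
        elif name in seen:
            errors.append(f"{name!r}: duplicate name")
        seen.add(name)

        pattern = entity.get("pattern", "")
        if not pattern:
            errors.append(f"{name!r}: pattern is required")
        else:
            try:
                re.compile(pattern)  # stand-in for an RE2 compile
            except re.error as exc:
                errors.append(f"{name!r}: pattern does not compile: {exc}")

        if entity.get("checksum", "") not in ALLOWED_CHECKSUMS:
            errors.append(f"{name!r}: checksum must be \"\" or \"luhn\"")
    return errors


for problem in validate_custom_entities(
    [{"name": "Employee-ID", "pattern": "EMP(\\d{6}", "checksum": "LUHN"}]
):
    print(problem)
```

The sample entity breaks three rules at once (name, pattern, checksum), so the loop prints three errors.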

5. Per-entity action overrides

A custom entity participates in the same entity_actions map as a built-in entity. One pii rule can mask most things but block on a high-sensitivity custom detector — reference the entity by its name:

{
  "type": "pii",
  "stage": "input",
  "action": "mask",
  "entities": ["email", "phone"],
  "custom_entities": [
    { "name": "ssn_internal", "pattern": "ID-\\d{9}", "checksum": "luhn" }
  ],
  "entity_actions": {
    "ssn_internal": "block"
  }
}

Keys in entity_actions must reference a built-in entity enabled on the rule or a custom entity’s name; values must be block, mask, flag, or annotate. The validator rejects anything else.
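One way to picture the override is as a lookup that falls back to the rule-level action. The Python sketch below mirrors the rule above; effective_action and validate_entity_actions are hypothetical helpers written for illustration, not gateway functions.

```python
ALLOWED_ACTIONS = {"block", "mask", "flag", "annotate"}

RULE = {
    "type": "pii",
    "stage": "input",
    "action": "mask",
    "entities": ["email", "phone"],
    "custom_entities": [
        {"name": "ssn_internal", "pattern": "ID-\\d{9}", "checksum": "luhn"}
    ],
    "entity_actions": {"ssn_internal": "block"},
}


def validate_entity_actions(rule: dict) -> list:
    """Check that every key names an enabled entity and every value is a known action."""
    known = set(rule.get("entities", []))
    known.update(entity["name"] for entity in rule.get("custom_entities", []))
    errors = []
    for entity, action in rule.get("entity_actions", {}).items():
        if entity not in known:
            errors.append(f"{entity!r}: not an enabled built-in or custom entity")
        if action not in ALLOWED_ACTIONS:
            errors.append(f"{entity!r}: unknown action {action!r}")
    return errors


def effective_action(rule: dict, entity: str) -> str:
    """Per-entity override if present, otherwise the rule-level action."""
    return rule.get("entity_actions", {}).get(entity, rule["action"])


print(effective_action(RULE, "ssn_internal"))  # block
print(effective_action(RULE, "email"))         # mask
```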

6. Where to go next

PII Shield

The single masking rule custom entities layer onto — the built-in detector set and the typed mask tags.

Masking formats

How each entity renders on a mask action, and how mask_with overrides it.

Regex detectors

When a plain regex rule fits better than a typed PII entity.

Tune false positives

Use the Matches feed and the checksum to dial in precision.

Read the Guardrails reference for the complete PII rule — every field, the eval harness, and the full API — or Create your first guardrail to walk the end-to-end loop from scratch.
