PII anonymizer

When an investigation calls a cloud LLM (anything via OpenRouter: OpenAI, Anthropic, Google, etc.), the prompt necessarily contains parts of your raw events: log lines with usernames, internal hostnames, IP addresses, sometimes email addresses. The model rarely needs the real values to reason about an incident, and most of them are personal data.

The PII anonymizer wraps every cloud LLM call: it redacts this data before the prompt leaves your server and restores it in the model's response, on your server only.

The flow

sequenceDiagram
    participant A as AnalystAgent
    participant ANON as PII Anonymizer
    participant LLM as Cloud LLM<br/>(OpenRouter)

    A->>ANON: anonymize(prompt)
    Note right of ANON: replace identifiers<br/>with tokens
    ANON-->>A: (anonymized_prompt, mapping)

    A->>LLM: anonymized_prompt
    LLM-->>A: response (uses tokens)

    A->>ANON: deanonymize(response, mapping)
    ANON-->>A: response (real values restored)

    A->>A: persist & display

The mapping never leaves the server. The cloud model only ever sees opaque tokens like <PRIVATE_PERSON_1>, <PRIVATE_IP_2>, <PRIVATE_EMAIL_3>.
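
The round trip in the diagram can be sketched in a few lines. This is an illustration under stated assumptions, not Fenrir's actual code: `ask_cloud_llm`, `fake_anonymize` and `fake_llm` are hypothetical names, and the token pattern is inferred from the examples above.

```python
import re
from typing import Callable, Dict, Tuple

Mapping = Dict[str, str]  # placeholder token -> original value
# Token shape inferred from the examples in this page (an assumption).
TOKEN_RE = re.compile(r"<[A-Z_]+_\d+>")


def deanonymize(text: str, mapping: Mapping) -> str:
    """Restore real values wherever the model echoed a known token."""
    # Single pass; tokens missing from the mapping are left untouched.
    return TOKEN_RE.sub(lambda m: mapping.get(m.group(0), m.group(0)), text)


def ask_cloud_llm(
    prompt: str,
    anonymize: Callable[[str], Tuple[str, Mapping]],
    call_llm: Callable[[str], str],
) -> str:
    safe_prompt, mapping = anonymize(prompt)  # mapping stays in this process
    response = call_llm(safe_prompt)          # only tokens cross the network
    return deanonymize(response, mapping)


# Stand-ins for demonstration only: a hard-coded "detector" and a fake model.
def fake_anonymize(prompt: str) -> Tuple[str, Mapping]:
    return (
        prompt.replace("10.0.0.7", "<PRIVATE_IP_1>"),
        {"<PRIVATE_IP_1>": "10.0.0.7"},
    )


def fake_llm(prompt: str) -> str:
    assert "10.0.0.7" not in prompt  # the raw value must never arrive here
    return "Block <PRIVATE_IP_1> at the firewall."


print(ask_cloud_llm("Failed login from 10.0.0.7", fake_anonymize, fake_llm))
# prints: Block 10.0.0.7 at the firewall.
```

Passing the anonymizer and the LLM client in as callables keeps the mapping local to one call, so it cannot leak into another request by accident.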

What's redacted

Fenrir uses OpenAI Privacy Filter (opf) — a 1.5B-parameter context-aware token classifier (Apache 2.0 licensed, runs locally on CPU). It's purpose-built to find PII spans in arbitrary text.

The labels detected (and replaced with <LABEL_N> tokens) include:

  • private_person — names of natural people
  • private_organization — company names
  • private_email — email addresses
  • private_phone — phone numbers
  • private_ip — IP addresses (when they look like personal devices)
  • private_address — postal addresses
  • private_url — URLs that may contain identifiers
  • private_id — generic IDs (national ID, SSN, etc.)

The classifier is context-aware — it correctly handles cases where the same string is PII in one context but not in another (e.g. "Alice" in "Alice was born in 1990" vs "Alice in Wonderland").

Identical values share a token

If the same value appears multiple times in a prompt (e.g. an attacker IP mentioned three times in a log excerpt), it gets the same placeholder. This lets the LLM reason about one entity instead of three independent strings:

Original:    Failed login from 1.2.3.4. Then 1.2.3.4 tried /wp-admin. Then 1.2.3.4 left.
Anonymized:  Failed login from <PRIVATE_IP_1>. Then <PRIVATE_IP_1> tried /wp-admin. Then <PRIVATE_IP_1> left.

The LLM's reasoning is preserved. After deanonymization, every token in the model's response is replaced by the original value it stood for.
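
One way to get this behaviour is to key the token table on the detected value, so a repeated value looks up the token it was given the first time. The sketch below assumes the detector returns `(start, end, label)` spans and that tokens use one running counter; both are assumptions for illustration, and `replace_spans` is a hypothetical helper, not part of opf.

```python
import re
from typing import Dict, List, Tuple

Span = Tuple[int, int, str]  # (start, end, label); this shape is an assumption


def replace_spans(text: str, spans: List[Span]) -> Tuple[str, Dict[str, str]]:
    """Swap each detected span for a token; equal values reuse one token."""
    token_for: Dict[Tuple[str, str], str] = {}  # (label, value) -> token
    pieces: List[str] = []
    cursor = 0
    for start, end, label in sorted(spans):  # assumes non-overlapping spans
        value = text[start:end]
        key = (label, value)
        if key not in token_for:
            # One running counter across all labels (an assumed scheme).
            token_for[key] = f"<{label.upper()}_{len(token_for) + 1}>"
        pieces.append(text[cursor:start])
        pieces.append(token_for[key])
        cursor = end
    pieces.append(text[cursor:])
    mapping = {token: value for (_, value), token in token_for.items()}
    return "".join(pieces), mapping


log = "Failed login from 1.2.3.4. Then 1.2.3.4 tried /wp-admin. Then 1.2.3.4 left."
# Stand-in for the classifier: locate the one IP we already know is there.
spans = [(m.start(), m.end(), "private_ip") for m in re.finditer(re.escape("1.2.3.4"), log)]
anonymized, mapping = replace_spans(log, spans)
print(anonymized)
print(mapping)
```

The returned mapping is token to value, which is the direction needed to restore the response later.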

When the anonymizer runs

def should_anonymize() -> bool:
    return bool(settings.openrouter_api_key)

The anonymizer is active when OpenRouter is configured and bypassed when only Ollama is configured: with a local LLM the data never leaves the host, so there is nothing to redact.

This is intentional: the cost of anonymization is ~100ms CPU per call, which is wasted if you're already running fully local.

The anonymizer is an optional dependency. Without it, Fenrir falls back to sending raw prompts to the cloud LLM (with a warning logged).
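
Combining the two rules above gives three possible outcomes per call. The following is a minimal sketch of that decision, assuming the package is importable as `opf`; the `Settings` class, the `redaction_mode` function and the logger name are hypothetical stand-ins for Fenrir's own.

```python
import importlib.util
import logging
from dataclasses import dataclass

log = logging.getLogger("fenrir.anonymizer")  # logger name is illustrative


@dataclass
class Settings:
    # Stand-in for the real settings object; only this field matters here.
    openrouter_api_key: str = ""


def anonymizer_available(module_name: str = "opf") -> bool:
    """True if the optional dependency can be imported (without importing it)."""
    return importlib.util.find_spec(module_name) is not None


def redaction_mode(settings: Settings, module_name: str = "opf") -> str:
    if not settings.openrouter_api_key:
        return "bypass"      # local-only LLM: nothing leaves the host
    if anonymizer_available(module_name):
        return "anonymize"   # cloud LLM and the filter is installed
    log.warning("PII anonymizer not installed; sending raw prompts to the cloud LLM")
    return "raw"             # cloud LLM, no filter: fall back with a warning


print(redaction_mode(Settings()))  # prints: bypass
```

Checking with `find_spec` avoids loading the model just to find out whether it is installed.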

To install on the production server:

sudo -u p3guardian /opt/p3guardian/.venv/bin/pip install opf
# or, from the repo:
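# ~p3guardian expands to the service user's home; a bare ~ would expand to the invoking user's home
sudo -u p3guardian git clone https://github.com/pierluigipisanti/privacy-filter ~p3guardian/privacy-filter
sudo -u p3guardian /opt/p3guardian/.venv/bin/pip install -e ~p3guardian/privacy-filter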

The first call downloads the 1.5B-parameter checkpoint (~3 GB) to ~/.opf/privacy_filter/. After that, loading the model takes ~30s on a cold start, and each redaction takes ~100ms in steady state.

Why this matters for compliance

GDPR Article 6 requires a lawful basis for any processing of personal data, which includes handing it to a third-party processor. Sending raw security-event logs to OpenAI / Anthropic / Google typically means the data is processed outside the EEA, which brings in the transfer rules (Article 44 onward) and may require a DPIA (Article 35).

With the anonymizer:

  • The detected personal data never leaves your server. The cloud processor sees only pseudonymous tokens, so there is a strong argument that no personal data is transferred in the GDPR sense.
  • The processor cannot re-identify the data subjects. The mapping table is on your server only.
  • Even if the cloud provider were breached, the leaked data would be pseudonymous.

This is recognized as pseudonymization under Article 4(5) GDPR — explicitly listed in Article 32(1)(a) as an appropriate technical measure for security of processing.

We're not lawyers — get a real DPO involved for production deployments — but this is the same reasoning that lets European banks use US-hosted LLMs at all.

Limitations

  • English/Italian focused. The model was primarily trained on Western European languages. Other languages and scripts may have lower recall.
  • Not perfect. The model can miss unusual identifiers (e.g. obscure naming patterns, internal codes that look like words). Treat it as a strong defense-in-depth, not a guarantee.
  • CPU only by default. GPU inference is supported (~10x faster) if you have one. For most Fenrir deployments CPU is fine.
  • No URL parameters. Query strings that carry tokens (?session_id=abc123) are not redacted by default; catching them requires URL-aware preprocessing on top.
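
The URL-aware preprocessing mentioned in the last item could look roughly like this. It is a sketch only: the key list, the `REDACTED` marker and the function name `redact_query_params` are assumptions, and unlike the token scheme above this redaction is one-way, so the original values are not restored in the response.

```python
import re

# Parameter names treated as sensitive; extend to match your own logs.
SENSITIVE_KEYS = ("session_id", "token", "api_key", "password")

# Matches "?key=" or "&key=" for a sensitive key, then the value up to the
# next "&", "#", quote or whitespace. Works on relative and absolute URLs.
_PARAM_RE = re.compile(
    r"([?&](?:%s)=)[^&\s#\"']*" % "|".join(re.escape(k) for k in SENSITIVE_KEYS),
    re.IGNORECASE,
)


def redact_query_params(text: str) -> str:
    """Blank out the values of sensitive query parameters in free text."""
    return _PARAM_RE.sub(r"\g<1>REDACTED", text)


line = "GET /login?session_id=abc123&next=/home HTTP/1.1"
print(redact_query_params(line))
# prints: GET /login?session_id=REDACTED&next=/home HTTP/1.1
```

Running this before the prompt reaches the classifier means the model-based pass and the rule-based pass cover for each other.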