AI Investigations overview¶
When a ThreatEvent lands in the database with severity >= HIGH, the next thing that happens is an autonomous Tier-1 investigation. This is what turns Fenrir from "another tool that pings you" into a tool that actually does the work.
The pattern is borrowed from SOCfortress Talon (an AI SOC analyst for the CoPilot stack), reimplemented in Python on top of Fenrir's existing tool catalog.
What an investigation looks like end-to-end¶
sequenceDiagram
participant RE as Rule Engine
participant D as Dispatcher
participant A as AnalystAgent
participant LLM as LLM (cloud or local)
participant T as Tools
participant DB as Database
participant TG as Telegram
RE->>D: ThreatEvent (HIGH/CRITICAL)
D->>D: dedupe check, concurrency cap
D->>A: spawn investigation
A->>A: load playbook for category
A->>LLM: prompt = persona + playbook + event
LLM-->>A: tool calls (geoip, search_logs, ...)
A->>T: execute tools
T-->>A: results
A->>LLM: feed results back
LLM-->>A: more tools or final_report
A->>A: parse final_report JSON
A->>DB: persist Job + Report + IOCs
A->>T: execute low-risk auto-actions
A->>TG: send verdict alert
Typical wall-clock time: 30-90 seconds per investigation, dominated by LLM round-trips.
The three artifacts¶
Every investigation produces three rows in the database:
investigation_jobs¶
The lifecycle: open → investigating → done (or failed). Includes the playbook used, the number of tool-loop rounds, started/completed timestamps.
investigation_reports¶
The verdict:
verdict: "confirmed_threat" | "false_positive" | "inconclusive"
confidence: int # 0-100
summary: str # 2-4 sentences
recommended_actions: list[str]
auto_actions_taken: list[dict] # what the agent already executed
raw_llm_output: str # full transcript for audit
investigation_iocs¶
Extracted indicators of compromise. One row per IOC:
type: "ip" | "hash" | "domain" | "url" | "cve" | "file_path" | "user"
value: str
verdict: "malicious" | "suspicious" | "clean"
score: int # 0-100
enrichment: dict # provider-specific (VT score, abuseipdb, etc.)
Trigger rules¶
A ThreatEvent triggers an investigation when:
severity >= HIGH— the dispatcher's hardcoded floor.(category, ip)hasn't already been investigated within the dedupe window — prevents storm-investigations on a sustained attack.- The dispatcher's concurrency cap isn't reached — keeps LLM costs bounded.
Both the dedupe window and the concurrency cap are tunable in code; production values are documented in the Hardening checklist.
Anything below HIGH (LOW, MEDIUM) is logged but not investigated. You can still ask the AI bot interactively about it via Telegram (@fenrir_bot what was the spike at 14:00 UTC?), and the bot will use the same tool catalog to investigate on demand.
Model choice¶
By default, Fenrir uses Qwen 3.5 Flash via OpenRouter — fast, cheap (~$0.0001 per investigation), good enough for routine HIGH events.
For sensitive deployments or premium tiers, swap to:
| Model | Use case | Approx. cost per investigation |
|---|---|---|
qwen/qwen3.5-flash-02-23 (default) |
Routine, cost-sensitive | ~$0.0001 |
anthropic/claude-haiku-4.7 |
Better reasoning, still cheap | ~$0.001 |
anthropic/claude-opus-4-7 |
Best reasoning, audit-grade | ~$0.01-0.03 |
local Ollama |
Air-gapped, sovereign | $0 (compute cost) |
Set via OPENROUTER_MODEL in .env. No code change required.
Tiered service offering
A common Fenrir deployment pattern: route low-severity HIGH to Qwen, route CRITICAL events to Claude. Implement by branching on event.severity before calling the dispatcher. Get in touch if you want this configured for your tier.
What the analyst can and can't do¶
The agent has access to a read-only and low-risk tool catalog:
run_command— restricted shell (norm, nokill, noshutdown— see tools.py)check_disk,check_memory,check_processes,check_connectionssearch_logs,check_logsgeoip_lookup,threat_intel_lookupban_ip,unban_ip(the only "destructive" tools, gated by auto-action rules)
It cannot:
- Restart services (must request via human approval)
- Modify firewall rules outside fail2ban
- Apply package updates
- Rotate credentials, kill users, or delete data
When the analyst thinks something destructive is needed, it goes into recommended_actions for a human to review, not into auto_actions_suggested.
Read on:
- Playbooks — the per-category investigation guides
- Auto-actions — what the agent can do unsupervised
- PII anonymizer — privacy-preserving cloud LLM calls