Autonomous Response (Sprint 2)

The autonomous responder extends the original fail2ban-only banning with a small set of revertible host-level actions, gated by AI confidence and operator-controlled allowlists.

Disabled by default — and dry-run on first enable

Autonomous response is OFF on a fresh install (AUTO_ACTION_ENABLED=false). When you first enable it, leave AUTO_ACTION_DRY_RUN=true for at least one full week and review every entry on /autonomous before flipping dry-run off. A wrong kill on sshd is unrecoverable from the dashboard.

What it does

When the AI analyst returns verdict=confirmed_threat with confidence >= AUTO_ACTION_MIN_CONFIDENCE (default 85), and the suggested tool name matches one of the supported action types, the dispatcher in p3guardian/responders/autonomous.py runs the matching action module.

| action_type | Effect | Reverts via |
| --- | --- | --- |
| kill_process | SIGTERM (3s grace) → SIGKILL on a PID | informational only (kill is irreversible) |
| file_quarantine | mv to AUTO_ACTION_QUARANTINE_DIR (default /var/quarantine), chmod 000 | move back + restore mode |
| service_stop | systemctl stop <name> | systemctl start <name> |
| isolate_network | iptables -I OUTPUT -d <ip> -j DROP (heavy-handed; opt-in) | iptables -D OUTPUT ... |
| package_rollback | apt-get -y remove <pkg> (heavy-handed; opt-in) | apt-get -y install <pkg> (best-effort) |

Every attempt — executed, simulated, or skipped — is persisted to the autonomous_actions table with the LLM verdict, confidence, target, reason, output, and (where applicable) the data needed to revert.
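The file_quarantine row and its revert can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's implementation: QuarantineRecord, quarantine_file, and revert_quarantine are hypothetical names, and the real autonomous_actions schema may store revert data differently.

```python
import os
import shutil
import stat
from dataclasses import dataclass
from pathlib import Path


@dataclass
class QuarantineRecord:
    """Hypothetical revert data for one file_quarantine action."""
    original_path: str
    quarantine_path: str
    original_mode: int


def quarantine_file(path: str, quarantine_dir: str = "/var/quarantine") -> QuarantineRecord:
    """Move a file into the quarantine directory and strip all permissions."""
    src = Path(os.path.abspath(path))
    qdir = Path(quarantine_dir)
    qdir.mkdir(mode=0o700, parents=True, exist_ok=True)  # created on first use

    original_mode = stat.S_IMODE(src.stat().st_mode)  # remember mode for revert
    dest = qdir / src.name
    if dest.exists():
        # A real implementation would need a collision-free naming scheme.
        raise FileExistsError(f"{dest} already exists in quarantine")

    shutil.move(str(src), str(dest))
    os.chmod(dest, 0o000)
    return QuarantineRecord(str(src), str(dest), original_mode)


def revert_quarantine(record: QuarantineRecord) -> bool:
    """Restore the file and its mode; return False if revert is not possible."""
    src = Path(record.quarantine_path)
    dest = Path(record.original_path)
    if not src.exists() or dest.exists():
        return False  # quarantined copy gone, or original path taken again
    os.chmod(src, record.original_mode)  # restore mode before moving back
    shutil.move(str(src), str(dest))
    return True
```

Recording the original mode at quarantine time is what makes the revert possible; without it, the mode would be lost once chmod 000 is applied.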

Safety rails

The dispatcher rejects an action before any side-effect when:

  1. AUTO_ACTION_ENABLED is false. Master switch.
  2. action_type is not in AUTO_ACTION_TYPES. By default isolate_network and package_rollback are not in that list — they require an explicit override.
  3. confidence < AUTO_ACTION_MIN_CONFIDENCE.
  4. The target hits AUTO_ACTION_PROCESS_WHITELIST. For kill_process we resolve the PID's /proc/<pid>/comm; for service_stop we strip the .service suffix and compare; for package_rollback we compare the package name. Defaults: sshd, systemd, systemd-journald, init, postgres, postgresql, p3guardian, cloudflared, nginx, fail2ban-server, dbus, NetworkManager.
  5. AUTO_ACTION_DRY_RUN is true. In this case the action is simulated rather than rejected: the row is persisted with dry_run=True and a synthetic "would have called X" output, but nothing executes.
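The five checks above can be sketched as a single gate function. This is an illustrative sketch, not the code in p3guardian/responders/autonomous.py: AutoActionConfig, is_whitelisted, and gate are hypothetical names, and the returned strings are made up for the example. One detail is worth noting: Linux truncates /proc/<pid>/comm to 15 characters, so a 16-character whitelist entry such as systemd-journald shows up there as systemd-journal. The sketch assumes the comparison for kill_process should account for that; whether the real dispatcher does is not stated here.

```python
from dataclasses import dataclass

COMM_MAX = 15  # Linux truncates /proc/<pid>/comm to 15 characters

DEFAULT_WHITELIST = frozenset({
    "sshd", "systemd", "systemd-journald", "init", "postgres", "postgresql",
    "p3guardian", "cloudflared", "nginx", "fail2ban-server", "dbus",
    "NetworkManager",
})


@dataclass(frozen=True)
class AutoActionConfig:
    """Hypothetical container mirroring the AUTO_ACTION_* defaults."""
    enabled: bool = False
    dry_run: bool = True
    min_confidence: int = 85
    action_types: frozenset = frozenset(
        {"kill_process", "file_quarantine", "service_stop"}
    )
    whitelist: frozenset = DEFAULT_WHITELIST


def is_whitelisted(cfg: AutoActionConfig, action_type: str, name: str) -> bool:
    """name is the comm value, unit name, or package name of the target."""
    if action_type == "kill_process":
        # Compare against entries truncated the same way the kernel does.
        return any(name == entry[:COMM_MAX] for entry in cfg.whitelist)
    if action_type == "service_stop" and name.endswith(".service"):
        name = name[: -len(".service")]
    if action_type in ("service_stop", "package_rollback"):
        return name in cfg.whitelist
    return False


def gate(cfg: AutoActionConfig, action_type: str, confidence: int, name: str) -> str:
    """Apply the safety rails in order; no side effects happen here."""
    if not cfg.enabled:
        return "skipped: disabled"
    if action_type not in cfg.action_types:
        return "skipped: action type not allowlisted"
    if confidence < cfg.min_confidence:
        return "skipped: low confidence"
    if is_whitelisted(cfg, action_type, name):
        return "skipped: whitelisted target"
    if cfg.dry_run:
        return "simulated"
    return "execute"
```

Evaluating dry-run last means a dry-run week surfaces only the actions that would actually have run live, which is what the review on /autonomous needs.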

Reverting an action

Three paths:

  • Telegram: every successful live action is announced with an ↩️ Revert inline button.
  • Web UI: /autonomous lists every action with a Revert button on the rows that are still revertible.
  • API: POST /api/autonomous-actions/{id}/revert with {"reason": "..."}.
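For the API path, a revert call could look roughly like the sketch below, using only the Python standard library. The base URL and the function names are assumptions for illustration. Any authentication the endpoint requires is not described here and is therefore not shown.

```python
import json
import urllib.request


def build_revert_request(base_url: str, action_id: int, reason: str) -> urllib.request.Request:
    """Build the POST request for reverting one autonomous action."""
    url = f"{base_url.rstrip('/')}/api/autonomous-actions/{action_id}/revert"
    body = json.dumps({"reason": reason}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def revert_action(base_url: str, action_id: int, reason: str) -> str:
    """Send the request and return the raw response body."""
    req = build_revert_request(base_url, action_id, reason)
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.read().decode("utf-8")
```

A call such as revert_action("http://localhost:8000", 42, "false positive") would then revert action 42, where the host, port, and ID are placeholders.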

Revert is best-effort and per-action-type:

  • kill_process is not restartable from here — the row is informational. If the killed process was a daemon, restart it manually with systemctl or whichever supervisor owns it.
  • file_quarantine restores the file and its original mode if the quarantine path still has it and the original path is free.
  • service_stop runs systemctl start.
  • isolate_network deletes the corresponding iptables rule.
  • package_rollback runs apt-get install <pkg>. The previous version may no longer be in the apt cache, so the reinstall gets whatever version the repositories currently offer.

Environment variables

| Variable | Default | Purpose |
| --- | --- | --- |
| AUTO_ACTION_ENABLED | false | Master switch for the Sprint 2 dispatcher |
| AUTO_ACTION_DRY_RUN | true | If true, log + persist what would happen but don't execute |
| AUTO_ACTION_MIN_CONFIDENCE | 85 | LLM confidence threshold (0-100) |
| AUTO_ACTION_TYPES | ["kill_process","file_quarantine","service_stop"] | Allowlisted action types. Add isolate_network / package_rollback only when you've audited the code paths |
| AUTO_ACTION_PROCESS_WHITELIST | see above | Process / service / package names that are never targeted, even with high confidence |
| AUTO_ACTION_QUARANTINE_DIR | /var/quarantine | Where quarantined files go (mode 0o700, created on first use) |
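As a rough illustration of how these variables might be read, the sketch below parses them from a mapping such as os.environ. It assumes AUTO_ACTION_TYPES is a JSON-encoded list, as the default value suggests, and it deliberately fails safe: anything other than an explicit false value keeps dry-run on. The function name load_settings and the accepted boolean spellings are assumptions, not the project's actual parser.

```python
import json
import os
from typing import Mapping, Optional

KNOWN_ACTIONS = {
    "kill_process", "file_quarantine", "service_stop",
    "isolate_network", "package_rollback",
}
TRUTHY = {"1", "true", "yes", "on"}
FALSY = {"0", "false", "no", "off"}
DEFAULT_TYPES = '["kill_process","file_quarantine","service_stop"]'


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Read the AUTO_ACTION_* variables, falling back to the documented defaults."""
    env = os.environ if env is None else env

    # Enabled only on an explicit truthy value.
    enabled = env.get("AUTO_ACTION_ENABLED", "false").strip().lower() in TRUTHY
    # Dry-run stays on unless explicitly switched off (fail-safe choice of this sketch).
    dry_run = env.get("AUTO_ACTION_DRY_RUN", "true").strip().lower() not in FALSY

    min_confidence = int(env.get("AUTO_ACTION_MIN_CONFIDENCE", "85"))
    if not 0 <= min_confidence <= 100:
        raise ValueError("AUTO_ACTION_MIN_CONFIDENCE must be in 0-100")

    action_types = json.loads(env.get("AUTO_ACTION_TYPES", DEFAULT_TYPES))
    unknown = set(action_types) - KNOWN_ACTIONS
    if unknown:
        raise ValueError(f"unknown action types: {sorted(unknown)}")

    return {
        "enabled": enabled,
        "dry_run": dry_run,
        "min_confidence": min_confidence,
        "action_types": action_types,
        "quarantine_dir": env.get("AUTO_ACTION_QUARANTINE_DIR", "/var/quarantine"),
    }
```

Rejecting unknown action types at load time catches a typo in AUTO_ACTION_TYPES before it silently disables an action.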

Operational checklist before flipping AUTO_ACTION_DRY_RUN=false

  • One full week of dry-run rows reviewed — no surprises.
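  • AUTO_ACTION_PROCESS_WHITELIST covers every long-running daemon on the host (run systemctl list-units --type=service --state=running --no-legend --plain | awk '{print $1}' and add anything critical). The --no-legend and --plain flags keep the header, footer, and status glyphs out of the output.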
  • The p3guardian service user has sudo rights only for the minimal set of binaries needed (fail2ban-client, systemctl, iptables/ip6tables if you enabled isolate_network, apt-get if you enabled package_rollback). Use a /etc/sudoers.d/p3guardian snippet — do not grant ALL=(ALL) NOPASSWD: ALL.
  • Telegram bot token + chat configured — you want the Revert buttons reachable in under 30 seconds.
  • Operator on call has access to the host as a separate account that can revert manually if p3guardian itself somehow gets impaired.
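The whitelist review in the checklist can be partly automated. The sketch below lists running services and reports those whose unit name, minus the .service suffix, is not in a given whitelist. The helper names are hypothetical, and the output is only a starting point for a manual decision about what counts as critical.

```python
import subprocess
from typing import Iterable, List


def parse_units(text: str) -> List[str]:
    """Extract unit names (without .service) from systemctl list-units output."""
    names = []
    for line in text.splitlines():
        fields = line.split()
        if fields and fields[0].endswith(".service"):
            names.append(fields[0][: -len(".service")])
    return names


def running_services() -> List[str]:
    """Ask systemd for the currently running service units."""
    result = subprocess.run(
        ["systemctl", "list-units", "--type=service", "--state=running",
         "--no-legend", "--plain"],
        capture_output=True, text=True, check=True,
    )
    return parse_units(result.stdout)


def uncovered(services: Iterable[str], whitelist: Iterable[str]) -> List[str]:
    """Return running services that the whitelist does not cover."""
    return sorted(set(services) - set(whitelist))
```

On Debian and Ubuntu the OpenSSH unit is named ssh.service, so a check like this would report ssh as uncovered even when sshd is whitelisted. Whether that matters depends on how the dispatcher resolves unit aliases, which is worth verifying on the host during the dry-run week.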

See also: Hardening checklist.