Alert Explanation
AlertExplanationService turns an alert + its surrounding events + the affected entity’s UEBA baseline into a plain-English explanation an analyst can act on.
When it runs
Only when something triggers it:
- Dashboard analyst clicks Explain on an alert detail page
- POST /api/v1/alerts/{alert_id}/explain is called from a script
The service is not invoked automatically on alert fire, because that would burn tokens on noise.
The result is cached (LRU, size set by llm.explanation_cache_size, default 256). Subsequent calls for the same alert return instantly.
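The caching behaviour described above can be sketched as a small LRU map keyed by alert ID. This is a minimal illustration, not the service's actual implementation: the class name ExplanationCache and its methods are hypothetical, and only the default size of 256 comes from this page.

```python
from collections import OrderedDict
from typing import Optional


class ExplanationCache:
    """Tiny LRU cache keyed by alert ID (hypothetical name, for illustration)."""

    def __init__(self, max_size: int = 256) -> None:
        self.max_size = max_size
        self._items = OrderedDict()

    def get(self, alert_id: str) -> Optional[dict]:
        if alert_id not in self._items:
            return None
        # A hit makes the entry the most recently used one.
        self._items.move_to_end(alert_id)
        return self._items[alert_id]

    def put(self, alert_id: str, explanation: dict) -> None:
        self._items[alert_id] = explanation
        self._items.move_to_end(alert_id)
        # Evict least recently used entries once over capacity.
        while len(self._items) > self.max_size:
            self._items.popitem(last=False)
```

With a cache like this, a second Explain click on the same alert is a dictionary lookup rather than a model call, which is why repeat requests return quickly.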
What the LLM sees
The service builds a redacted context from three sources:
- The alert — title, rule, severity, MITRE tags, entities, dedup_count, risk_score
- A ±N-second event window around the alert timestamp (events involving the same entities only)
- The entity’s UEBA baseline — warm-up status, top templates, source-IP spread — if available
PII-prone fields (raw message, OS users beyond the alert’s named entities) are summarized rather than passed verbatim.
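As a rough sketch of the event-window step, the filter below keeps events that fall within ±N seconds of the alert and share at least one entity with it. The Event shape, the function name select_event_window, and the 60-second default are assumptions made for illustration; the page does not state the real value of N or the event schema, and the redaction step is not shown.

```python
from dataclasses import dataclass


@dataclass
class Event:
    # Hypothetical event shape; the real schema is not shown in this page.
    timestamp_ns: int
    entities: frozenset
    message: str


def select_event_window(events, alert_ts_ns, alert_entities, window_s=60):
    """Keep events within +/- window_s of the alert that share an entity with it."""
    half_width_ns = window_s * 1_000_000_000
    wanted = set(alert_entities)
    return [
        e for e in events
        if abs(e.timestamp_ns - alert_ts_ns) <= half_width_ns
        and wanted & set(e.entities)
    ]
```

Filtering on shared entities keeps the prompt small and avoids sending unrelated hosts or users to the model.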
What the LLM returns

    {
      "alert_id": "alr_01HZX...",
      "summary": "Short one-paragraph explanation",
      "root_cause": "Why this likely happened",
      "next_steps": [
        "Check ...",
        "Block ...",
        "Page on-call if ..."
      ],
      "confidence": "high",
      "model": "claude-sonnet-4-6",
      "latency_ms": 1840,
      "generated_at_ns": "1715619300000000000"
    }

The parser.py step is strict: malformed LLM output is rejected, not silently surfaced. A failed parse propagates to the client as HTTP 502, and the dashboard surfaces a “regenerate” button.
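To illustrate what strict parsing can look like, here is a minimal sketch that rejects anything not matching the expected shape. It is not the contents of parser.py: the function and exception names are hypothetical, the split between model-produced fields (summary, root_cause, next_steps, confidence) and service-added fields is an assumption, and so is the set of allowed confidence values, since this page only shows "high".

```python
import json

REQUIRED_STR_FIELDS = ("summary", "root_cause", "confidence")
# Assumed value set; only "high" appears in the documented example.
ALLOWED_CONFIDENCE = {"low", "medium", "high"}


class ExplanationParseError(ValueError):
    """Raised when model output does not match the expected shape."""


def parse_explanation(raw: str) -> dict:
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ExplanationParseError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ExplanationParseError("top-level value must be an object")
    for field in REQUIRED_STR_FIELDS:
        value = data.get(field)
        if not isinstance(value, str) or not value.strip():
            raise ExplanationParseError(f"missing or empty field: {field}")
    if data["confidence"] not in ALLOWED_CONFIDENCE:
        raise ExplanationParseError("unexpected confidence value")
    steps = data.get("next_steps")
    if not isinstance(steps, list) or not all(isinstance(s, str) for s in steps):
        raise ExplanationParseError("next_steps must be a list of strings")
    return data
```

A route handler would catch the parse error and translate it into the 502 response, so a half-formed answer never reaches the analyst.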
REST API
| Method | Path | Purpose |
|---|---|---|
| POST | /api/v1/alerts/{alert_id}/explain | Generate (or return cached) explanation. Async — guarded by a per-request wall-clock timeout. |
| GET | /api/v1/alerts/{alert_id}/explanation | Fetch the cached explanation, if any. 404 when never generated. |
Both endpoints require llm.backend to be non-empty; otherwise the routes return 503 Service Unavailable.
Configuration

    llm:
      backend: ollama
      ollama_url: http://localhost:11434
      ollama_model: phi4-mini

      # Explanation-specific (defaults shown)
      explanation_cache_size: 256

The wall-clock guard uses ollama_timeout_s / cloud_timeout_s depending on the active backend. On timeout the route returns 502 and nothing is cached, so a retry will hit the backend again.
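The timeout behaviour can be sketched with asyncio.wait_for. This is an illustration under assumptions rather than the route's real code: explain_with_timeout, the generate callable, the plain-dict cache, and the error body are all hypothetical; only the 502 status and the "nothing is cached on timeout" rule come from this page.

```python
import asyncio


async def explain_with_timeout(alert_id, cache, generate, timeout_s):
    """Return (status_code, body). `generate` is a hypothetical async backend call."""
    cached = cache.get(alert_id)
    if cached is not None:
        return 200, cached
    try:
        explanation = await asyncio.wait_for(generate(alert_id), timeout=timeout_s)
    except asyncio.TimeoutError:
        # Nothing is stored here, so the next call goes to the backend again.
        return 502, {"error": "explanation backend timed out"}
    cache[alert_id] = explanation
    return 200, explanation
```

Writing to the cache only after a successful call means a slow or failed backend never poisons later requests for the same alert.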
Cost control
- The cache means repeat clicks on the same alert are free.
- Set explanation_cache_size higher (e.g. 2048) if you have many analysts triaging the same incidents.
- For sensitive logs, use backend: llama_cpp or backend: ollama. Both are local and incur no per-token cost.