Architecture Overview

Log Sources → OTel Collector → Drain3 (template extraction) → Feature Engineering → Anomaly Detection → Alerting

The OpenTelemetry Collector serves as the ingestion gateway. It supports:

  • CloudWatch, GCP Logging, Azure Monitor
  • Syslog (UDP/TCP)
  • File tailing
  • Kafka
  • OTLP (gRPC/HTTP)

Drain3 performs streaming log template extraction. It reduces millions of raw log lines to thousands of structured patterns in real time using a fixed-depth parse tree.
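To make the idea of template extraction concrete, here is a minimal, standard-library sketch. It is not Drain3's implementation or API: the class `TinyTemplateMiner`, the `<*>` wildcard, and the 0.5 similarity threshold are assumptions chosen for illustration. It keeps only the core idea of a fixed-depth lookup (route by token count, then by first token) followed by a similarity match against the templates in that bucket.

```python
from collections import defaultdict

WILDCARD = "<*>"  # assumed placeholder for variable tokens


class TinyTemplateMiner:
    """Toy Drain-style miner: fixed two-level routing, then similarity matching."""

    def __init__(self, sim_threshold=0.5):
        self.sim_threshold = sim_threshold
        # (token_count, first_token) -> list of templates (each a list of tokens)
        self.buckets = defaultdict(list)

    def add(self, line):
        """Feed one log line and return the template it was assigned to."""
        tokens = line.split()
        if not tokens:
            return ""
        # A leading token with digits is probably a parameter, so route it
        # through the wildcard instead of creating one bucket per value.
        head = WILDCARD if any(c.isdigit() for c in tokens[0]) else tokens[0]
        bucket = self.buckets[(len(tokens), head)]
        for template in bucket:
            same = sum(1 for a, b in zip(template, tokens) if a == b)
            if same / len(tokens) >= self.sim_threshold:
                # Generalise in place: positions that differ become wildcards.
                for i, (a, b) in enumerate(zip(template, tokens)):
                    if a != b:
                        template[i] = WILDCARD
                return " ".join(template)
        # No sufficiently similar template: this line starts a new one.
        bucket.append(list(tokens))
        return " ".join(tokens)


miner = TinyTemplateMiner()
lines = [
    "Connection from 10.0.0.1 closed",
    "Connection from 10.0.0.2 closed",
    "User alice logged in",
    "User bob logged in",
]
templates = [miner.add(line) for line in lines]
print(templates[1])  # Connection from <*> closed
print(templates[3])  # User <*> logged in
```

Each line is processed once and the state is updated immediately, which is what allows this style of miner to run on a stream rather than on a stored batch.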

All detection algorithms are online (streaming): each one updates with every event and needs no batch retraining.

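| Algorithm        | Detects            | Use Case                 |
|------------------|--------------------|--------------------------|
| Half-Space Trees | Content anomalies  | Unusual log patterns     |
| Holt-Winters     | Volume anomalies   | Traffic spikes/drops     |
| CUSUM            | Change points      | Regime shifts            |
| Markov chains    | Sequence anomalies | Unusual event ordering   |
| DSPOT            | Auto-thresholds    | Self-tuning alert levels |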
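As an illustration of what "updates with every event" means, the following is a minimal two-sided CUSUM sketch. The class name `Cusum` and the parameters `target`, `slack`, and `threshold` are assumptions made for this example, and the reference level is fixed here for simplicity; this is a textbook form of the technique, not the project's detector.

```python
class Cusum:
    """Two-sided CUSUM change detector with a fixed, assumed-known reference level."""

    def __init__(self, target, slack, threshold):
        self.target = target        # expected level of the signal
        self.slack = slack          # deviations smaller than this are ignored
        self.threshold = threshold  # alarm once a cumulative sum exceeds this
        self.pos = 0.0              # accumulated upward drift
        self.neg = 0.0              # accumulated downward drift

    def update(self, x):
        """Consume one observation; return True if a change point is flagged."""
        self.pos = max(0.0, self.pos + (x - self.target) - self.slack)
        self.neg = max(0.0, self.neg + (self.target - x) - self.slack)
        if self.pos > self.threshold or self.neg > self.threshold:
            # Reset after an alarm so the next shift is measured afresh.
            self.pos = self.neg = 0.0
            return True
        return False


detector = Cusum(target=100.0, slack=5.0, threshold=30.0)
# Hypothetical per-minute log volume: stable around 100, then a jump to about 120.
counts = [101, 99, 102, 98, 100, 120, 121, 119, 122, 120]
alarms = [i for i, c in enumerate(counts) if detector.update(c)]
print(alarms)  # [6, 8]
```

The small fluctuations around 100 never accumulate because of the slack term, while the sustained shift trips the alarm on its second reading. Because the reference level stays at 100 in this sketch, the detector keeps re-alarming while the new level persists; a production detector would typically re-estimate the level after a change.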

Entity-centric graph built on igraph (40-250x faster than NetworkX):

  • Entity types: users, IPs, hosts, processes, files, domains
  • Three strategies: entity-temporal window joins, risk accumulation, graph-structural
  • Temporal watermarking for event ordering
  • pySigma with 3,000+ SigmaHQ rules
  • MITRE ATT&CK framework mapping
  • Kill-chain tracking across entities
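The risk accumulation strategy listed above can be sketched as a per-entity sliding window of scores. Everything in this example is an assumption for illustration: the class `RiskAccumulator`, the 300-second window, the alert threshold, and the entity key format. It also assumes events arrive in timestamp order, which is the property that watermarking is meant to provide.

```python
from collections import defaultdict, deque


class RiskAccumulator:
    """Sum risk scores per entity over a sliding time window."""

    def __init__(self, window_seconds, alert_threshold):
        self.window = window_seconds
        self.threshold = alert_threshold
        # entity -> deque of (timestamp, score), oldest first
        self.history = defaultdict(deque)

    def add(self, entity, timestamp, score):
        """Record one scored signal; return (current_total, should_alert)."""
        events = self.history[entity]
        events.append((timestamp, score))
        # Evict contributions that have fallen out of the window.
        while events and events[0][0] <= timestamp - self.window:
            events.popleft()
        total = sum(s for _, s in events)
        return total, total >= self.threshold


acc = RiskAccumulator(window_seconds=300, alert_threshold=10)
# Hypothetical (entity, timestamp_seconds, risk_score) signals.
signals = [
    ("host:web-1", 0, 4),
    ("user:alice", 10, 3),
    ("host:web-1", 60, 4),
    ("host:web-1", 120, 4),
    ("user:alice", 400, 3),
]
alerts = [(e, t) for e, t, s in signals if acc.add(e, t, s)[1]]
print(alerts)  # [('host:web-1', 120)]
```

No single signal crosses the threshold here; the alert fires only because three low-scoring signals land on the same entity within the window, which is the point of accumulating risk per entity rather than per event.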

The core SeerflowEvent struct unifies four log schema standards:

  • OpenTelemetry LogRecord
  • Elastic Common Schema (ECS)
  • OCSF numeric taxonomy
  • Sigma logsource categories
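A unified event of this kind might look roughly like the dataclass below. This is a hypothetical layout, not the actual SeerflowEvent definition: the class name `UnifiedEvent` and every field name are assumptions, chosen to echo terms from the four standards (OpenTelemetry's severity number and body, ECS `event.category`, OCSF `class_uid`, and Sigma's `category`/`product` logsource keys). The sample values are illustrative only.

```python
from dataclasses import dataclass, field


@dataclass
class UnifiedEvent:  # hypothetical stand-in, NOT the real SeerflowEvent
    # OpenTelemetry LogRecord-style fields
    timestamp_ns: int
    severity_number: int
    body: str
    attributes: dict = field(default_factory=dict)
    # ECS-style categorisation (mirrors event.category)
    ecs_event_category: str = ""
    # OCSF numeric taxonomy (mirrors class_uid)
    ocsf_class_uid: int = 0
    # Sigma logsource keys
    sigma_category: str = ""
    sigma_product: str = ""

    def matches_logsource(self, category=None, product=None):
        """True if every Sigma logsource key that is given equals this event's value."""
        if category is not None and category != self.sigma_category:
            return False
        if product is not None and product != self.sigma_product:
            return False
        return True


event = UnifiedEvent(
    timestamp_ns=1_700_000_000_000_000_000,
    severity_number=9,                 # INFO in the OpenTelemetry severity scale
    body="powershell.exe started by explorer.exe",
    attributes={"process.name": "powershell.exe"},
    ecs_event_category="process",
    ocsf_class_uid=1007,               # illustrative value
    sigma_category="process_creation",
    sigma_product="windows",
)
print(event.matches_logsource(category="process_creation"))                   # True
print(event.matches_logsource(category="process_creation", product="linux"))  # False
```

Carrying all four vocabularies on one record means a rule engine can select events by Sigma logsource while storage and analytics code reads the same record through its OCSF or ECS fields, without a per-standard conversion step.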

Storage is accessed through protocol-based interfaces, so the backend can be switched with one config line:

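| Protocol    | Methods                                    | Purpose              |
|-------------|--------------------------------------------|----------------------|
| LogStore    | write_events, query_events, search_text    | Event persistence    |
| AlertStore  | write_alert, query_alerts, update_feedback | Alert management     |
| ModelStore  | save_state, load_state                     | ML model checkpoints |
| EntityStore | get_timeline, get_related                  | Entity exploration   |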

Backends: SQLite (zero-config default) and PostgreSQL (production scale).
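A protocol-based store can be sketched with Python's `typing.Protocol`. Only the three `LogStore` method names come from the text above; the signatures, the event shape (`entity` and `message` keys), the `SQLiteLogStore` class, the `BACKENDS` registry, and the `make_log_store` factory are all assumptions made for this example.

```python
import sqlite3
from typing import Protocol


class LogStore(Protocol):
    """Structural interface; the signatures here are assumed, not the project's."""

    def write_events(self, events: list[dict]) -> None: ...
    def query_events(self, entity: str) -> list[dict]: ...
    def search_text(self, needle: str) -> list[dict]: ...


class SQLiteLogStore:
    """Satisfies LogStore structurally; no inheritance is needed."""

    def __init__(self, path: str = ":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS events (entity TEXT, message TEXT)"
        )

    def write_events(self, events):
        self.conn.executemany(
            "INSERT INTO events (entity, message) VALUES (:entity, :message)", events
        )
        self.conn.commit()

    def query_events(self, entity):
        rows = self.conn.execute(
            "SELECT entity, message FROM events WHERE entity = ?", (entity,)
        )
        return [{"entity": e, "message": m} for e, m in rows]

    def search_text(self, needle):
        rows = self.conn.execute(
            "SELECT entity, message FROM events WHERE instr(message, ?) > 0",
            (needle,),
        )
        return [{"entity": e, "message": m} for e, m in rows]


# Hypothetical registry: a PostgreSQL class would be added under its own key.
BACKENDS = {"sqlite": SQLiteLogStore}


def make_log_store(backend: str) -> LogStore:
    """Pick the implementation from a single configuration value."""
    return BACKENDS[backend]()


store = make_log_store("sqlite")
store.write_events([
    {"entity": "user:alice", "message": "login failed"},
    {"entity": "user:bob", "message": "login ok"},
])
hits = store.search_text("failed")
print(hits)  # [{'entity': 'user:alice', 'message': 'login failed'}]
```

Because callers depend only on the `LogStore` protocol, swapping the backend means changing the string passed to the factory; no calling code needs to know which database is behind it.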