Prometheus alerting rules for eBPF, SNMP, WireGuard, Cilium and cert-manager added to awesome-prometheus-alerts

I maintain awesome-prometheus-alerts, a collection of production-ready Prometheus alerting rules. I just added a batch of rules for low-level system and network monitoring:

eBPF (cloudflare/ebpf_exporter)
- Program load failures
- Map allocation errors
- Decoder config issues

SNMP
- Interface operational status
- Bandwidth utilization
- Interface error/discard rate

WireGuard
- Peer last handshake age: fires when a peer has not completed a handshake in more than 3 minutes, which reliably catches dropped tunnels without noisy flapping
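
To make the handshake-age rule concrete, here is a minimal sketch of what such a rule could look like. It is not copied from the collection: the metric name `wireguard_latest_handshake_seconds` and the `public_key` label are assumptions based on MindFlavor/prometheus_wireguard_exporter, and the alert name and severity are made up for illustration. Check the names your exporter actually exposes before using it.

```yaml
groups:
  - name: wireguard
    rules:
      # Hypothetical alert name; the collection may name it differently.
      - alert: WireguardPeerHandshakeStale
        # Assumed metric: Unix timestamp of the peer's most recent handshake.
        # time() minus that timestamp is the handshake age in seconds.
        expr: time() - wireguard_latest_handshake_seconds > 180
        for: 0m
        labels:
          severity: warning
        annotations:
          summary: "WireGuard peer handshake is stale (instance {{ $labels.instance }})"
          description: "Peer {{ $labels.public_key }} has not completed a handshake for more than 3 minutes."
```

One caveat with this sketch: if the exporter reports 0 for a peer that has never completed a handshake, the expression fires for that peer as well. Whether that is desirable depends on your setup.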

Cilium
- Policy enforcement drop rate
- BPF map pressure
- Endpoint health

cert-manager
- Certificate expiry warnings
- Renewal and ACME failure detection

All rules are plain YAML with no dependencies beyond the respective exporters.

-> https://samber.github.io/awesome-prometheus-alerts

If you spot anything wrong in the PromQL or have better thresholds for your environment, issues and PRs are welcome.

submitted by /u/samuelberthe