Prometheus alerting rules for eBPF, SNMP, WireGuard, Cilium and cert-manager added to awesome-prometheus-alerts
I maintain awesome-prometheus-alerts, a collection of production-ready Prometheus alerting rules. Just added a batch of rules relevant to low-level system and network monitoring:
eBPF (cloudflare/ebpf_exporter) - Program load failures - Map allocation errors - Decoder config issues
SNMP - Interface operational status - Bandwidth utilization - Interface error/discard rate
WireGuard - Peer last handshake age: fires when a peer hasn't been seen in >3 minutes, which reliably catches dropped tunnels without noisy flapping
Cilium - Policy enforcement drop rate - BPF map pressure - Endpoint health
cert-manager - Certificate expiry warnings - Renewal and ACME failure detection
All rules are plain YAML, no dependencies beyond the respective exporters.
-> https://samber.github.io/awesome-prometheus-alerts
If you spot anything wrong in the PromQL or have better thresholds for your environment, issues and PRs welcome.
[link] [comments]