
Alertmanager — alert when an alert is stuck in pending

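An Alertmanager alert stays pending past its for-window: it should be active, but nothing fires. Likely causes are an oversized group_wait, a broken notifier, or a misconfigured route. Nobody gets paged. Note that "pending" is the state Prometheus reports while the for-window elapses; Alertmanager's v2 API reports each alert as unprocessed, active, or suppressed.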

Recipe

bash
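#!/usr/bin/env bash
# Install as /opt/am-stuck.sh and schedule it from /etc/cron.d/am-stuck:
# */5 * * * * root /opt/am-stuck.sh

# Fail on curl/jq errors so an unreachable Alertmanager is not reported as "OK".
set -euo pipefail

AM=${AM_URL:-http://localhost:9093}
THRESH_MIN=${THRESH_MIN:-15}
# Alertmanager's v2 API reports state as unprocessed, active, or suppressed;
# "pending" exists only on the Prometheus side and never matches here.
# Defaulting to "unprocessed" is an assumption; override STATE as needed.
STATE=${STATE:-unprocessed}
: "${HEARTBEAT_URL:?HEARTBEAT_URL must be set}"

NOW=$(date -u +%s)
# Collect alerts in $STATE that are older than THRESH_MIN minutes.
# Fractional seconds are stripped from startsAt before parsing.
STUCK=$(curl -fsS "$AM/api/v2/alerts" \
  | jq --argjson now "$NOW" --argjson max "$THRESH_MIN" --arg state "$STATE" '
      [.[] | select(.status.state == $state) |
       {name: .labels.alertname,
        age_min: (($now - (.startsAt | sub("\\.[0-9]+Z$"; "Z") | fromdateiso8601)) / 60)} |
       select(.age_min > $max)]')

COUNT=$(jq 'length' <<<"$STUCK")

if [ "$COUNT" -gt 0 ]; then
  # Report up to three examples, comma-separated, with no trailing comma.
  EXAMPLES=$(jq -r '[.[:3][] | "\(.name)=\(.age_min|floor)m"] | join(",")' <<<"$STUCK")
  curl -fsS "$HEARTBEAT_URL" --data "pending=$COUNT,examples=$EXAMPLES"
  exit 2
fi
echo "OK (no stuck $STATE alerts)"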
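
The age filter can be checked offline before the script is put on cron. The sketch below is not part of the recipe: the function name stuck_alerts, the sample alert names, and the fixed timestamp are all hypothetical, and the payload only imitates the shape of a GET /api/v2/alerts response. The state to match is passed in as an argument, so the same filter can be tried against whichever state value your Alertmanager actually returns.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Reads an alerts array on stdin; prints the alerts in the given state
# that are older than max_min minutes, as compact JSON.
stuck_alerts() {
  local now=$1 max_min=$2 state=$3
  jq -c --argjson now "$now" --argjson max "$max_min" --arg state "$state" '
    [.[] | select(.status.state == $state) |
     {name: .labels.alertname,
      age_min: (($now - (.startsAt | sub("\\.[0-9]+Z$"; "Z") | fromdateiso8601)) / 60)} |
     select(.age_min > $max)]'
}

# Hypothetical sample payload (illustrative names, not real alerts).
SAMPLE='[
  {"labels":{"alertname":"DiskFull"},"status":{"state":"unprocessed"},"startsAt":"2024-01-01T00:00:00.000Z"},
  {"labels":{"alertname":"HighLoad"},"status":{"state":"active"},"startsAt":"2024-01-01T00:00:00.000Z"},
  {"labels":{"alertname":"Fresh"},"status":{"state":"unprocessed"},"startsAt":"2024-01-01T00:25:00.000Z"}
]'

# Fixed "now" of 2024-01-01T00:30:00Z keeps the result reproducible.
printf '%s' "$SAMPLE" | stuck_alerts 1704069000 15 unprocessed
# prints [{"name":"DiskFull","age_min":30}]
```

Only DiskFull is reported: HighLoad is in a different state, and Fresh is 5 minutes old, below the 15-minute threshold.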

Same thing in Enterno.io

Wrap the check in an Enterno heartbeat: a meta-monitor for Alertmanager that catches alerts that fire but are never delivered, before the on-call engineer notices.
