Kubernetes — alert on pods restarting > N times in a window
A pod is stuck in CrashLoopBackOff in one namespace: kubectl shows a restart count of 47, but nobody notices. The goal is an endpoint that returns "high" when the counter jumps.
Recipe
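#!/usr/bin/env bash
# Compares each pod's current restart count against the snapshot saved by the
# previous run. Schedule the script every N minutes to get an N-minute window.
NS="${NS:-default}"
THRESHOLD="${THRESHOLD:-3}"
STATE_DIR="${STATE_DIR:-/var/lib/enterno/k8s-restarts}"
mkdir -p "$STATE_DIR"
# One line per pod: "<name> <restartCount of each container...>".
# If kubectl fails, report an error instead of a misleading "ok".
CUR=$(kubectl get pods -n "$NS" \
  -o jsonpath='{range .items[*]}{.metadata.name} {.status.containerStatuses[*].restartCount}{"\n"}{end}') \
  || { echo "error: kubectl failed"; exit 1; }
OUT="ok"
while read -r POD COUNTS; do
  [ -n "$POD" ] || continue                        # skip blank lines (no pods in the namespace)
  CNT=0
  for c in $COUNTS; do CNT=$(( CNT + c )); done    # sum restarts across all containers
  PREV_FILE="$STATE_DIR/$POD.txt"
  PREV=$(cat "$PREV_FILE" 2>/dev/null || echo 0)   # no snapshot yet: baseline is 0
  echo "$CNT" > "$PREV_FILE"
  DELTA=$(( CNT - PREV ))
  if [ "$DELTA" -ge "$THRESHOLD" ]; then
    OUT="high $POD restarted ${DELTA}× since last check"
  fi
done <<< "$CUR"
echo "$OUT"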
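The recipe prints a status line but does not say how that line becomes an endpoint. One possible approach, sketched here under assumptions, is to let cron run the check and write the result to a file that any static web server already serves. The function name publish_status and the paths in the usage note are hypothetical, not part of the original recipe.

```shell
#!/usr/bin/env bash
# publish_status CMD OUT_FILE
# Runs CMD and atomically replaces OUT_FILE with its output.
# (Hypothetical helper; CMD is the restart-check script from the recipe.)
publish_status() {
  local cmd="$1" out_file="$2" tmp
  # Create the temp file next to the target so the final rename stays on one filesystem.
  tmp=$(mktemp "${out_file}.XXXXXX") || return 1
  # If the check itself fails, publish an error line so the monitor never reads a stale "ok".
  if ! "$cmd" > "$tmp" 2>/dev/null; then
    echo "error: check failed" > "$tmp"
  fi
  chmod 644 "$tmp"
  mv -f "$tmp" "$out_file"   # rename, so readers never see a half-written file
}
```

A cron job could then call, for example, publish_status /usr/local/bin/k8s-restarts.sh /var/www/health/k8s-restarts.txt every minute or two (both paths are placeholders). Writing to a temp file and renaming it means the web server always serves either the old or the new status, never a partial one.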
Same thing in Enterno.io
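Serve the script's output from an HTTP endpoint and add an Enterno HTTP monitor that checks for the keyword "ok" on a 1-2 minute interval; when the keyword goes missing, you are paged within 60 seconds. Pro+ stores the snapshots, so you can see which pod is the culprit.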
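Before configuring the monitor, it can help to check the endpoint by hand. The sketch below is a local stand-in for a keyword check, not a description of how Enterno implements it. The function names and the URL in the usage note are hypothetical. It deliberately requires the body to be exactly "ok", on the assumption that a substring match could also pass on a "high" line whose pod name happens to contain "ok".

```shell
#!/usr/bin/env bash
# keyword_ok BODY: succeed only if the response body is exactly "ok".
keyword_ok() {
  [ "$1" = "ok" ]
}

# check_endpoint URL: fetch URL and print "up" or "down: <reason>".
check_endpoint() {
  local body
  # -f: treat HTTP errors as failures, -sS: quiet but still print errors,
  # --max-time: give up after 10 seconds instead of hanging.
  body=$(curl -fsS --max-time 10 "$1") || { echo "down: request failed"; return 1; }
  if keyword_ok "$body"; then
    echo "up"
  else
    echo "down: $body"
    return 1
  fi
}
```

For example, check_endpoint https://status.example.com/k8s-restarts.txt (a placeholder URL) prints "up" while the script reports "ok" and exits non-zero otherwise, so it can also be used in a shell pipeline or a CI smoke test.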
Related recipes
Readiness probes pass inside the pod, but no one sees that the LB refused to route traffic to the new deploy.
Ensure your site returns 2xx every minute, alert to Slack/Telegram on failure.
Minimal script that checks an SSL certificate and alerts 14 days before expiry.