
kubelet — alert when a node goes NotReady

A node goes NotReady when its kubelet stops reporting to the apiserver or the container runtime is sick. Pods on it linger like zombies until the node.kubernetes.io/not-ready taint evicts them, which by default takes about five minutes (tolerationSeconds: 300). And Kubernetes events do not go to Slack by default.
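You can check both signals by hand; `node1` below is a placeholder for one of your node names:

```bash
# Ready condition as reported by the kubelet: True, False, or Unknown
kubectl get node node1 -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}'

# The eviction taint the node lifecycle controller adds on NotReady
kubectl get node node1 -o jsonpath='{.spec.taints}'
```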

Recipe

```bash
#!/usr/bin/env bash
# /opt/kubelet-notready.sh, run from /etc/cron.d/kubelet-notready:
# */2 * * * * root /opt/kubelet-notready.sh
# (cron's PATH is minimal: use absolute paths to kubectl/jq if needed)
set -euo pipefail

CONTEXT=${KUBE_CONTEXT:-prod}
: "${HEARTBEAT_URL:?export HEARTBEAT_URL in the cron environment}"

# Fetch the node list once so the count and the names can't disagree
NODES=$(kubectl --context "$CONTEXT" get nodes -o json)

# Count nodes whose Ready condition isn't True (i.e. False or Unknown)
NOTREADY=$(jq '[.items[]
  | select(.status.conditions[] | select(.type=="Ready") | .status != "True")]
  | length' <<<"$NODES")

if [ "${NOTREADY:-0}" -gt 0 ]; then
  # Same jq filter for the names; awk on the STATUS column would also
  # flag cordoned nodes, which show Ready,SchedulingDisabled
  NAMES=$(jq -r '[.items[]
    | select(.status.conditions[] | select(.type=="Ready") | .status != "True")
    | .metadata.name] | join(",")' <<<"$NODES")
  curl -fsS "$HEARTBEAT_URL" --data "notready=$NOTREADY,nodes=$NAMES"
  exit 2
fi
echo "OK (all nodes Ready)"
```

Same thing in Enterno.io

Wrap the cron job in an Enterno heartbeat: the alert lands straight in Telegram, and the history ("node1 went NotReady 3 times this week") points at a hardware issue rather than a transient blip.
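A minimal sketch of the cron file; the heartbeat URL is a placeholder, paste the one from your Enterno dashboard:

```bash
# /etc/cron.d/kubelet-notready
HEARTBEAT_URL=https://example.invalid/your-heartbeat-id
KUBE_CONTEXT=prod
*/2 * * * * root /opt/kubelet-notready.sh
```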
