Kubernetes — alert when a PVC is stuck Pending
PVC is created, but the provisioner did not allocate the volume (wrong StorageClass? capacity exhausted? CSI driver? upstream cloud quota?). The pod waits and never starts — deployment status does not show why.
Recipe
#!/usr/bin/env bash
# /etc/cron.d/pvc-pending
# */5 * * * * root /opt/pvc-pending.sh
CONTEXT=${KUBE_CONTEXT:-prod}
MAX_MIN=${MAX_MIN:-10}
NOW=$(date -u +%s)
STUCK=$(kubectl --context "$CONTEXT" get pvc -A -o json | jq --argjson now "$NOW" --argjson max "$MAX_MIN" '
[.items[] | select(.status.phase == "Pending") |
{
ns: .metadata.namespace,
name: .metadata.name,
age_min: ($now - (.metadata.creationTimestamp | fromdateiso8601)) / 60
} |
select(.age_min > $max)]')
COUNT=$(echo "$STUCK" | jq 'length')
if [ "${COUNT:-0}" -gt 0 ]; then
EXAMPLES=$(echo "$STUCK" | jq -r '.[] | "\(.ns)/\(.name)"' | head -5 | tr '\n' ',')
curl -fsS "$HEARTBEAT_URL" --data "pvc_pending=$COUNT,examples=$EXAMPLES"
exit 2
fi
echo "OK (no PVC pending > ${MAX_MIN}m)"
Same thing in Enterno.io
Wrap in an Enterno heartbeat — learn that "pvc-X has been pending 15 min" and fix the StorageClass / quota before the deploy team wakes you up.
Related recipes
Readiness probes pass inside the pod, but no one sees that the LB refused to route traffic to the new deploy.
Backup cron silently failed; nobody noticed; the gap surfaces only at the next incident. Need an alert when the newest backup file is older than 30 hours.
A CrashLoopBackOff in one namespace — kubectl shows a restart count of 47, but nobody sees it. Want an endpoint that returns high when the counter jumps.