Jenkins — alert when the build queue is stuck
The Jenkins queue grows — an agent went away, label mismatch, or executors are saturated. PR checks hang, devs start chat-pinging "what is up with CI?".
Recipe
#!/usr/bin/env bash
# /etc/cron.d/jenkins-queue
# */5 * * * * root /opt/jenkins-queue.sh
JENKINS=${JENKINS_URL:-http://jenkins:8080}
USER=${JENKINS_USER}
TOKEN=${JENKINS_TOKEN}
THRESH=${THRESH:-5} # queued jobs > N
MAX_AGE_MIN=${MAX_AGE_MIN:-15} # any job stuck > N min
QUEUE=$(curl -fsS -u "$USER:$TOKEN" "$JENKINS/queue/api/json" \
| jq '.items')
COUNT=$(echo "$QUEUE" | jq 'length')
NOW=$(date -u +%s)
OLD=$(echo "$QUEUE" | jq --argjson now "$NOW" --argjson max "$MAX_AGE_MIN" '
[.[] | select((($now * 1000 - .inQueueSince) / 60000) > $max)] | length')
if [ "$COUNT" -gt "$THRESH" ] || [ "$OLD" -gt 0 ]; then
curl -fsS "$HEARTBEAT_URL" --data "queued=$COUNT,old=$OLD,window=${MAX_AGE_MIN}m"
exit 2
fi
echo "OK (queued=$COUNT, none > ${MAX_AGE_MIN}m)"
Same thing in Enterno.io
Set up an Enterno heartbeat on 5 min — learn about a stuck agent before the team merge window stalls.
Related recipes
Consumer can not keep up; queue grows; disk eventually fills. Need an alert on messages-ready count.
A `schedule:`-driven workflow sometimes silently stops (forked repos, expired tokens, GH outages). You only realise a week later when backups are missing.
A GitLab pipeline sometimes hangs in running for an hour+ (runner lost connectivity, or a job sits in curl with no timeout) — you only notice when a merge request 'isn't moving'.