Fastly — alert when a purge takes longer than N s
Fastly soft-purge is typically sub-second, but sometimes hangs 30+ s (overload, key collision). After a release, new assets do not appear, users see the old version.
Recipe
#!/usr/bin/env bash
# /etc/cron.d/fastly-purge
# */5 * * * * root /opt/fastly-purge.sh
API_KEY=${FASTLY_API_KEY}
SERVICE=${FASTLY_SERVICE_ID}
THRESH_S=${THRESH_S:-3}
# Send a no-op purge with a unique surrogate-key, time the round trip
KEY="hb-test-$(date +%s)"
START=$(date +%s%N)
curl -fsS -X POST -H "Fastly-Key: $API_KEY" \
"https://api.fastly.com/service/$SERVICE/purge/$KEY" >/dev/null
END=$(date +%s%N)
MS=$(( (END - START) / 1000000 ))
S=$(( MS / 1000 ))
if [ "$S" -gt "$THRESH_S" ]; then
curl -fsS "$HEARTBEAT_URL" --data "purge_ms=$MS,threshold_s=$THRESH_S"
exit 2
fi
echo "OK (purge=${MS}ms)"
Same thing in Enterno.io
Wrap in an Enterno heartbeat — see the latency trend and page on-call before the deploy team notices "asset not refreshing".
Related recipes
The CDN cache_status header (cf-cache-status or x-cache) suddenly returns MISS on more than 30% of requests — origin load + bandwidth bills both spike.
A CloudFront distribution started serving 5xx 4 % of the time — far-region clients see broken pages. CloudWatch graph exists; dashboard goes unwatched.
Ensure your site returns 2xx every minute, alert to Slack/Telegram on failure.