Alert on an nginx 5xx-rate spike

Anatoly Oshmanovsky

The server starts returning 503/504, but a plain uptime check misses it: the homepage still returns 200 while the API path is on fire.

Stack: nginx · access.log · awk
Tags: nginx, errors, awk

Recipe

bash

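#!/usr/bin/env bash
# /etc/cron.d/nginx-5xx-rate
# */1 * * * * root /opt/nginx-5xx.sh /var/log/nginx/access.log
#
# Assumes the default "combined" log_format ($4 = "[time_local", $9 = status),
# GNU date, and SLACK_WEBHOOK set in the cron environment.

LOG=${1:-/var/log/nginx/access.log}
WINDOW=60        # last N seconds
THRESH=10        # 5xx allowed per window
# Sortable cutoff (YYYYmmddHHMMSS): comparing dd/Mon/yyyy strings breaks across midnight.
SINCE=$(date -d "-${WINDOW} seconds" '+%Y%m%d%H%M%S')

COUNT=$(awk -v since="$SINCE" '
  BEGIN {
    split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", m, " ")
    for (i = 1; i <= 12; i++) mon[m[i]] = sprintf("%02d", i)
  }
  {
    # "[10/Oct/2024:13:55:36" -> "20241010135536"
    split(substr($4, 2), t, "[/:]")
    key = t[3] mon[t[2]] t[1] t[4] t[5] t[6]
    if (key >= since && $9 ~ /^5[0-9][0-9]$/) c++
  }
  END { print c+0 }
' "$LOG")

if [ "$COUNT" -gt "$THRESH" ]; then
  curl -fsS -X POST -H 'Content-Type: application/json' \
    --data "{\"text\":\"nginx 5xx spike: $COUNT in last ${WINDOW}s\"}" \
    "${SLACK_WEBHOOK:?SLACK_WEBHOOK is not set}"
fi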
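
Before wiring this into cron, it can help to dry-run the counting step against a synthetic log that spans midnight. A plain string comparison on nginx's dd/Mon/yyyy timestamps sorts `[01/Feb/...` before `[31/Jan/...`, so entries just after a day or month boundary can be missed. The sketch below rebuilds each timestamp as a sortable YYYYmmddHHMMSS key. The function name `count_5xx` and the sample log lines are illustrative assumptions, and the field positions assume the default "combined" log format.

```shell
#!/usr/bin/env bash
# Sketch: count 5xx responses at or after a cutoff in a combined-format log.
# count_5xx and the sample data are hypothetical, for a dry run only.

count_5xx() {
  # $1 = cutoff as YYYYmmddHHMMSS, $2 = log file
  awk -v since="$1" '
    BEGIN {
      split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", m, " ")
      for (i = 1; i <= 12; i++) mon[m[i]] = sprintf("%02d", i)
    }
    {
      # "[31/Jan/2024:23:59:58" -> "20240131235958"
      split(substr($4, 2), t, "[/:]")
      key = t[3] mon[t[2]] t[1] t[4] t[5] t[6]
      if (key >= since && $9 ~ /^5[0-9][0-9]$/) c++
    }
    END { print c + 0 }
  ' "$2"
}

# Synthetic log that crosses midnight and a month boundary.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
cat > "$sample" <<'EOF'
203.0.113.7 - - [31/Jan/2024:23:59:30 +0000] "GET /api/a HTTP/1.1" 502 0 "-" "curl"
203.0.113.7 - - [31/Jan/2024:23:59:58 +0000] "GET /api/a HTTP/1.1" 503 0 "-" "curl"
203.0.113.7 - - [01/Feb/2024:00:00:05 +0000] "GET / HTTP/1.1" 200 612 "-" "curl"
203.0.113.7 - - [01/Feb/2024:00:00:10 +0000] "GET /api/a HTTP/1.1" 504 0 "-" "curl"
EOF

count_5xx 20240131235940 "$sample"   # prints 2
```

The 502 at 23:59:30 falls before the cutoff and the 200 is not an error, so only the 503 and the post-midnight 504 are counted.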

Same thing in Enterno.io

Replace the bash cron's Slack post with an Enterno heartbeat ping on threshold breach. You get a retained history of spikes and one dashboard alongside your other monitors, instead of scattered Slack pings.
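
A minimal sketch of what the ping-on-breach step might look like. `HEARTBEAT_URL` is a hypothetical placeholder for whatever ping URL your heartbeat monitor provides; the real URL format and any required parameters are not given in this recipe, so treat this as an assumption rather than Enterno's API. `ping_on_breach` is likewise an illustrative name.

```shell
#!/usr/bin/env bash
# Sketch only: HEARTBEAT_URL is a hypothetical placeholder, not a documented endpoint.

ping_on_breach() {
  # $1 = observed 5xx count, $2 = threshold
  local count=$1 thresh=$2
  if [ "$count" -gt "$thresh" ]; then
    # -f: fail on HTTP errors; -sS: quiet but still print errors;
    # --max-time: keep a hung request from stalling the cron job.
    curl -fsS --max-time 10 "${HEARTBEAT_URL:?set HEARTBEAT_URL}" >/dev/null
  fi
}
```

In the cron script this would be called as `ping_on_breach "$COUNT" "$THRESH"` in place of the Slack `curl`.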


Related recipes

Cloudflare Workers — alert when 5xx error rate spikes

A Worker auto-deploys from main. The prod 5xx rate once jumped to 12%, but dashboards only get checked once a day, so a per-minute probe is needed.

nginx — alert when a rate-limit zone is saturating

An attacker is hammering a `limit_req_zone` — legit traffic now eats 429s too. The access log shows it but nobody is watching.

Sentry — alert on project error-rate spike

After a release Sentry only fires on big incidents. I want to catch a slow error-rate climb over a 15-minute window before it becomes a real incident.
