logrotate — alert when a log file grows past rotation
When logrotate stops running (a config syntax error in the last edit, or a disabled systemd timer), the main log file keeps growing unrotated. Nobody notices until the disk fills.
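Before adding monitoring, it can help to confirm which of the two failure modes applies. The following is a minimal sketch, not a definitive check: `check_logrotate_health` is a hypothetical helper, and it assumes the config lives at /etc/logrotate.conf, that logrotate exits non-zero when it cannot parse the config, and that rotation is driven by a systemd unit named logrotate.timer. Hosts that still run logrotate from cron.daily will report the timer as not enabled.

```shell
# Hypothetical helper: reports which of the two failure modes applies.
# Assumptions: stock config path, systemd unit named logrotate.timer.
check_logrotate_health() {
    conf=${1:-/etc/logrotate.conf}
    status=0
    # -d is logrotate's debug mode: the config is parsed but nothing is rotated.
    if ! logrotate -d "$conf" >/dev/null 2>&1; then
        echo "config: logrotate could not parse $conf"
        status=1
    fi
    # is-enabled exits 0 when the unit is enabled.
    if ! systemctl is-enabled --quiet logrotate.timer 2>/dev/null; then
        echo "timer: logrotate.timer is not enabled"
        status=1
    fi
    [ "$status" -eq 0 ] && echo "logrotate config and timer look healthy"
    return "$status"
}

# Run as root so logrotate can read every included config file.
check_logrotate_health || true
```

A non-zero return with a "config:" line points at the last edit; a "timer:" line points at the scheduler.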
Recipe
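#!/usr/bin/env bash
# Save as /opt/logrotate-stuck.sh; schedule hourly via /etc/cron.d/logrotate-stuck:
# 0 * * * * root /opt/logrotate-stuck.sh
WATCH=${WATCH:-/var/log}
SIZE_GB=${SIZE_GB:-2}    # alert on any single file larger than 2 GB
AGE_DAYS=${AGE_DAYS:-2}  # ...with no rotated sibling newer than N days
: "${HEARTBEAT_URL:?set HEARTBEAT_URL to the heartbeat endpoint}"
# Candidates: uncompressed files over the size threshold that were modified
# within the last day, i.e. still being appended to.
CANDIDATES=$(find "$WATCH" -type f -size "+${SIZE_GB}G" -mtime -1 \
    -not -name '*.gz' -not -name '*.zst' 2>/dev/null)
# Keep candidates that have no rotated sibling (app.log.1, app.log-20240101.gz)
# newer than AGE_DAYS. The sibling naming is an assumption; match it to your
# logrotate config.
HOT=""
while IFS= read -r f; do
    [ -n "$f" ] || continue
    dir=$(dirname "$f"); base=$(basename "$f")
    recent=$(find "$dir" -maxdepth 1 -type f \( -name "$base.*" -o -name "$base-*" \) \
        -mtime "-$AGE_DAYS" 2>/dev/null | head -n 1)
    [ -n "$recent" ] || HOT+="$f"$'\n'
done <<< "$CANDIDATES"
if [ -n "$HOT" ]; then
    COUNT=$(printf '%s' "$HOT" | wc -l)
    # First three offenders, comma-separated, without a trailing comma.
    EXAMPLES=$(printf '%s' "$HOT" | head -n 3 | paste -sd, -)
    curl -fsS "$HEARTBEAT_URL" --data "huge_logs=$COUNT,examples=$EXAMPLES"
    exit 2
fi
echo "OK (no oversized active logs)"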
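To see what the recipe's size and mtime filter selects without waiting for a real 2 GB log, the same predicates can be tried against a scratch directory. This is a sketch under stated assumptions: `hot_logs` is a hypothetical helper, the file names are made up, and the threshold is scaled down to megabytes so the test files stay small.

```shell
# Hypothetical helper: the recipe's find predicates, with the size passed in.
hot_logs() {
    # $1 = directory to scan, $2 = find size expression such as +2G or +1M
    find "$1" -type f -size "$2" -mtime -1 \
        ! -name '*.gz' ! -name '*.zst' 2>/dev/null
}

scratch=$(mktemp -d)
# 3 MiB of zeros per file; dd avoids a dependency on truncate(1).
for name in app.log app.log.1.gz stale.log; do
    dd if=/dev/zero of="$scratch/$name" bs=1024 count=3072 2>/dev/null
done
printf 'tiny\n' > "$scratch/small.log"
touch -t 200001010000 "$scratch/stale.log"   # push mtime far into the past

# Should list only app.log: the .gz is excluded by name, stale.log by mtime,
# and small.log by size.
hot_logs "$scratch" +1M
rm -rf "$scratch"
```

Passing the size expression as an argument keeps the production threshold (`+2G`) and the dry-run threshold on the same code path.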
Same thing in Enterno.io
Set up an Enterno heartbeat on an hourly schedule. It catches the "disk not full yet, but logs are swelling" case earlier than a plain disk monitor does.
Related recipes
Logs or backup files eat /var; in 24 hours the server falls over. A basic df check every 10 minutes saves a 2 AM incident.
Filebeat / Logstash silently died on one edge node. Elasticsearch ingest rate fell 40 %, but no one watches dashboards. Sentry without logs leaves you blind.
OpenSearch passes 85 % disk usage (low watermark) and stops allocating new shards to the node; at 95 % (flood stage) indices go read-only and the write API breaks. You want to catch this well before flood stage.
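The basic df check mentioned in the first related recipe can be sketched in a few lines. This is an illustration, not that recipe's actual script: `disk_used_pct` is a hypothetical helper and the 85 % threshold is an assumed default.

```shell
# Hypothetical helper: prints the used-space percentage of the filesystem holding $1.
disk_used_pct() {
    # df -P forces one line per filesystem; column 5 is the capacity, e.g. "42%".
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

LIMIT_PCT=${LIMIT_PCT:-85}   # assumed threshold, tune per host
used=$(disk_used_pct /var)
if [ "$used" -ge "$LIMIT_PCT" ]; then
    echo "ALERT: /var is ${used}% full"
else
    echo "OK: /var is ${used}% full"
fi
```

Run from cron every 10 minutes, this covers the case where the disk fills for reasons the per-file log check does not see, such as backups.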