Kafka consumer lag — alert when behind by more than N

When a consumer group falls behind its producers, messages pile up in the topic. This recipe sets a lag threshold and raises an alert once the group is behind by more than N messages.

Recipe

bash
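#!/usr/bin/env bash
# /opt/kafka-lag-watch.sh: prints the worst per-partition lag of a group as plain text
# Wrap as an HTTP endpoint via nginx fastcgi or python -m http.server.
GROUP="${1:-payments-consumer}"
BROKER="${2:-localhost:9092}"
THRESHOLD="${LAG_THRESHOLD:-10000}"

# Find the LAG column by header name (its position differs between Kafka
# versions), then keep the largest numeric value across partitions.
LAG=$(/opt/kafka/bin/kafka-consumer-groups.sh \
        --bootstrap-server "$BROKER" --describe --group "$GROUP" 2>/dev/null \
      | awk '!col { for (i = 1; i <= NF; i++) if ($i == "LAG") col = i; next }
             $col ~ /^[0-9]+$/ { if (!seen || $col + 0 > max) max = $col + 0; seen = 1 }
             END { if (seen) print max }')

# No numeric lag found (broker unreachable, unknown group): report it and fail.
[ -z "$LAG" ] && { echo "no-data"; exit 1; }
if [ "$LAG" -ge "$THRESHOLD" ]; then echo "lag $LAG"; else echo "ok $LAG"; fi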
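
The lag extraction can be sanity-checked without a live broker by feeding a parser some canned --describe output. The sketch below is one way to do that. The function name max_lag and the sample rows are made up for illustration, and the column layout (GROUP first, LAG sixth) is an assumption based on recent Kafka releases; older releases print no GROUP column, which is why the parser looks the column up by its header instead of by position.

```shell
#!/usr/bin/env bash
# max_lag (hypothetical helper): reads `kafka-consumer-groups.sh --describe`
# output on stdin and prints the largest numeric LAG value, if any.
max_lag() {
  awk '!col { for (i = 1; i <= NF; i++) if ($i == "LAG") col = i; next }
       $col ~ /^[0-9]+$/ { if (!seen || $col + 0 > max) max = $col + 0; seen = 1 }
       END { if (seen) print max }'
}

# Made-up sample rows, for illustration only (assumed column layout).
sample='
GROUP             TOPIC    PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG    CONSUMER-ID  HOST       CLIENT-ID
payments-consumer payments 0          1500            1620            120    c-1          /10.0.0.5  client-1
payments-consumer payments 1          900             13400           12500  c-2          /10.0.0.6  client-2
payments-consumer payments 2          -               310             -      -            -          -'

printf '%s\n' "$sample" | max_lag   # prints 12500
```

Rows whose lag is "-" (partitions with no committed offset) are skipped, and an input with no numeric lag at all prints nothing, which the recipe treats as "no-data".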

Same thing in Enterno.io

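Expose the script's output at /kafka-lag, so the endpoint returns "ok N" or "lag N", and point an Enterno HTTP monitor at it with the keyword rule "does not contain ok". The monitor then alerts whenever the response is "lag N" or "no-data", with no Alertmanager needed.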
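
One possible way to expose the endpoint is a small CGI wrapper around the script, served by any CGI-capable web server. This is only a sketch: the cgi_response function, the LAG_SCRIPT override variable, and the idea of installing the wrapper as a CGI program mapped to /kafka-lag are assumptions, not part of the recipe.

```shell
#!/usr/bin/env bash
# Hypothetical CGI wrapper for /kafka-lag.
# LAG_SCRIPT is an assumed override; it defaults to the recipe's script path.
LAG_SCRIPT="${LAG_SCRIPT:-/opt/kafka-lag-watch.sh}"

cgi_response() {
  local body
  # If the script fails without printing anything, fall back to "no-data"
  # so the keyword rule still sees a body that does not contain "ok".
  body=$("$LAG_SCRIPT" 2>/dev/null) || body="${body:-no-data}"
  # CGI response: one header, a blank line, then the plain-text body.
  printf 'Content-Type: text/plain\r\n\r\n%s\n' "$body"
}

cgi_response
```

The wrapper always answers with a normal response and leaves the decision to the body text, which matches a keyword-based monitor: "lag N" and "no-data" both lack "ok" and therefore trigger the alert.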
