Skip to content

MongoDB replica set — alert on replication lag

A replica-set secondary falls behind the primary; the app will read stale data within a minute. Want an HTTP endpoint that says "ok" or "lag".

Recipe

bash
#!/usr/bin/env bash
# Wrap as HTTP via fastcgi or python -m http.server.
THRESHOLD_S="${LAG_THRESHOLD_S:-10}"
LAG=$(mongo --quiet --eval '
  var s = rs.printSecondaryReplicationInfo();
  var st = rs.status();
  var lag = 0;
  st.members.forEach(function (m) {
    if (m.stateStr === "SECONDARY" && m.optimeDate) {
      var d = (st.date - m.optimeDate) / 1000;
      if (d > lag) lag = d;
    }
  });
  print(Math.round(lag));
' 2>/dev/null)

[ -z "$LAG" ] && { echo "no-data"; exit 1; }
[ "$LAG" -ge "$THRESHOLD_S" ] && echo "lag $LAG" || echo "ok $LAG"

Same thing in Enterno.io

Endpoint + Enterno HTTP monitor with keyword-rule "does not contain ok" alerts the moment a secondary falls behind. Pioneer+ persists response-time which correlates with heavy writes.

Set up HTTP monitor → ← All recipes

Related recipes

long_query_time = 1, slow_query_log enabled. You need to know when the slow-query rate per minute suddenly jumps (a deploy broke an index, ORM went N+1).