Java GC — alert on pause-time spikes
A Spring app is sluggish: long GC pauses (>500 ms) occur every few minutes. The heap size is fine, but the new-generation ratio is misconfigured. The goal is an endpoint that reports the GC pause time and flags it when it exceeds a threshold.
Recipe
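#!/usr/bin/env bash
# Spring Boot 2.4+: Actuator publishes JVM metrics, but the "metrics" endpoint
# must be listed in management.endpoints.web.exposure.include to be reachable.
# jvm.gc.pause reports COUNT, TOTAL_TIME and MAX in seconds, not percentiles,
# so MAX (the worst recent pause) stands in for p99 here.
# Prints "high <ms>" at or over the threshold, "ok <ms>" otherwise.
ACTUATOR="${ACTUATOR:-http://localhost:8080/actuator/metrics/jvm.gc.pause}"
THRESHOLD_MS="${GC_THRESHOLD_MS:-500}"
MAX_MS=$(curl -s "$ACTUATOR" | python3 -c '
import json, sys
try:
    data = json.load(sys.stdin)
    val = next((m["value"] for m in data["measurements"] if m["statistic"] == "MAX"), 0)
    print(int(round(val * 1000)))  # seconds -> milliseconds
except Exception:
    pass  # print nothing so the shell check below reports no-data
')
[ -z "$MAX_MS" ] && { echo "no-data"; exit 1; }
if [ "$MAX_MS" -ge "$THRESHOLD_MS" ]; then echo "high $MAX_MS"; else echo "ok $MAX_MS"; fi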
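To make the parsing step easier to follow, here is a minimal Python sketch of the same check run against a sample payload. The payload values are made up for illustration, and the helper names `classify` and `mean_pause_ms` are hypothetical. It assumes the metric arrives as a `measurements` list of `statistic`/`value` pairs with values in seconds, which is the shape the script above reads.

```python
import json

# Illustrative payload in the shape the script expects from
# /actuator/metrics/jvm.gc.pause; the numbers are invented for this example.
SAMPLE = """{
  "name": "jvm.gc.pause",
  "baseUnit": "seconds",
  "measurements": [
    {"statistic": "COUNT", "value": 12.0},
    {"statistic": "TOTAL_TIME", "value": 1.8},
    {"statistic": "MAX", "value": 0.62}
  ]
}"""


def _stats(payload: str) -> dict:
    """Map statistic name -> value; raises on malformed input."""
    data = json.loads(payload)
    return {m["statistic"]: m["value"] for m in data["measurements"]}


def classify(payload: str, threshold_ms: int = 500) -> str:
    """Return 'high <ms>', 'ok <ms>', or 'no-data' based on the MAX statistic."""
    try:
        max_ms = int(round(_stats(payload)["MAX"] * 1000))  # seconds -> ms
    except (ValueError, KeyError, TypeError):
        return "no-data"
    return f"high {max_ms}" if max_ms >= threshold_ms else f"ok {max_ms}"


def mean_pause_ms(payload: str) -> float:
    """Average pause in ms, derived from TOTAL_TIME / COUNT (0.0 if no pauses)."""
    stats = _stats(payload)
    count = stats.get("COUNT", 0)
    return stats["TOTAL_TIME"] / count * 1000 if count else 0.0


print(classify(SAMPLE))                 # worst recent pause vs. threshold
print(round(mean_pause_ms(SAMPLE), 1))  # average pause for context
```

Comparing the mean with MAX gives a rough sense of whether one outlier pause or a generally slow collector is behind a "high" result.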
Same thing in Enterno.io
Expose the script's output at an HTTP endpoint and add an Enterno HTTP monitor that checks for the "ok" keyword. Correlate it with a PageSpeed monitor on the same page to tell GC stalls from upstream-API stalls.
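One way to expose the check over HTTP is a small standard-library wrapper, sketched below. It is an assumption about how the endpoint could be served, not part of Enterno or Spring. The port 9105 and the names `status_line`, `make_handler` and `serve` are hypothetical, and the Actuator URL and 500 ms threshold are the defaults from the recipe.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

ACTUATOR_URL = "http://localhost:8080/actuator/metrics/jvm.gc.pause"  # recipe default
THRESHOLD_MS = 500  # recipe default


def fetch_actuator() -> str:
    """Fetch the raw metric JSON; raises on network or HTTP errors."""
    with urllib.request.urlopen(ACTUATOR_URL, timeout=3) as resp:
        return resp.read().decode("utf-8")


def status_line(fetch=fetch_actuator, threshold_ms=THRESHOLD_MS) -> str:
    """Build the one-line body that a keyword monitor can search."""
    try:
        data = json.loads(fetch())
        max_s = next(m["value"] for m in data["measurements"]
                     if m["statistic"] == "MAX")
    except Exception:
        return "no-data"  # contains no "ok", so the keyword check fails
    max_ms = int(round(max_s * 1000))  # seconds -> milliseconds
    return f"high {max_ms}" if max_ms >= threshold_ms else f"ok {max_ms}"


def make_handler(fetch=fetch_actuator):
    """Create a request handler bound to a fetch function (swappable in tests)."""
    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            body = status_line(fetch).encode("utf-8")
            # Always answer 200; the monitor decides by keyword, not status code.
            self.send_response(200)
            self.send_header("Content-Type", "text/plain; charset=utf-8")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, *args):
            pass  # keep the console quiet

    return Handler


def serve(port: int = 9105) -> None:
    """Serve the status line until interrupted (port is an arbitrary choice)."""
    HTTPServer(("0.0.0.0", port), make_handler()).serve_forever()
```

Calling `serve()` starts the listener. The monitor URL would then point at that port, and a response of "high ..." or "no-data" fails the "ok" keyword check.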