
AWS Lambda — alert when p99 cold-start regresses

A release increased the bundle size, and p99 cold-start latency rose from 800 ms to 3 s. The metric is in CloudWatch, but nobody is watching it. What you want is a heartbeat-style alert.

Recipe

bash
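#!/usr/bin/env bash
# Pulls the p99 of the Lambda Duration metric over the last 10 min for one function.
# Note: this queries Duration, not Init Duration, which appears in REPORT log lines.
# Requires GNU date (for -d) and a configured AWS CLI.
FN="${FN:-my-api}"
THRESHOLD_MS="${THRESHOLD_MS:-1500}"

# Datapoints are not returned in time order, so sort by Timestamp before taking the latest.
P99=$(aws cloudwatch get-metric-statistics \
  --namespace AWS/Lambda --metric-name Duration \
  --dimensions Name=FunctionName,Value="$FN" Name=Resource,Value="$FN" \
  --start-time "$(date -u -d '10 min ago' +%FT%TZ)" \
  --end-time   "$(date -u +%FT%TZ)" \
  --period 60 --extended-statistics p99 \
  --query 'sort_by(Datapoints, &Timestamp)[-1].ExtendedStatistics.p99' --output text 2>/dev/null)

# No datapoint (no invocations, or the CLI call failed): report and exit non-zero.
if [ -z "$P99" ] || [ "$P99" = "None" ]; then
  echo "no-data"
  exit 1
fi

P99_MS=$(printf "%.0f" "$P99")   # round to whole milliseconds
if [ "$P99_MS" -ge "$THRESHOLD_MS" ]; then
  echo "high $P99_MS"
else
  echo "ok $P99_MS"
fi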
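
The script above queries the `Duration` metric, which, as far as the Lambda documentation describes it, covers handler execution rather than initialization. Cold-start time itself shows up as an `Init Duration` field in the function's `REPORT` log lines, and only on invocations that were cold starts. The following is a minimal sketch, not part of the original recipe, of computing a nearest-rank p99 over those values. The function name `p99_init_ms` is hypothetical, and it assumes one `REPORT` line per input line on stdin.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads Lambda REPORT log lines on stdin and prints the
# p99 of their "Init Duration" values in whole ms, or "no-data" if none carry one.
p99_init_ms() {
  # Keep only the numeric Init Duration value from lines that have the field.
  sed -n 's/.*Init Duration: \([0-9.][0-9.]*\) ms.*/\1/p' |
    sort -n |
    awk '{ v[NR] = $1 }
         END {
           if (NR == 0) { print "no-data"; exit }
           # Nearest-rank percentile: rank = ceil(0.99 * N).
           rank = int(NR * 0.99)
           if (rank < NR * 0.99) rank++
           printf "%.0f\n", v[rank]
         }'
}
```

Feed it `REPORT` lines exported from the function's log group and compare the result against `THRESHOLD_MS` the same way the script does. With only a handful of cold starts in the window, the p99 is effectively the slowest one, so a longer window than 10 minutes may be needed for a stable figure.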

Same thing in Enterno.io

Drop the script next to the function. Then put an Enterno HTTP monitor and a PageSpeed monitor on the function's public endpoint, so you can correlate cold-start spikes with full-page latency.

