Prometheus alerting: (1) Define alert rules in Prometheus rules.yaml (PromQL expressions), (2) Prometheus sends firing alerts → Alertmanager, (3) Alertmanager deduplicates + routes to receivers (PagerDuty/Slack/Email), (4) Inhibition rules suppress noisy children. 2026: move to burn-rate alerts instead of threshold-based. Integration with PagerDuty / Opsgenie for on-call rotation.
Below: step-by-step, working examples, common pitfalls, FAQ.
Quick start: run Alertmanager, then point Prometheus at it:

```shell
docker run -p 9093:9093 prom/alertmanager
```

```yaml
# prometheus.yml
alerting:
  alertmanagers:
    - static_configs:
        - targets: [alertmanager:9093]
```

Scenario configs:
**Alert rule (PromQL)**

```yaml
# rules.yaml
groups:
  - name: api
    rules:
      - alert: HighErrorRate
        # sum by (service) keeps the label used in the summary template below
        expr: |
          sum by (service) (rate(http_requests_total{code=~"5.."}[5m]))
          /
          sum by (service) (rate(http_requests_total[5m]))
          > 0.05
        for: 10m
        labels:
          severity: critical
        annotations:
          summary: 'Error rate > 5% on {{ $labels.service }}'
          runbook: https://wiki.internal/runbooks/high-errors
```
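The expression above is a plain ratio of 5xx rate to total rate. With hypothetical per-second rates (the numbers are illustrative, not from the original), the firing condition reduces to:

```python
def error_ratio(rate_5xx: float, rate_total: float) -> float:
    """Fraction of requests returning 5xx (mirrors the PromQL ratio)."""
    return rate_5xx / rate_total

# Hypothetical rates averaged over a 5m window: 12 of 150 req/s are 5xx.
fires = error_ratio(12.0, 150.0) > 0.05  # 8% error rate, above the 5% threshold
print(fires)  # True
```

The rule only fires once this condition has held continuously for the `for: 10m` window.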
**Alertmanager config**

```yaml
# alertmanager.yml
route:
  receiver: slack-default
  routes:
    - match: { severity: critical }
      receiver: pagerduty
    - match: { team: payments }
      receiver: slack-payments
receivers:
  - name: pagerduty
    pagerduty_configs:
      - routing_key: ${PD_KEY}
  - name: slack-default
    slack_configs:
      - api_url: ${SLACK_URL}
        channel: '#alerts'
  - name: slack-payments   # every receiver named in a route must be defined
    slack_configs:
      - api_url: ${SLACK_URL}
        channel: '#payments'
```
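Deduplication (step 3 of the pipeline) is tuned with grouping timers on the route. The values below match Alertmanager's documented defaults; adjust per team:

```yaml
# alertmanager.yml — grouping controls on the route
route:
  group_by: [alertname, cluster]  # alerts sharing these labels collapse into one notification
  group_wait: 30s        # wait for more alerts before sending the first notification
  group_interval: 5m     # wait before notifying about new alerts added to an existing group
  repeat_interval: 4h    # re-notify for alerts that are still firing
```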
**Burn-rate alert (SRE style)**

```yaml
- alert: SLOBurnRateFast
  # Fast burn: error rate above 14.4x the 0.1% budget of a 99.9% SLO
  expr: (1 - availability_sli) > (14.4 * 0.001)
  for: 2m
- alert: SLOBurnRateSlow
  # Slow burn: 3x the budget, sustained over hours
  expr: (1 - availability_sli) > (3 * 0.001)
  for: 1h
```
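The 14.4 multiplier is budget arithmetic, not magic: spending 2% of a 30-day error budget in one hour means burning at 0.02 × 720h / 1h = 14.4× the sustainable rate. A sketch of the math (function names are mine, not from any library):

```python
def burn_rate(budget_fraction: float, window_hours: float,
              slo_period_hours: float = 30 * 24) -> float:
    """Burn-rate multiplier that spends `budget_fraction` of the error
    budget within `window_hours` of a `slo_period_hours` SLO period."""
    return budget_fraction * slo_period_hours / window_hours

def threshold(slo: float, rate: float) -> float:
    """Error-rate threshold to alert on: burn rate times the error budget."""
    return rate * (1 - slo)

fast = burn_rate(0.02, 1)   # 2% of a 30-day budget in 1h
print(round(fast, 1))                     # 14.4
print(round(threshold(0.999, fast), 4))   # 0.0144, i.e. alert above 1.44% errors
# The slow rule's 3x corresponds to spending 2.5% of the budget in 6h:
print(round(burn_rate(0.025, 6), 1))      # 3.0
```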
**Inhibition**

```yaml
# If the cluster is down, suppress per-pod alerts
inhibit_rules:
  - source_match:
      alertname: ClusterDown
    target_match:
      alertname: PodCrashLooping
    equal: [cluster]
```
**Silence during deploy**

```shell
$ amtool silence add \
    --alertmanager.url http://localhost:9093 \
    --duration=30m \
    --comment='Deploy v2.3' \
    service=api
```
Pitfall: a `for:` duration that is too short causes flapping alerts; 5-10 minutes absorbs transient spikes.

On-call integration: PagerDuty is the market leader with polished UX at $21+/user/month. Opsgenie (Atlassian) is cheaper, with tight Jira integration. For small teams, PagerDuty's free tier covers 5 users.
Clustered mode: 3+ instances gossip silences and notification state. Without HA, a single Alertmanager going down means missed alerts. Run 3 replicas.
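A minimal clustered invocation as a sketch (the `am-0`..`am-2` hostnames are placeholders); configure Prometheus to send to all replicas, since deduplication happens between the peers:

```shell
alertmanager \
  --cluster.listen-address=0.0.0.0:9094 \
  --cluster.peer=am-0:9094 \
  --cluster.peer=am-1:9094 \
  --cluster.peer=am-2:9094
```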
Grafana 9+ ships built-in alerting (Unified Alerting), which is simpler for Grafana Cloud users. Prometheus + Alertmanager remains the standard for self-hosting.
<a href="/en/monitors">Enterno uptime monitoring</a> sends alerts to PagerDuty, Slack, and Telegram. For OpenTelemetry-based alerting, Grafana Alerting is the better fit.