Observability — 3 Pillars · Definition & Examples

Anatoly Oshmanovsky

What is Observability

By Anatoly Oshmanovsky · Updated May 23, 2026

Key idea:

Observability is the ability to understand a system's internal state from its external outputs. It rests on three pillars: **metrics** (numbers over time, such as CPU usage or queries per second), **logs** (timestamped event records, such as errors or an audit trail), and **traces** (the path of a single request across distributed services). It differs from monitoring: monitoring watches for known unknowns, meaning failure modes you anticipated (for example, high CPU), while observability lets you explore unknown unknowns, such as a new type of bug nobody predicted.

Below: details, example, related terms, FAQ.

Details

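Metrics: Prometheus, Datadog, New Relic (often visualised in Grafana). Aggregated numbers, cheap to store and query
Logs: Loki, ELK stack, CloudWatch. Detailed event records; storage and indexing get expensive at scale
Traces: Jaeger, Zipkin, Tempo. Detailed per-request flow across services
Correlation: a shared trace_id links all three signals; OpenTelemetry standardises how it is propagated and attached
Cardinality explosion: in Prometheus every unique combination of label values is a separate time series, so high-cardinality labels (such as user_id) can exhaust memory and slow queries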
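
The cardinality point above comes down to multiplication. The sketch below is a rough worst-case estimate that assumes every combination of label values actually occurs; the label names and counts are made-up illustrations, not measurements.

```javascript
// Worst-case number of time series for one metric: the product of the
// number of distinct values each label can take.
// All label names and counts below are hypothetical.
function estimateSeries(labelCardinalities) {
  return Object.values(labelCardinalities).reduce((total, n) => total * n, 1);
}

const bounded = { method: 5, status: 6, route: 40 };
const withUserId = { ...bounded, user_id: 100000 };

console.log(estimateSeries(bounded));     // 1200
console.log(estimateSeries(withUserId));  // 120000000
```

Adding a single unbounded label multiplies the series count by the number of users, which is why identifiers like this usually belong in logs or trace attributes rather than metric labels.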

Example

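// OpenTelemetry-instrumented code; assumes an SDK and exporter are configured at startup
const { trace, SpanStatusCode } = require('@opentelemetry/api');
const tracer = trace.getTracer('my-app');
// Wrapper name is illustrative; `db` is the application's database client
async function runQuery(db) {
  const span = tracer.startSpan('db-query');
  try {
    return await db.query('SELECT ...');
  } catch (err) {
    span.recordException(err);  // attach the error to the span
    span.setStatus({ code: SpanStatusCode.ERROR });
    throw err;
  } finally {
    span.end();  // finished span goes to the configured exporter (e.g. Jaeger/Tempo)
  }
}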
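
The correlation point from Details can be sketched without any tracing library: if every log line carries the trace ID of the request that produced it, one ID pulls together everything that request did. This is a minimal sketch. `newTraceId`, `logEvent`, and the in-memory `logs` array are hypothetical stand-ins for what an OpenTelemetry SDK and a log backend would provide; the 32-hex-character ID follows the trace ID size defined by W3C Trace Context.

```javascript
const { randomBytes } = require('node:crypto');

// Hypothetical in-memory log sink standing in for a backend such as Loki or ELK.
const logs = [];

// 16 random bytes rendered as 32 hex characters.
function newTraceId() {
  return randomBytes(16).toString('hex');
}

// Emit a structured log record tagged with the request's trace ID.
function logEvent(traceId, level, message) {
  logs.push({ ts: new Date().toISOString(), trace_id: traceId, level, message });
}

// Two interleaved requests, each with its own trace ID.
const requestA = newTraceId();
const requestB = newTraceId();
logEvent(requestA, 'info', 'checkout started');
logEvent(requestB, 'info', 'search started');
logEvent(requestA, 'error', 'payment gateway timeout');

// Filtering by trace ID recovers one request's story from the shared stream.
const storyA = logs.filter((entry) => entry.trace_id === requestA);
console.log(storyA.map((entry) => entry.message));
// [ 'checkout started', 'payment gateway timeout' ]
```

In a real setup the ID would come from the active span's context rather than being generated by hand, so the same value appears in the trace backend and in the logs.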

Related Terms

The Importance of Metrics in Observability

Understanding Logs for Enhanced Observability

The Role of Traces in Distributed Systems


Frequently Asked Questions

Observability vs Monitoring?

Monitoring alerts on predetermined conditions. Observability supports ad-hoc investigation of questions you did not plan for. The two overlap heavily, but observability goes deeper.
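
One way to picture the difference: a monitor evaluates a condition someone wrote in advance, while an investigation slices raw events by whatever dimension turns out to matter. The sketch below uses made-up request events and a hypothetical 10% error-rate threshold; the field names are illustrative only.

```javascript
// Hypothetical request events; in practice these would come from logs or traces.
const events = [
  { route: '/checkout', region: 'eu', appVersion: '2.3.1', ok: false },
  { route: '/checkout', region: 'us', appVersion: '2.3.0', ok: true },
  { route: '/search',   region: 'eu', appVersion: '2.3.1', ok: false },
  { route: '/search',   region: 'us', appVersion: '2.3.0', ok: true },
  { route: '/checkout', region: 'eu', appVersion: '2.3.0', ok: true },
  { route: '/search',   region: 'us', appVersion: '2.3.1', ok: false },
];

// Monitoring: a predetermined condition that answers "is something wrong?"
function errorRateAlert(evts, threshold) {
  const failures = evts.filter((e) => !e.ok).length;
  return failures / evts.length > threshold;
}

// Observability: group failures by any field, chosen during the investigation.
function errorRateBy(evts, field) {
  const groups = {};
  for (const e of evts) {
    const key = e[field];
    groups[key] = groups[key] || { total: 0, failed: 0 };
    groups[key].total += 1;
    if (!e.ok) groups[key].failed += 1;
  }
  return Object.fromEntries(
    Object.entries(groups).map(([key, g]) => [key, g.failed / g.total])
  );
}

console.log(errorRateAlert(events, 0.1));        // true: the alert fires
console.log(errorRateBy(events, 'route'));       // failures spread across both routes
console.log(errorRateBy(events, 'appVersion'));  // { '2.3.1': 1, '2.3.0': 0 }
```

The alert only says that errors are elevated. Grouping by route is inconclusive, while grouping by `appVersion` shows that every failure in this sample comes from one release, a question nobody had to anticipate when the alert was written.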

Do I need all 3 pillars?

At minimum, metrics and logs. Add traces once you run microservices or another distributed architecture. For a monolith, start with the first two.

Stack suggestions?

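Small team: Datadog (SaaS, all-in-one) or Grafana Cloud (often cheaper). Self-host: Prometheus + Loki + Tempo + Grafana. This is close to Grafana's LGTM stack (Loki, Grafana, Tempo, Mimir), in which Mimir provides long-term storage for Prometheus metrics.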
