Docker Container Monitoring: Metrics, Tools, and Best Practices
Why Container Monitoring Is Different
Docker containers are ephemeral. They start, stop, scale up, and scale down automatically. A container running now may not exist in five minutes. Traditional server monitoring — where you track long-lived hosts with static IPs — breaks in a containerized environment. You need monitoring that adapts to dynamic infrastructure.
Container monitoring must handle: short-lived instances, high cardinality (hundreds or thousands of containers), shared host resources, container orchestration events, and the layered architecture of containers running inside hosts running inside clusters.
Key Metrics to Monitor
CPU
- CPU usage — percentage of allocated CPU consumed. In Docker, this is relative to the container's CPU limit, not the host total
- CPU throttling — when a container hits its CPU limit, the kernel throttles it. High throttling means the limit is too low or the application needs optimization
- CPU shares — relative weight when competing with other containers for CPU time
# Check container CPU usage
docker stats --no-stream --format \
  "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}"
# Output:
# NAME       CPU %    MEM USAGE / LIMIT
# web-app    15.23%   256MiB / 512MiB
# redis      2.41%    64MiB / 128MiB
# postgres   8.76%    512MiB / 1GiB
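To act on the throttling metric described above, you can read the raw counters from the container's cgroup. A minimal sketch, assuming a cgroup v1 host (on cgroup v2 the same counters live in cpu.stat under the container's scope directory); the path and container ID are illustrative:

```shell
# Compute what fraction of CFS scheduling periods were throttled,
# from cpu.stat-formatted text on stdin. A persistently high ratio
# means the CPU limit is too low for the workload.
throttle_ratio() {
  awk '/^nr_periods/ {p=$2} /^nr_throttled/ {t=$2} \
       END {if (p > 0) printf "%.2f\n", t/p; else print "0.00"}'
}
```

Usage: `throttle_ratio < /sys/fs/cgroup/cpu/docker/<container_id>/cpu.stat` prints, e.g., `0.25` when a quarter of all periods hit the limit.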
Memory
- Memory usage — current RSS (Resident Set Size) of the container process
- Memory limit — the maximum memory allocated. Exceeding this triggers the OOM killer, which terminates the container
- Cache memory — filesystem cache used by the container. Can be reclaimed under pressure, so distinguish it from actual application memory usage
# Memory metrics from cgroup (cgroup v1 paths shown; on cgroup v2 hosts
# the equivalents are memory.current, memory.max, and memory.stat under
# the container's scope directory)
cat /sys/fs/cgroup/memory/docker/<container_id>/memory.usage_in_bytes
cat /sys/fs/cgroup/memory/docker/<container_id>/memory.limit_in_bytes
cat /sys/fs/cgroup/memory/docker/<container_id>/memory.stat
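The two values above combine into the "percentage of limit" figure that most alerts are written against. A minimal sketch; the helper name and the example byte counts are illustrative:

```shell
# Print memory usage as a percentage of the limit.
# $1 = usage_in_bytes value, $2 = limit_in_bytes value
mem_percent() {
  awk -v u="$1" -v l="$2" 'BEGIN {printf "%.1f\n", 100 * u / l}'
}
```

Usage: `mem_percent "$(cat .../memory.usage_in_bytes)" "$(cat .../memory.limit_in_bytes)"` prints, e.g., `50.0` for 256MiB used of a 512MiB limit.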
Network
- Network I/O — bytes sent and received per container
- Connection count — number of active TCP connections
- Packet drops — indicates network congestion or misconfiguration
- DNS resolution time — container DNS can be a bottleneck, especially with Docker's embedded DNS resolver
Disk I/O
- Disk read/write bytes — I/O throughput per container
- IOPS — I/O operations per second
- Container filesystem size — writable layer size. Growing unexpectedly indicates log accumulation or temp file leaks
Container Lifecycle
- Restart count — frequent restarts indicate crashes or health check failures
- Uptime — how long the container has been running
- Exit codes — 0 = clean exit, 1 = application error, 137 = SIGKILL (128 + 9, usually the OOM killer), 143 = SIGTERM (128 + 15, graceful shutdown)
- Health check status — Docker health check results (healthy, unhealthy, starting)
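The exit codes above can be turned into readable alert text. A minimal sketch; the function name and wording are illustrative:

```shell
# Map a container exit code to the meanings listed above.
# 137 and 143 are 128 + signal number (SIGKILL, SIGTERM).
explain_exit() {
  case "$1" in
    0)   echo "clean exit" ;;
    1)   echo "application error" ;;
    137) echo "SIGKILL (likely OOM killed)" ;;
    143) echo "SIGTERM (graceful shutdown)" ;;
    *)   echo "exit code $1" ;;
  esac
}
```

Usage: `explain_exit "$(docker inspect --format '{{.State.ExitCode}}' web-app)"`.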
Monitoring Stack Architecture
A typical container monitoring stack:
Containers → cAdvisor (metrics collection)
↓
Prometheus (time-series storage)
↓
Grafana (visualization + dashboards)
↓
Alertmanager (notifications)
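The four layers above can run as a single Compose project. A minimal sketch: image tags and ports are illustrative, and each service still needs its own configuration mounted in (cAdvisor additionally needs the host mounts shown in the run command below):

```yaml
# docker-compose.yml — monitoring stack sketch
services:
  cadvisor:
    image: gcr.io/cadvisor/cadvisor:latest
    ports: ["8080:8080"]
  prometheus:
    image: prom/prometheus:latest
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro
    ports: ["9090:9090"]
  grafana:
    image: grafana/grafana:latest
    ports: ["3000:3000"]
  alertmanager:
    image: prom/alertmanager:latest
    ports: ["9093:9093"]
```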
cAdvisor
Google's Container Advisor runs as a container itself and automatically discovers and collects metrics from all containers on the host:
# Run cAdvisor
docker run -d \
  --name cadvisor \
  --volume /:/rootfs:ro \
  --volume /var/run:/var/run:ro \
  --volume /sys:/sys:ro \
  --volume /var/lib/docker/:/var/lib/docker:ro \
  --publish 8080:8080 \
  gcr.io/cadvisor/cadvisor:latest
Prometheus Configuration
# prometheus.yml
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: 'cadvisor'
    static_configs:
      - targets: ['cadvisor:8080']

  - job_name: 'node-exporter'
    static_configs:
      - targets: ['node-exporter:9100']

  # Docker daemon metrics
  - job_name: 'docker'
    static_configs:
      - targets: ['host.docker.internal:9323']
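The docker job scrapes the daemon's built-in metrics endpoint, which is disabled by default. Enable it in /etc/docker/daemon.json and restart the daemon (on older Docker versions this also required "experimental": true); the bind address below is illustrative:

```json
{
  "metrics-addr": "127.0.0.1:9323"
}
```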
Essential Alerts
Configure alerts for conditions that require immediate attention:
# Prometheus alerting rules
groups:
  - name: container_alerts
    rules:
      # Container using >90% of memory limit
      - alert: ContainerMemoryHigh
        expr: |
          container_memory_usage_bytes /
            container_spec_memory_limit_bytes > 0.9
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Container {{ $labels.name }} memory > 90%"

      # Container restarting frequently
      - alert: ContainerRestartLoop
        expr: |
          increase(container_restart_count[1h]) > 3
        labels:
          severity: critical
        annotations:
          summary: "Container {{ $labels.name }} restarted 3+ times in 1h"

      # Container CPU throttled
      - alert: ContainerCPUThrottled
        expr: |
          rate(container_cpu_cfs_throttled_seconds_total[5m]) > 0.5
        for: 10m
        labels:
          severity: warning

      # Container unhealthy
      - alert: ContainerUnhealthy
        expr: container_health_status{status="unhealthy"} == 1
        for: 1m
        labels:
          severity: critical
Docker Compose Health Checks
# docker-compose.yml
services:
  web:
    image: myapp:latest
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s
    deploy:
      resources:
        limits:
          cpus: '2.0'
          memory: 512M
        reservations:
          cpus: '0.5'
          memory: 256M
Log Monitoring
Container logs are equally important. The standard approach:
- stdout/stderr — applications should log to stdout. Docker captures these and makes them available via docker logs
- Log drivers — Docker supports multiple log drivers: json-file (default), syslog, fluentd, awslogs, gelf
- Centralized logging — ship logs to ELK (Elasticsearch, Logstash, Kibana), Loki, or a cloud service for aggregation and search
# Configure Fluentd log driver
docker run -d \
  --log-driver=fluentd \
  --log-opt fluentd-address=localhost:24224 \
  --log-opt tag="docker.{{.Name}}" \
  myapp:latest
Monitoring with External Tools
While internal metrics (CPU, memory, restarts) tell you about container health, external monitoring tells you about service health — what users actually experience. Use external uptime monitoring (like Enterno.io) to check that your containerized services respond correctly from outside your network. This catches issues that internal metrics miss: DNS problems, load balancer misconfigurations, TLS certificate issues, and network-level failures.
Best Practices
- Always set resource limits — containers without memory limits can consume all host memory and crash other containers
- Use labels for organization — label containers with service name, team, environment. This makes dashboards and alerts meaningful
- Monitor the host, not just containers — disk space, host CPU, kernel memory, and Docker daemon health affect all containers
- Implement health checks — Docker health checks enable automatic restart of unhealthy containers and prevent traffic routing to broken instances
- Set log rotation — without rotation, container logs can fill the disk. Configure the max-size and max-file options
- Track image vulnerabilities — monitor base images for known CVEs. Tools: Trivy, Snyk, Docker Scout
- Alert on exit code 137 — this means OOM kill. The container needs more memory or has a memory leak
- Separate monitoring from monitored — run your monitoring stack on separate infrastructure so it survives the failures it needs to detect
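The log-rotation practice above can be set as a host-wide default in /etc/docker/daemon.json rather than per container; the size and file counts below are illustrative starting points:

```json
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}
```

Note that daemon.json defaults only apply to containers created after the daemon restart; existing containers keep the options they were started with.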
Conclusion
Docker container monitoring requires a shift from static host monitoring to dynamic, label-based, multi-layer observability. Track CPU, memory, network, and disk at the container level; lifecycle events like restarts and OOM kills; application-level health checks; and external service availability. Use cAdvisor, Prometheus, and Grafana as your monitoring foundation, complement with centralized logging, and always combine internal metrics with external uptime monitoring for complete visibility into your containerized services.