
Docker Container Monitoring: Metrics, Tools, and Best Practices

Why Container Monitoring Is Different

Docker containers are ephemeral. They start, stop, scale up, and scale down automatically. A container running now may not exist in five minutes. Traditional server monitoring — where you track long-lived hosts with static IPs — breaks in a containerized environment. You need monitoring that adapts to dynamic infrastructure.

Container monitoring must handle: short-lived instances, high cardinality (hundreds or thousands of containers), shared host resources, container orchestration events, and the layered architecture of containers running inside hosts running inside clusters.

Key Metrics to Monitor

CPU

# Check container CPU usage
docker stats --no-stream --format \
    "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}"

# Output:
# NAME          CPU %     MEM USAGE / LIMIT
# web-app       15.23%    256MiB / 512MiB
# redis         2.41%     64MiB / 128MiB
# postgres      8.76%     512MiB / 1GiB

Memory

# Memory metrics from the cgroup filesystem (cgroup v1 paths)
cat /sys/fs/cgroup/memory/docker/<container_id>/memory.usage_in_bytes
cat /sys/fs/cgroup/memory/docker/<container_id>/memory.limit_in_bytes
cat /sys/fs/cgroup/memory/docker/<container_id>/memory.stat

# On cgroup v2 hosts (most modern distributions, systemd cgroup driver):
cat /sys/fs/cgroup/system.slice/docker-<container_id>.scope/memory.current
cat /sys/fs/cgroup/system.slice/docker-<container_id>.scope/memory.max
cat /sys/fs/cgroup/system.slice/docker-<container_id>.scope/memory.stat

Network
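docker stats also reports cumulative network I/O per container, and raw per-interface counters can be read from inside the container's network namespace. The container name below is a placeholder:

```shell
# Cumulative network I/O (received / transmitted) per container
docker stats --no-stream --format \
    "table {{.Name}}\t{{.NetIO}}"

# Raw per-interface counters from inside a container's network namespace
docker exec <container_name> cat /proc/net/dev
```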

Disk I/O
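Block I/O is likewise available from docker stats; per-device detail lives in the container's cgroup (the path below is a sketch assuming cgroup v2 with the systemd cgroup driver):

```shell
# Cumulative block I/O (read / written) per container
docker stats --no-stream --format \
    "table {{.Name}}\t{{.BlockIO}}"

# Per-device read/write bytes and operation counts (cgroup v2)
cat /sys/fs/cgroup/system.slice/docker-<container_id>.scope/io.stat
```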

Container Lifecycle
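Lifecycle events (starts, crashes, OOM kills, restarts) can be streamed live from the Docker daemon; this sketch filters down to the events most relevant for alerting:

```shell
# Stream container lifecycle events as they happen
docker events \
    --filter 'type=container' \
    --filter 'event=die' \
    --filter 'event=oom' \
    --filter 'event=restart' \
    --format '{{.Time}} {{.Actor.Attributes.name}} {{.Action}}'
```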

Monitoring Stack Architecture

A typical container monitoring stack:

Containers → cAdvisor (metrics collection)
                ↓
           Prometheus (time-series storage)
                ↓
           Grafana (visualization + dashboards)
                ↓
           Alertmanager (notifications)
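Wired together in Compose, this stack might look roughly like the following sketch (image tags and published ports are illustrative, and Prometheus is pointed at a local config file):

```yaml
# docker-compose.monitoring.yml -- illustrative wiring only
services:
  cadvisor:
    image: gcr.io/cadvisor/cadvisor:latest
    ports: ["8080:8080"]
  prometheus:
    image: prom/prometheus:latest
    volumes: ["./prometheus.yml:/etc/prometheus/prometheus.yml:ro"]
    ports: ["9090:9090"]
  grafana:
    image: grafana/grafana:latest
    ports: ["3000:3000"]
  alertmanager:
    image: prom/alertmanager:latest
    ports: ["9093:9093"]
```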

cAdvisor

Google's Container Advisor runs as a container itself and automatically discovers and collects metrics from all containers on the host:

# Run cAdvisor
docker run -d \
    --name cadvisor \
    --volume /:/rootfs:ro \
    --volume /var/run:/var/run:ro \
    --volume /sys:/sys:ro \
    --volume /var/lib/docker/:/var/lib/docker:ro \
    --publish 8080:8080 \
    gcr.io/cadvisor/cadvisor:latest
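Once the container is up, a quick check confirms that cAdvisor is exporting Prometheus-format metrics on port 8080:

```shell
# Verify cAdvisor is serving container metrics
curl -s http://localhost:8080/metrics | grep -m 5 '^container_'
```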

Prometheus Configuration

# prometheus.yml
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: 'cadvisor'
    static_configs:
      - targets: ['cadvisor:8080']

  - job_name: 'node-exporter'
    static_configs:
      - targets: ['node-exporter:9100']

  # Docker daemon metrics
  - job_name: 'docker'
    static_configs:
      - targets: ['host.docker.internal:9323']
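The docker job above assumes the daemon's metrics endpoint is enabled; it is off by default and is switched on in /etc/docker/daemon.json (older Docker versions also required "experimental": true; restart the daemon after editing):

```json
{
  "metrics-addr": "0.0.0.0:9323"
}
```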

Essential Alerts

Configure alerts for conditions that require immediate attention:

# Prometheus alerting rules
groups:
  - name: container_alerts
    rules:
      # Container using >90% of memory limit
      # (the > 0 filter drops containers with no limit set,
      # where the limit metric reads as 0)
      - alert: ContainerMemoryHigh
        expr: |
          container_memory_usage_bytes /
          (container_spec_memory_limit_bytes > 0) > 0.9
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Container {{ $labels.name }} memory > 90%"

      # Container restarting frequently
      # (metric name depends on your exporter; kube-state-metrics
      # exposes kube_pod_container_status_restarts_total instead)
      - alert: ContainerRestartLoop
        expr: |
          increase(container_restart_count[1h]) > 3
        labels:
          severity: critical
        annotations:
          summary: "Container {{ $labels.name }} restarted more than 3 times in 1h"

      # Container CPU throttled
      - alert: ContainerCPUThrottled
        expr: |
          rate(container_cpu_cfs_throttled_seconds_total[5m]) > 0.5
        for: 10m
        labels:
          severity: warning

      # Container unhealthy (requires an exporter that publishes
      # Docker health-check status; not part of stock cAdvisor output)
      - alert: ContainerUnhealthy
        expr: container_health_status{status="unhealthy"} == 1
        for: 1m
        labels:
          severity: critical
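For these alerts to reach anyone, Alertmanager needs at least one route and receiver. A minimal sketch, with a placeholder webhook URL:

```yaml
# alertmanager.yml -- minimal sketch; the webhook URL is a placeholder
route:
  receiver: 'default'
  group_by: ['alertname', 'name']
receivers:
  - name: 'default'
    webhook_configs:
      - url: 'http://alert-webhook.internal:5001/'
```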

Docker Compose Health Checks

# docker-compose.yml
services:
  web:
    image: myapp:latest
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s
    deploy:
      resources:
        limits:
          cpus: '2.0'
          memory: 512M
        reservations:
          cpus: '0.5'
          memory: 256M
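Once a health check is defined, the container's current health can be read directly from the daemon (the service name is a placeholder):

```shell
# Read the current health status of a running container
docker inspect --format '{{.State.Health.Status}}' <container_name>
# Reports one of: starting, healthy, unhealthy
```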

Log Monitoring

Container logs are as important as metrics. A common approach is to switch the logging driver so logs ship to a central aggregator such as Fluentd:

# Configure Fluentd log driver
docker run -d \
    --log-driver=fluentd \
    --log-opt fluentd-address=localhost:24224 \
    --log-opt tag="docker.{{.Name}}" \
    myapp:latest

Monitoring with External Tools

While internal metrics (CPU, memory, restarts) tell you about container health, external monitoring tells you about service health — what users actually experience. Use external uptime monitoring (like Enterno.io) to check that your containerized services respond correctly from outside your network. This catches issues that internal metrics miss: DNS problems, load balancer misconfigurations, TLS certificate issues, and network-level failures.

Best Practices

- Monitor every layer: host, container, and application.
- Set resource limits on containers so memory and CPU alerts have a meaningful denominator.
- Alert on lifecycle events (restart loops, OOM kills), not just resource usage.
- Centralize logs alongside metrics so issues can be correlated in one place.
- Combine internal metrics with external uptime checks for user-facing visibility.

Conclusion

Docker container monitoring requires a shift from static host monitoring to dynamic, label-based, multi-layer observability. Track CPU, memory, network, and disk at the container level; lifecycle events like restarts and OOM kills; application-level health checks; and external service availability. Use cAdvisor, Prometheus, and Grafana as your monitoring foundation, complement with centralized logging, and always combine internal metrics with external uptime monitoring for complete visibility into your containerized services.
