Monitoring Resource Usage
LEVEL 0
The Problem
Your containers are running. But are they using too much CPU? Too much memory? Is one container hogging resources?
You need to monitor resource usage in real-time to:
- Identify performance bottlenecks
- Detect resource leaks
- Set appropriate resource limits
- Plan capacity
LEVEL 1
The Concept — The Dashboard Gauges
Imagine a car dashboard.
You have gauges for:
- Speed (how fast you’re going)
- RPM (engine activity)
- Fuel (resource remaining)
- Temperature (system health)
docker stats is the dashboard for your containers.
It shows CPU, memory, network, and disk usage in real-time.
LEVEL 2
The Mechanics — Using docker stats
View all running containers:
docker stats
Output:
CONTAINER ID   NAME   CPU %   MEM USAGE / LIMIT   MEM %    NET I/O         BLOCK I/O
abc123         web    0.25%   45.5MiB / 512MiB    8.89%    1.2MB / 800kB   0B / 0B
def456         db     1.50%   250MiB / 1GiB       24.41%   500kB / 1.2MB   10MB / 5MB
View specific containers:
docker stats web db
No streaming (one-shot):
docker stats --no-stream
Shows current stats and exits (doesn’t continuously update).
Format output:
docker stats --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}"
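The `--no-stream` and `--format` flags combine well for scripted monitoring. As a sketch, the hypothetical `flag_high_mem` function below reads "name mem%" lines (the shape produced by the format string in the usage comment) and warns when a container crosses a threshold; the 80% default is an illustrative assumption, not a Docker convention.

```shell
# Sketch: flag containers whose memory percentage exceeds a threshold.
# Reads "name mem%" lines on stdin; the 80% default is illustrative.
flag_high_mem() {
  threshold=${1:-80}
  while read -r name pct; do
    pct=${pct%\%}    # strip the trailing % sign
    over=$(awk -v p="$pct" -v t="$threshold" 'BEGIN { print (p > t) ? 1 : 0 }')
    [ "$over" = "1" ] && echo "WARNING: $name memory at $pct%"
  done
}

# Usage (requires a running Docker daemon):
#   docker stats --no-stream --format "{{.Name}} {{.MemPerc}}" | flag_high_mem 80
```

A one-shot check like this can run from cron as a lightweight alert long before a full monitoring stack (Level 5) is in place.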
LEVEL 3
Understanding the Metrics
CPU %:
- Percentage of CPU used by the container
- Can exceed 100% on multi-core systems (e.g., 150% = using 1.5 cores)
MEM USAGE / LIMIT:
- How much memory the container is using vs. its limit
- If no limit is set, LIMIT shows total host memory
MEM %:
- Memory usage as percentage of limit
NET I/O:
- Network traffic: received / transmitted
BLOCK I/O:
- Disk I/O: read / written
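The MEM % column is just usage divided by limit. A quick sketch of the arithmetic, using the sample `web` row from the Level 2 output (45.5 MiB used of a 512 MiB limit):

```shell
# Sketch of the MEM % arithmetic: usage / limit * 100.
mem_percent() {
  awk -v use="$1" -v lim="$2" 'BEGIN { printf "%.2f\n", use / lim * 100 }'
}

mem_percent 45.5 512    # prints 8.89, matching the sample row above
```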
LEVEL 4
Setting Resource Limits
Limit memory:
services:
  app:
    image: myapp
    mem_limit: 512m         # Hard limit
    mem_reservation: 256m   # Soft limit (reservation)
Or:
docker run -m 512m myapp
Limit CPU:
services:
  app:
    image: myapp
    cpus: 1.5   # Use up to 1.5 CPU cores
Or:
docker run --cpus=1.5 myapp
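Under the hood, `--cpus` is translated into a Linux CFS quota over a 100,000 µs scheduling period, so `--cpus=1.5` is equivalent to `--cpu-period=100000 --cpu-quota=150000`. A sketch of that conversion:

```shell
# Sketch: --cpus=N becomes a CFS quota of N * 100000 us per 100000 us period.
cpu_quota() {
  awk -v c="$1" 'BEGIN { printf "%d\n", c * 100000 }'
}

cpu_quota 1.5    # prints 150000 (the --cpu-quota equivalent of --cpus=1.5)
```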
CPU shares (relative weight):
services:
  high-priority:
    image: myapp
    cpu_shares: 1024   # Higher priority
  low-priority:
    image: worker
    cpu_shares: 512    # Lower priority
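Unlike `cpus`, shares only matter under contention: when both containers want a saturated CPU, each gets roughly its shares divided by the total. A sketch of that split for the two services above:

```shell
# Sketch: under contention, a container gets shares / total_shares of the CPU.
share_fraction() {
  awk -v s="$1" -v total="$2" 'BEGIN { printf "%.1f%%\n", s / total * 100 }'
}

share_fraction 1024 1536    # high-priority gets ~66.7%
share_fraction 512 1536     # low-priority gets ~33.3%
```

When the CPU is idle, either container may still use as much as it likes; shares are a relative weight, not a cap.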
Combined example:
services:
  web:
    image: nginx
    mem_limit: 256m
    cpus: 0.5
  app:
    image: myapp
    mem_limit: 1g
    cpus: 2
  db:
    image: postgres:15
    mem_limit: 2g
    cpus: 2
    mem_reservation: 1g
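Before deploying a stack like this, sanity-check that the hard limits fit on the host. Summing them (256 MiB + 1 GiB + 2 GiB, with 1g = 1024 MiB) gives the worst-case memory commitment:

```shell
# Sketch: worst-case memory commitment is the sum of the hard limits (in MiB).
total_mem_mib() {
  awk 'BEGIN { t = 0; for (i = 1; i < ARGC; i++) t += ARGV[i]; print t }' "$@"
}

total_mem_mib 256 1024 2048    # prints 3328 (MiB), i.e. 3.25 GiB worst case
```

Leave headroom beyond this total for the Docker daemon, the kernel, and any processes running outside containers.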
LEVEL 5
Monitoring in Production
For production, use proper monitoring tools:
Prometheus + Grafana:
services:
  prometheus:
    image: prom/prometheus
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus-data:/prometheus
    ports:
      - "9090:9090"
  grafana:
    image: grafana/grafana
    ports:
      - "3000:3000"
    volumes:
      - grafana-data:/var/lib/grafana

volumes:
  prometheus-data:
  grafana-data:
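The mounted `prometheus.yml` tells Prometheus what to scrape. A minimal, assumed example that polls the cAdvisor service described below on its metrics endpoint (the 15s interval is illustrative):

```yaml
# Minimal assumed prometheus.yml: scrape cAdvisor every 15 seconds.
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: cadvisor
    static_configs:
      - targets: ["cadvisor:8080"]
```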
cAdvisor (Container Advisor):
Collects resource usage metrics for all containers:
services:
  cadvisor:
    image: gcr.io/cadvisor/cadvisor:latest
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:ro
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
    ports:
      - "8080:8080"
View metrics at http://localhost:8080
Cloud monitoring:
- AWS CloudWatch (for ECS)
- Google Cloud Monitoring
- Azure Monitor
- Datadog
- New Relic