Backup and Disaster Recovery
LEVEL 0
The Problem
Your database container crashes. The volume gets corrupted. The host machine dies. A developer accidentally runs docker compose down -v.
Your data is gone.
Unless you have backups.
LEVEL 1
The Concept — The Safety Deposit Box and Copy
Imagine you have important documents.
No backup: Documents at home. House burns down. Documents gone forever.
With backup:
- Original at home
- Copy in bank safety deposit box
- Copy at friend’s house
- Copy in cloud storage
If one location fails, you still have copies elsewhere.
Backups are copies of your data in different locations.
LEVEL 2
The Mechanics — Volume Backups
Manual backup:
# Stop container to ensure consistent state
docker compose stop db
# Create backup (Compose prefixes the volume name with the project name)
# The volume is mounted read-only so the backup cannot modify it
docker run --rm \
  -v myproject_db-data:/data:ro \
  -v "$(pwd)/backups":/backup \
  alpine tar czf /backup/db-backup-$(date +%Y%m%d-%H%M%S).tar.gz -C / data
# Restart container
docker compose start db
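A backup is only useful if the archive can be read back. The following is a minimal sketch of a check you could run right after the manual backup; verify_archive is a hypothetical helper name, not part of the commands above, and it assumes the archive is a gzip-compressed tar file like the one created there.

```shell
#!/bin/bash
# verify_archive FILE: hypothetical helper, not part of the original walkthrough.
# Succeeds only if FILE exists, is non-empty, and tar can read every entry.
verify_archive() {
    local archive="$1"
    if [ ! -s "$archive" ]; then
        echo "missing or empty: $archive" >&2
        return 1
    fi
    # 'tar tzf' decompresses and lists the archive without extracting it,
    # so a truncated or corrupted file makes it exit non-zero.
    if ! tar tzf "$archive" > /dev/null 2>&1; then
        echo "unreadable archive: $archive" >&2
        return 1
    fi
    echo "ok: $archive"
}

# Example call (the file name is illustrative):
#   verify_archive backups/db-backup-20240115-020000.tar.gz
```

Listing the archive does not prove the database files inside are consistent, which is why the container is stopped before the backup is taken.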
Automated backup script:
#!/bin/bash
# backup.sh
# Abort on the first failed command so a broken dump is never uploaded
set -euo pipefail
BACKUP_DIR=/backups
DATE=$(date +%Y%m%d-%H%M%S)
DUMP_FILE="$BACKUP_DIR/dump-$DATE.sql"
# Dump the database (a logical backup, not a copy of the volume).
# Run this from the directory that contains the Compose file.
docker compose exec -T db pg_dump -U postgres myapp > "$DUMP_FILE"
# Compress
gzip "$DUMP_FILE"
# Upload to S3
aws s3 cp "$DUMP_FILE.gz" s3://my-backups/db/
# Keep only last 7 days locally
find "$BACKUP_DIR" -name "dump-*.sql.gz" -mtime +7 -delete
echo "Backup completed: dump-$DATE.sql.gz"
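Age-based cleanup has one weakness: if the backup job silently stops running, the find command keeps deleting until no local copies remain. A count-based alternative is sketched below. prune_backups is a hypothetical helper, and it assumes the dump-YYYYMMDD-HHMMSS.sql.gz naming used by the script, which sorts chronologically.

```shell
#!/bin/bash
# prune_backups DIR KEEP: hypothetical helper that keeps the KEEP newest dumps
# in DIR and deletes the rest. It relies on the timestamp in the file name,
# not on the file modification time.
prune_backups() {
    local dir="$1" keep="$2"
    # List matching files, newest name first, skip the first KEEP, delete the rest.
    ls -1 "$dir"/dump-*.sql.gz 2>/dev/null \
        | sort -r \
        | tail -n +"$((keep + 1))" \
        | while IFS= read -r old; do
              rm -f -- "$old"
          done
}

# Example call: keep the seven most recent dumps
#   prune_backups /backups 7
```

With this approach the newest dumps survive even when no new ones are being created.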
Cron job:
# /etc/crontab (entries in the system crontab need a user field)
0 2 * * * root /usr/local/bin/backup.sh
LEVEL 3
Database-Specific Backups
PostgreSQL:
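services:
  db:
    image: postgres:15
    environment:
      # The postgres image will not initialize without a superuser password.
      # POSTGRES_PASSWORD is assumed to be set in the shell or in a .env file.
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
    volumes:
      - db-data:/var/lib/postgresql/data
  backup:
    image: postgres:15
    depends_on:
      - db
    environment:
      # pg_dump reads the password from PGPASSWORD
      PGPASSWORD: ${POSTGRES_PASSWORD}
    volumes:
      - ./backups:/backups
    # Dumps once per day; $$ escapes $ so Compose passes it through to bash
    command: >
      bash -c "
        while true; do
          sleep 86400;
          pg_dump -h db -U postgres myapp > /backups/backup-$$(date +%Y%m%d).sql;
          find /backups -name 'backup-*.sql' -mtime +7 -delete;
        done
      "
volumes:
  db-data: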
MySQL:
docker compose exec -T db mysqldump -u root -ppassword myapp > backup.sql
MongoDB:
# Writes the dump to /backup inside the container; mount a volume there to keep it
docker compose exec -T db mongodump --out /backup
LEVEL 4
Restore Procedures
From backup:
# Stop application
docker compose stop app
# Restore database
gunzip < backups/dump-20240115-020000.sql.gz | \
docker compose exec -T db psql -U postgres myapp
# Restart application
docker compose start app
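During an incident it is easy to pick the wrong file by hand. The sketch below selects the newest dump automatically. latest_dump is a hypothetical helper, and it assumes the dump-YYYYMMDD-HHMMSS.sql.gz naming produced by the backup script, so that sorting by name is sorting by time.

```shell
#!/bin/bash
# latest_dump DIR: hypothetical helper that prints the path of the newest dump
# in DIR, or fails if DIR contains no dumps.
latest_dump() {
    local dir="$1" newest
    # Timestamped names sort chronologically, so the last one is the newest.
    newest=$(ls -1 "$dir"/dump-*.sql.gz 2>/dev/null | sort | tail -n 1)
    if [ -z "$newest" ]; then
        echo "no dumps found in $dir" >&2
        return 1
    fi
    echo "$newest"
}

# Example use in the restore step:
#   gunzip < "$(latest_dump backups)" | docker compose exec -T db psql -U postgres myapp
```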
Disaster recovery checklist:
- Identify what was lost
  - Database? Specific tables?
  - File storage? Which files?
  - Configuration? Secrets?
- Get latest backup
  - From S3, backup server, etc.
  - Verify backup integrity
- Stop affected services
  docker compose stop app
- Restore data
  # Restore database
  gunzip < backup.sql.gz | docker compose exec -T db psql -U postgres myapp
  # Restore files
  tar xzf files-backup.tar.gz -C /path/to/restore
- Verify data
  - Check row counts
  - Spot-check critical data
  - Run integrity checks
- Restart services
  docker compose start app
- Monitor
  - Check logs for errors
  - Verify application functionality
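The "Verify backup integrity" step in the checklist can be made mechanical by storing a checksum next to each backup when it is created and checking it before restoring. This is a minimal sketch under assumptions: write_checksum and check_checksum are hypothetical helper names, and sha256sum from GNU coreutils is assumed to be installed.

```shell
#!/bin/bash
# write_checksum FILE: record a SHA-256 checksum in FILE.sha256.
write_checksum() {
    local file="$1"
    # Run inside the file's directory so the .sha256 file stores a relative
    # name and stays valid when both files are moved together.
    ( cd "$(dirname "$file")" \
        && sha256sum "$(basename "$file")" > "$(basename "$file").sha256" )
}

# check_checksum FILE: succeed only if FILE still matches FILE.sha256.
check_checksum() {
    local file="$1"
    ( cd "$(dirname "$file")" \
        && sha256sum -c "$(basename "$file").sha256" > /dev/null 2>&1 )
}

# Example use:
#   write_checksum /backups/dump-20240115-020000.sql.gz    # at backup time
#   check_checksum /backups/dump-20240115-020000.sql.gz    # before restoring
```

A matching checksum shows the file was not damaged in storage or transfer. It does not show that the dump restores cleanly, so an occasional test restore is still worthwhile.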
LEVEL 5
Advanced: Point-in-Time Recovery
PostgreSQL WAL archiving:
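services:
  db:
    image: postgres:15
    # POSTGRES_INITDB_ARGS is forwarded to initdb, which only runs on an empty
    # data directory; server settings are passed to postgres itself instead
    command:
      - postgres
      - -c
      - wal_level=replica
      - -c
      - archive_mode=on
      - -c
      - archive_command=test ! -f /archive/%f && cp %p /archive/%f
    volumes:
      - db-data:/var/lib/postgresql/data
      - ./wal-archive:/archive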
Enables:
- Continuous archiving of changes
- Restore to any point in time
- Not just “yesterday’s backup” but “10:37 AM yesterday”
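The archive_command in the configuration is a small shell command that PostgreSQL runs for each completed WAL segment, replacing %p with the path of the segment and %f with its file name. The sketch below reproduces its logic as a function so the behavior is easier to see. archive_wal is a hypothetical stand-in for illustration only; PostgreSQL does not call it.

```shell
#!/bin/bash
# archive_wal SRC DEST_DIR: hypothetical stand-in for
#   test ! -f /archive/%f && cp %p /archive/%f
# where SRC plays the role of %p and DEST_DIR the role of /archive.
archive_wal() {
    local src="$1" dest_dir="$2" name
    name=$(basename "$src")
    # Refuse to overwrite a segment that is already archived, then copy.
    # A non-zero exit tells PostgreSQL the segment was not archived,
    # so it keeps the segment and retries later.
    test ! -f "$dest_dir/$name" && cp "$src" "$dest_dir/$name"
}
```

The overwrite check matters because an archived segment that gets replaced by a different file would make the archive unusable for recovery.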
Restore to specific time:
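# Restore a base backup into the empty data directory.
# pg_basebackup creates base backups; it does not restore them.
# (/backups/base/base.tar.gz is an example path for one taken earlier)
tar xzf /backups/base/base.tar.gz -C /var/lib/postgresql/data
# PostgreSQL 12 and later read recovery settings from the server
# configuration instead of recovery.conf
cat >> /var/lib/postgresql/data/postgresql.auto.conf <<EOF
restore_command = 'cp /archive/%f %p'
recovery_target_time = '2024-01-15 10:37:00'
EOF
# recovery.signal tells the server to start in targeted recovery mode
touch /var/lib/postgresql/data/recovery.signal
# Start database (will replay WAL to specified time)
docker compose start db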