Runtime Security and Least Privilege
LEVEL 0
The Problem
Your image is secure. No vulnerabilities, trusted source, signed. Great!
But then you run it:
docker run --privileged -v /:/host myapp
You just gave the container:
- Full access to the host filesystem
- All Linux capabilities
- Ability to load kernel modules
- Essentially root on the host
A secure image can be run insecurely.
Runtime security is about configuring how containers run.
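To make the point concrete, here is a minimal sketch of a pre-flight check that scans the arguments of a docker run command for the two risky flags shown above. The function name audit_run_flags is hypothetical (it is not part of Docker), and it only recognizes the exact flag spellings used in this section.

```shell
#!/usr/bin/env bash
# Hypothetical helper: scan `docker run` arguments and print one warning
# per risky flag. Not part of Docker; covers only the flags shown above.
audit_run_flags() {
  local prev="" arg
  for arg in "$@"; do
    case "$arg" in
      --privileged|--privileged=true)
        echo "WARN: --privileged grants all capabilities" ;;
    esac
    # A bind mount whose source is the host root, e.g. `-v /:/host`
    if [[ "$prev" == "-v" || "$prev" == "--volume" ]] && [[ "$arg" == /:* ]]; then
      echo "WARN: host root filesystem mounted ($arg)"
    fi
    prev="$arg"
  done
}

audit_run_flags --privileged -v /:/host myapp
```

A check like this catches only the obvious cases; it is a reminder of what to look for, not a substitute for reviewing the full command.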
LEVEL 1
The Concept — The Employee Badge System
Imagine a company with different security levels.
Scenario 1: Full Access (Insecure)
- Everyone gets a master keycard
- Access to all rooms: offices, server room, CEO office, vault
- Can modify security systems
- Can issue new keycards
This is running containers as root with --privileged.
Scenario 2: Least Privilege (Secure)
- Employees get badges based on role
- Developer: office, break room, meeting rooms
- Sysadmin: office, server room
- Visitor: lobby only
- Each badge has minimum necessary access
This is running containers with dropped capabilities, non-root user, and resource limits.
LEVEL 2
The Mechanics — Running as Non-Root
Default: Containers run as root (UID 0)
FROM ubuntu
# Runs as root
Inside container:
whoami
# root
This is dangerous. Without user-namespace remapping, root in the container is the same UID 0 as root on the host, so an attacker who breaks out of the container is root on the host.
Solution: Create and use a non-root user
FROM ubuntu
# Create user (recent ubuntu images already ship an "ubuntu" user at UID 1000, so pick another UID)
RUN useradd -m -u 1001 appuser
# Switch to that user
USER appuser
# Subsequent RUN, CMD, and ENTRYPOINT instructions run as appuser; COPY ignores USER, hence --chown
COPY --chown=appuser:appuser app /app
Verify:
docker run myapp whoami
# appuser
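Beyond checking by hand, an entrypoint script can refuse to start when it finds itself running as root. The following is a minimal sketch; require_non_root is a hypothetical name, and it takes the UID as an argument so the logic is easy to exercise.

```shell
#!/usr/bin/env bash
# Hypothetical startup guard: succeed only when the given UID is non-root.
# In an entrypoint script you would call it as: require_non_root "$(id -u)"
require_non_root() {
  local uid="$1"
  if [ "$uid" -eq 0 ]; then
    echo "refusing to run as root (UID 0)" >&2
    return 1
  fi
  echo "running as UID $uid"
}
```

Placing the call before the application starts means a missing USER instruction fails loudly instead of silently running as root.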
For Alpine:
FROM alpine
RUN addgroup -g 1000 appuser && \
    adduser -D -u 1000 -G appuser appuser
USER appuser
User namespaces (advanced)
Map root in container to non-root on host:
# /etc/docker/daemon.json
{
"userns-remap": "default"
}
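Now root (UID 0) in the container maps to an unprivileged UID on the host. 100000 is typical; the actual start of the range comes from the dockremap entry in /etc/subuid. This limits the damage if an escape occurs.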
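The mapping is plain arithmetic: host UID = start of the subordinate range + container UID. Here is a small sketch of that calculation. The function name host_uid_for and the sample entry dockremap:100000:65536 are assumptions for illustration; check your own /etc/subuid for the real values.

```shell
#!/usr/bin/env bash
# Sketch: compute the host UID for a container UID under userns-remap.
# Input is one /etc/subuid entry of the form name:start:count.
host_uid_for() {
  local entry="$1" container_uid="$2"
  local start count
  start="$(echo "$entry" | cut -d: -f2)"
  count="$(echo "$entry" | cut -d: -f3)"
  # UIDs at or beyond the range size have no mapping on the host
  if [ "$container_uid" -ge "$count" ]; then
    echo "UID $container_uid is outside the mapped range" >&2
    return 1
  fi
  echo $(( start + container_uid ))
}

host_uid_for "dockremap:100000:65536" 0      # container root -> 100000
host_uid_for "dockremap:100000:65536" 1000   # an app user    -> 101000
```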
LEVEL 3
Dropping Capabilities
Linux capabilities divide root privileges into pieces.
By default, Docker drops many dangerous capabilities but keeps some:
Kept:
- CAP_CHOWN: change file ownership
- CAP_NET_BIND_SERVICE: bind to ports < 1024
- CAP_SETUID / CAP_SETGID: change user/group
- etc.
Drop more capabilities:
services:
  app:
    image: myapp
    cap_drop:
      - ALL  # Drop everything
    cap_add:
      - NET_BIND_SERVICE  # Add back only what's needed
Or with docker run:
docker run --cap-drop=ALL --cap-add=NET_BIND_SERVICE myapp
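To see what a process actually holds, you can read the CapEff line in /proc/<pid>/status, which is a hexadecimal bitmask with one bit per capability. The sketch below tests a single bit; has_cap is a hypothetical helper, and the bit numbers (CAP_CHOWN = 0, CAP_NET_BIND_SERVICE = 10, CAP_SYS_ADMIN = 21) are taken from the Linux capability header.

```shell
#!/usr/bin/env bash
# Sketch: test whether a capability bit is set in a hex mask such as the
# CapEff value from /proc/<pid>/status.
has_cap() {
  local mask_hex="$1" bit="$2"
  # Arithmetic evaluation succeeds (exit 0) when the selected bit is 1
  (( (0x$mask_hex >> bit) & 1 ))
}

# A mask with only NET_BIND_SERVICE set is 1 << 10 = 0x400.
only_bind=0000000000000400
has_cap "$only_bind" 10 && echo "NET_BIND_SERVICE: present"
has_cap "$only_bind" 21 || echo "SYS_ADMIN: absent"
```

Inside a running container, the mask to feed in would come from something like grep CapEff /proc/1/status.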
Never use --privileged:
docker run --privileged myapp # ❌ Gives ALL capabilities
This also gives the container access to host devices and disables the default seccomp and AppArmor confinement. Use it only for very specific cases (Docker-in-Docker, device access).
LEVEL 4
Security Options
Read-only root filesystem:
services:
  app:
    image: myapp
    read_only: true
    tmpfs:
      - /tmp  # Writable tmpfs for temporary files
Prevents:
- Installing malware
- Modifying config files
- Persistence after compromise
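One way to confirm the setting took effect is to probe directories from inside the container and see which ones accept writes. This is a minimal sketch; is_writable_dir is a hypothetical helper that creates and removes a scratch file.

```shell
#!/usr/bin/env bash
# Sketch: report whether a directory accepts writes by trying to create
# and then remove a scratch file in it.
is_writable_dir() {
  local dir="$1" probe
  probe="$(mktemp "$dir/.probe.XXXXXX" 2>/dev/null)" || return 1
  rm -f "$probe"
}

# With read_only: true and a tmpfs on /tmp, /tmp should pass this check
# while paths on the root filesystem should fail it.
is_writable_dir /tmp && echo "/tmp: writable"
```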
No new privileges:
services:
  app:
    image: myapp
    security_opt:
      - no-new-privileges:true
Prevents processes from gaining additional privileges (e.g., via setuid binaries).
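The kernel exposes this flag as the NoNewPrivs field in /proc/<pid>/status (1 when set). Below is a small sketch for reading it; status_field is a hypothetical helper, and the sample text stands in for a real status file.

```shell
#!/usr/bin/env bash
# Sketch: extract a named field from /proc/<pid>/status-style text on stdin.
status_field() {
  local name="$1"
  awk -v key="$name:" '$1 == key { print $2 }'
}

# Sample text standing in for /proc/self/status inside a hardened container.
sample=$'Name:\tnode\nNoNewPrivs:\t1\nSeccomp:\t2'
printf '%s\n' "$sample" | status_field NoNewPrivs   # prints 1
```

Against a live process, the same function can be fed directly: status_field NoNewPrivs < /proc/self/status.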
Seccomp profile:
Docker's default profile blocks dangerous syscalls. To tighten or adjust the filter, supply a custom profile:
services:
  app:
    image: myapp
    security_opt:
      - seccomp=/path/to/seccomp-profile.json
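Whether a profile is active can be checked through the Seccomp field in /proc/<pid>/status, where the kernel reports 0 (disabled), 1 (strict), or 2 (filter). The sketch below translates that number; seccomp_mode_name is a hypothetical helper. A container running under a Docker seccomp profile would be expected to report filter mode.

```shell
#!/usr/bin/env bash
# Sketch: translate the numeric Seccomp field into a readable mode name.
seccomp_mode_name() {
  case "$1" in
    0) echo "disabled" ;;
    1) echo "strict" ;;
    2) echo "filter" ;;
    *) echo "unknown"; return 1 ;;
  esac
}

# Report the mode of the current process when /proc is available.
if [ -r /proc/self/status ]; then
  mode="$(awk '$1 == "Seccomp:" { print $2 }' /proc/self/status)"
  echo "seccomp mode: $(seccomp_mode_name "$mode")"
fi
```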
AppArmor profile:
services:
  app:
    image: myapp
    security_opt:
      - apparmor=docker-default
LEVEL 5
Complete Secure Configuration
# No top-level "version:" key: it is obsolete under the Compose Specification,
# and the legacy 3.x file format rejects service-level limits such as mem_limit
services:
  app:
    build: .
    image: myapp:latest
    # Non-root user (set in Dockerfile)
    # USER appuser already specified in Dockerfile
    # Drop all capabilities, add only what's needed
    cap_drop:
      - ALL
    cap_add:
      - NET_BIND_SERVICE
    # Read-only filesystem
    read_only: true
    tmpfs:
      - /tmp
      - /var/run
    # Security options
    security_opt:
      - no-new-privileges:true
      # The default seccomp profile applies automatically when no seccomp option is set
    # Resource limits
    mem_limit: 512m
    cpus: 1
    pids_limit: 100
    # Network isolation
    networks:
      - app-network
    # No privileged mode
    privileged: false
    # Restart policy
    restart: unless-stopped
networks:
  app-network:
    driver: bridge
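A configuration like this is easy to erode over time, so a quick check in CI can help. The sketch below greps a Compose file for a few of the hardening settings used above. The function name audit_compose and the choice of patterns are assumptions; it is a text match rather than a YAML parser, so treat a clean result as a hint, not proof.

```shell
#!/usr/bin/env bash
# Hypothetical checker: list hardening settings that do not appear
# anywhere in the given Compose file. Returns non-zero if any are missing.
audit_compose() {
  local file="$1" missing=0 pattern
  for pattern in 'cap_drop:' 'read_only: true' 'no-new-privileges:true' 'pids_limit:'; do
    if ! grep -qF -- "$pattern" "$file"; then
      echo "missing: $pattern"
      missing=1
    fi
  done
  return "$missing"
}
```

It could be run as audit_compose docker-compose.yml, with the file name adjusted to your project.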
Dockerfile:
FROM node:18-alpine
# Create non-root user (UID/GID 1001: the node image already has a "node" user at 1000)
RUN addgroup -g 1001 appuser && \
    adduser -D -u 1001 -G appuser appuser
WORKDIR /app
# Install dependencies as root
COPY package*.json ./
RUN npm ci --omit=dev && \
    npm cache clean --force
# Copy app files with correct ownership
COPY --chown=appuser:appuser . .
# Switch to non-root user
USER appuser
EXPOSE 3000
CMD ["node", "server.js"]
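Since the Compose file relies on the Dockerfile to set the user, it can be worth checking that the last USER instruction is what you expect. This is a rough sketch under simple assumptions: final_user is a hypothetical helper, it reads one single-stage Dockerfile, and it prints "unset" when no USER instruction is present (in which case the image inherits the base image's user, which is often root).

```shell
#!/usr/bin/env bash
# Sketch: print the argument of the last USER instruction in a Dockerfile,
# or "unset" when the file never sets one.
final_user() {
  awk '
    toupper($1) == "USER" { user = $2 }
    END {
      if (user == "") print "unset"
      else print user
    }
  ' "$1"
}
```

For the Dockerfile above, final_user Dockerfile would be expected to print appuser.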