Union Filesystems — The Layering Magic

LEVEL 0

The Problem

We’ve learned that containers share the host’s kernel. They don’t each need their own full operating system.

But they do need their own filesystem. Each container needs to see its own /etc, /usr, /var, and all the files that make up the operating system userspace and the application.

If you run 10 nginx containers, each one needs the nginx binaries, configuration files, and libraries. Without some cleverness, you’d need 10 complete copies of all those files.

That would be wasteful. If all 10 containers use the same base image (like nginx:latest), they’re all using identical copies of most files. Why store the same data 10 times?

Even worse, imagine you have 50 containers all based on Ubuntu. Each one would need a full copy of Ubuntu’s filesystem, so the same files would be stored 50 times over.

This is the problem that union filesystems solve.

LEVEL 2

The Mechanics — How Layer Overlay Works

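Docker uses a storage driver to build this layered filesystem. On most current installations that driver is overlay2, which is built on the Linux kernel’s OverlayFS; older setups used aufs, devicemapper, and others.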
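
To see the kernel feature underneath the driver, you can mount an overlay by hand. This is a minimal sketch, assuming a Linux host and root privileges; the directory names (lower, upper, work, merged) are choices made for this example, and the script simply skips the mount when it is not allowed.

```shell
# Hand-made overlay mount: one read-only lower layer, one writable upper layer.
root=$(mktemp -d)
mkdir -p "$root/lower" "$root/upper" "$root/work" "$root/merged"
echo "from the image" > "$root/lower/base.txt"

overlay_demo=skipped
if [ "$(id -u)" -eq 0 ] && mount -t overlay overlay \
    -o "lowerdir=$root/lower,upperdir=$root/upper,workdir=$root/work" \
    "$root/merged" 2>/dev/null; then
    overlay_demo=mounted
    # Writes go through the merged view and land in the upper directory.
    echo "written in the container" > "$root/merged/new.txt"
    ls "$root/merged"    # lists base.txt and new.txt
    ls "$root/upper"     # lists only new.txt
    umount "$root/merged"
fi
echo "overlay demo: $overlay_demo"
rm -rf "$root"
```

The merged directory plays the role of the container’s root filesystem: it shows both layers at once, while new data is stored only in upper.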

How files are accessed:

Case 1: Reading an unchanged file

Container wants to read /bin/bash.

Check top layer (container-specific): Not there.
Check image layer 3: Not there.
Check image layer 2: Not there.
Check image layer 1: Found! Return this file.

The file is served from the lower layer. No copy is made.
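
The top-down search in Case 1 can be mimicked with ordinary directories. This is a toy model, not how the kernel implements the lookup; the layer directory names and the lookup function are made up for illustration.

```shell
# Toy model: each layer is a plain directory, searched from top to bottom.
root=$(mktemp -d)
mkdir -p "$root/upper" "$root/layer3" "$root/layer2" "$root/layer1/bin"
echo "pretend this is bash" > "$root/layer1/bin/bash"

# Print the name of the first layer that contains the given path.
lookup() {
    for layer in upper layer3 layer2 layer1; do
        if [ -e "$root/$layer$1" ]; then
            echo "$layer"
            return 0
        fi
    done
    echo "not found"
    return 1
}

found_in=$(lookup /bin/bash)
echo "/bin/bash is served from: $found_in"
rm -rf "$root"
```

The search stops at the first hit, which is why a file in a higher layer always hides a file with the same path below it.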

Case 2: Modifying an existing file

Container wants to write to /etc/nginx/nginx.conf.

Check top layer: Not there yet.
Check lower layers: Found in image layer 2.
Copy-on-Write (CoW): Copy the whole file from layer 2 to the top layer, even if only one line will change.
Modify the copy in the top layer.

Now when the container reads /etc/nginx/nginx.conf, it gets the modified version from the top layer. The original in layer 2 is unchanged and still shared by other containers.
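
The copy-on-write step can be sketched the same way with two plain directories. This is an illustration under simplifying assumptions: the lower directory is treated as read-only by convention only, and cow_append is a hypothetical helper, not a Docker or kernel interface.

```shell
# Toy model of copy-on-write with one lower layer and one upper layer.
root=$(mktemp -d)
mkdir -p "$root/lower/etc/nginx" "$root/upper"
echo "worker_processes 1;" > "$root/lower/etc/nginx/nginx.conf"

# On the first write, copy the whole file up; then change only the upper copy.
cow_append() {
    if [ ! -e "$root/upper$1" ]; then
        mkdir -p "$(dirname "$root/upper$1")"
        cp "$root/lower$1" "$root/upper$1"
    fi
    echo "$2" >> "$root/upper$1"
}

cow_append /etc/nginx/nginx.conf "# tuned for this container"

lower_content=$(cat "$root/lower/etc/nginx/nginx.conf")
upper_content=$(cat "$root/upper/etc/nginx/nginx.conf")
echo "lower layer: $lower_content"
echo "upper layer: $upper_content"
rm -rf "$root"
```

The lower copy keeps its single original line, while the upper copy carries the original content plus the change.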

Case 3: Creating a new file

Container wants to create /var/log/myapp.log.

File doesn’t exist in any layer.
Create it directly in the top (writable) layer.

This file is unique to this container. No other container sees it.

Case 4: Deleting a file

Container wants to delete /etc/motd.

File exists in a lower (read-only) layer.
Can’t actually delete it from the lower layer (it’s read-only and shared).
Whiteout file: Create a special marker in the top layer that says “this file is deleted.” (OverlayFS represents this marker as a character device with device number 0/0.)

When the container tries to access /etc/motd, the filesystem sees the whiteout marker and returns “file not found.” The file still exists in the lower layer (other containers can see it), but this container can’t.
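
Here is a small sketch of the whiteout idea. Creating a real OverlayFS whiteout needs root, so this example marks deletions with an empty file named with a .wh. prefix, which is the convention aufs and OCI image layer archives use. The delete_file and visible helpers are hypothetical names for this example.

```shell
# Toy model of a whiteout: deletion is recorded in the upper layer only.
root=$(mktemp -d)
mkdir -p "$root/lower/etc" "$root/upper"
echo "Welcome" > "$root/lower/etc/motd"

# Record the deletion as a marker file; never touch the lower layer.
delete_file() {
    mkdir -p "$root/upper$(dirname "$1")"
    : > "$root/upper$(dirname "$1")/.wh.$(basename "$1")"
}

# Succeed only if the merged view would show the path.
visible() {
    if [ -e "$root/upper$(dirname "$1")/.wh.$(basename "$1")" ]; then
        return 1
    fi
    [ -e "$root/upper$1" ] || [ -e "$root/lower$1" ]
}

if visible /etc/motd; then before=visible; else before=hidden; fi
delete_file /etc/motd
if visible /etc/motd; then after=visible; else after=hidden; fi
if [ -e "$root/lower/etc/motd" ]; then lower_copy=intact; else lower_copy=gone; fi

echo "before: $before, after: $after, lower copy: $lower_copy"
rm -rf "$root"
```

The marker is checked before any layer is searched, so the file disappears from this container’s view while the shared lower copy stays in place.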

LEVEL 3

How Docker Images Use Layers

Every Docker image is built from layers.

Example: Building an nginx image

# Layer 1: Ubuntu base
FROM ubuntu:22.04
# Layer 2: Package index
RUN apt-get update
# Layer 3: nginx installed (-y answers the install prompt in a non-interactive build)
RUN apt-get install -y nginx
# Layer 4: Custom config
COPY config.conf /etc/nginx/

Each of these instructions contributes a layer (instructions that only set metadata, such as CMD or ENV, add no filesystem layer):

Layer 4: config.conf added
Layer 3: nginx and dependencies installed
Layer 2: Package index updated
Layer 1: Ubuntu base filesystem

These layers are read-only. They’re stored once and shared across all containers created from this image.

When you run a container:

docker run -d nginx

Docker creates a new writable layer on top of the image layers. This is the “container layer.” All changes the container makes go here.

Container Layer (read-write, unique per container)
---
Image Layer 4 (read-only, shared)
Image Layer 3 (read-only, shared)
Image Layer 2 (read-only, shared)
Image Layer 1 (read-only, shared)
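
The stack above can be modeled with one shared directory for the image and one private directory per container. This is a simplified sketch: the directory names and the sees helper are invented for the example, and a real container has several image layers rather than one.

```shell
# Toy model: two containers share one image directory but own separate upper layers.
root=$(mktemp -d)
mkdir -p "$root/image" "$root/c1-upper" "$root/c2-upper"
echo "shared page" > "$root/image/index.html"

# A container sees a path if its own upper layer or the shared image has it.
sees() {
    [ -e "$root/$1-upper$2" ] || [ -e "$root/image$2" ]
}

# c1 writes a file; it lands in c1's own layer only.
: > "$root/c1-upper/test-file"

if sees c1 /test-file; then c1_new=yes; else c1_new=no; fi
if sees c2 /test-file; then c2_new=yes; else c2_new=no; fi
if sees c2 /index.html; then c2_shared=yes; else c2_shared=no; fi

echo "c1 sees /test-file: $c1_new"
echo "c2 sees /test-file: $c2_new"
echo "c2 sees /index.html: $c2_shared"
rm -rf "$root"
```

Only the image directory is stored once and read by both containers; everything a container writes stays in its own upper directory.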

Storage savings:

If you run 10 nginx containers, you have:

1 copy of layers 1-4 (shared)
10 container layers (small, usually just logs and temporary files)

Instead of 10 full copies of nginx, you have 1 copy plus 10 tiny deltas.
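
A quick back-of-the-envelope calculation shows the difference. The sizes below are assumed numbers for illustration, not measurements; on a real host, docker ps -s reports each container’s writable-layer size.

```shell
# Hypothetical sizes in MB; replace them with figures from your own system.
image_mb=190     # assumed size of the shared image layers
delta_mb=2       # assumed size of each container's writable layer
containers=10

full_copies=$((image_mb * containers))
layered=$((image_mb + delta_mb * containers))
saved=$((full_copies - layered))

echo "10 full copies:        ${full_copies} MB"
echo "1 image + 10 deltas:   ${layered} MB"
echo "saved by sharing:      ${saved} MB"
```

With these assumed numbers, sharing stores 210 MB instead of 1900 MB, and the gap widens with every extra container.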

LEVEL 4

Exploring Layers on Disk

Let’s see where these layers actually live.

Inspect an image’s layers:

docker image inspect nginx --format='{{.RootFS.Layers}}'

You’ll see a list of SHA256 hashes. Each hash identifies one layer by its content: it is the digest of that layer’s uncompressed tar archive.

View storage driver:

docker info | grep "Storage Driver"

On most Docker installations, this is overlay2. Installations that use the containerd image store report overlayfs instead.

Find layer data on disk:

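# The directory is owned by root, so sudo is needed on most hosts
sudo ls /var/lib/docker/overlay2/

You’ll see directories with random-looking IDs, plus a directory named l that holds shortened symlinks to them. Each ID directory is an image layer or a container’s writable layer.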

View a layer’s contents:

# Get a layer ID from the overlay2 directory
sudo ls /var/lib/docker/overlay2/<layer-id>/diff

The diff directory contains the files added or changed in that layer.

View a container’s writable layer:

# Start a container
docker run -d --name demo nginx

# Find its layer
docker inspect demo --format='{{.GraphDriver.Data.UpperDir}}'

# This shows the path to the container's writable layer
sudo ls <path-from-above>

Initially, this directory is nearly empty. As the container runs and creates/modifies files, they appear here.

Prove layer sharing:

# Run two containers from the same image
docker run -d --name c1 nginx
docker run -d --name c2 nginx

# Both share the same lower (image) layers
# But each has its own upper (container) layer

# Create a file in c1
docker exec c1 touch /test-file

# Check if c2 sees it
docker exec c2 ls /test-file
# ls: cannot access '/test-file': No such file or directory
# Each container has its own writable layer.

# Clean up both containers
docker rm -f c1 c2
