Week 2 — Day 10: Docker Security Hardening

A full walkthrough of Docker security hardening — non-root users, read-only filesystems, dropped capabilities, multi-stage builds, minimal base images, and Docker Bench for Security.

The Container Security Problem

Containers share the host kernel. A misconfigured container running as root can potentially escape to the host, access other containers’ data, or be used to pivot across your infrastructure. Docker’s defaults give reasonable isolation but are not hardened; the protections below are opt-in.


Principle 1 — Never Run as Root

By default, processes inside a Docker container run as root (UID 0). Unless user namespace remapping is enabled, that is the same UID 0 as on the host, so an attacker who exploits your app and escapes the container lands as root on the host.

Fix: create a non-root user in the Dockerfile.

FROM node:20-alpine

# Create a non-root user
RUN addgroup -S appgroup && adduser -S appuser -G appgroup

WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY . .

# Switch to non-root before the final CMD
USER appuser

EXPOSE 3000
CMD ["node", "server.js"]

Verify:

docker run --rm myapp whoami
# Should print: appuser (not root)

[SCREENSHOT]Terminal showing docker run whoami returning “appuser” not “root”

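For existing images you can’t modify, override the user at run time. The image has to tolerate running without root: the stock nginx image, for example, tries to create its cache directories and PID file under root-owned paths and fails to start, so an image built for unprivileged use (such as nginxinc/nginx-unprivileged) is the safer target:

docker run --user 1001:1001 nginxinc/nginx-unprivileged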
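
A missing USER instruction can also be caught before the image is built, by scanning the Dockerfile itself. The sketch below is a hypothetical helper (final_user and runs_as_root are names made up for this post, not Docker commands). It assumes plain FROM and USER instructions without variable substitution, and it assumes the base image itself defaults to root.

```shell
#!/bin/sh
# final_user FILE
# Print the user the last build stage of a Dockerfile switches to.
# Prints "root" when the last stage has no USER instruction, which is
# Docker's default when the base image does not set a user itself.
# Hypothetical helper, not part of Docker.
final_user() {
  awk '
    toupper($1) == "FROM" { user = "" }   # a new stage resets the user
    toupper($1) == "USER" { user = $2 }   # remember the latest USER
    END { print (user == "" ? "root" : user) }
  ' "$1"
}

# runs_as_root FILE
# Succeed (exit 0) when the final stage ends up as root or UID 0.
runs_as_root() {
  case "$(final_user "$1")" in
    root|root:*|0|0:*) return 0 ;;
    *) return 1 ;;
  esac
}
```

In a CI step this could be used as `runs_as_root Dockerfile && echo "final stage still runs as root"`. It is a static check only, so the whoami test above remains the authoritative answer for a built image.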

Principle 2 — Read-Only Filesystem

A writable root filesystem lets malware write new binaries, modify configs, or install persistence mechanisms. Make it read-only:

docker run --read-only myapp

If your app genuinely needs to write (logs, temp files), use targeted tmpfs mounts:

docker run \
  --read-only \
  --tmpfs /tmp:rw,noexec,nosuid,size=64m \
  myapp

[SCREENSHOT]Terminal showing docker run --read-only failing to write to /app/data (permission denied) but succeeding with --tmpfs /tmp mounted

In Docker Compose:

services:
  app:
    image: myapp
    read_only: true
    tmpfs:
      - /tmp
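
To confirm which paths are actually writable once the container is running, a small probe can be executed inside it. This is a minimal sketch, assuming the image has a POSIX shell available; probe_write is a hypothetical name, not a Docker feature.

```shell
#!/bin/sh
# probe_write DIR...
# Try to create and remove a scratch file in each directory and report
# whether the write succeeded. Meant to be run inside the container.
probe_write() {
  for dir in "$@"; do
    probe="$dir/.write-probe.$$"
    # The subshell keeps a failed redirection from aborting the script.
    if ( : > "$probe" ) 2>/dev/null; then
      rm -f "$probe"
      echo "$dir: writable"
    else
      echo "$dir: not writable"
    fi
  done
}
```

Called as `probe_write /app /tmp` in a container started with --read-only and --tmpfs /tmp, it should report /app as not writable and /tmp as writable.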

Principle 3 — Drop Capabilities

Linux capabilities break root’s all-or-nothing power into individual privileges. Docker containers start with a default set of ~14 capabilities. Drop everything not needed.

Drop all, add back only what’s required:

docker run \
  --cap-drop ALL \
  --cap-add NET_BIND_SERVICE \
  myapp

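NET_BIND_SERVICE allows binding to ports below 1024. Most apps don’t need it if they listen on ports above 1024. (Docker Engine 20.10 and later also sets net.ipv4.ip_unprivileged_port_start to 0 inside containers by default, so a low-port bind may succeed even without this capability.)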

Common capabilities and when to use them:

| Capability | Needed for |
|---|---|
| NET_BIND_SERVICE | Binding to ports < 1024 |
| CHOWN | Changing file ownership |
| DAC_OVERRIDE | Bypassing file permissions |
| SETUID / SETGID | Changing user/group ID |
| SYS_PTRACE | Debugging with ptrace |

[SCREENSHOT]Terminal showing docker run --cap-drop ALL --cap-add NET_BIND_SERVICE working correctly, vs a second run with no --cap-add failing when trying to bind port 80

In Docker Compose:

services:
  app:
    cap_drop:
      - ALL
    cap_add:
      - NET_BIND_SERVICE
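
To see which capabilities a process really holds, the kernel exposes them as hex bitmasks in /proc/<pid>/status (the CapEff, CapPrm and CapBnd lines). The sketch below decodes one bit of such a mask. The helper names has_cap and cap_mask are hypothetical; the bit numbers come from linux/capability.h.

```shell
#!/bin/sh
# has_cap HEXMASK BIT
# Succeed when bit number BIT is set in a capability mask as printed
# in /proc/<pid>/status (for example the CapEff or CapBnd line).
has_cap() {
  [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]
}

# cap_mask FIELD
# Print the hex mask for FIELD (CapEff, CapPrm, CapBnd, ...) as seen by
# a child of this shell, which inherits the same capability sets.
cap_mask() {
  awk -v field="$1:" '$1 == field { print $2 }' /proc/self/status
}

# Bit numbers from linux/capability.h.
CAP_CHOWN=0
CAP_NET_BIND_SERVICE=10
CAP_SYS_PTRACE=19
```

Inside a container, `has_cap "$(cap_mask CapBnd)" "$CAP_NET_BIND_SERVICE" && echo "can bind low ports"` checks the bounding set. For a non-root process the effective set (CapEff) is normally all zeros, so CapBnd is usually the more informative line when running as a non-root user.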

Principle 4 — No Privileged Mode

Never run --privileged in production. Privileged mode gives the container every Linux capability and access to all host devices, and it turns off the default seccomp and AppArmor confinement, which effectively removes container isolation.

# NEVER in production:
docker run --privileged myapp

# Also avoid:
docker run --pid=host myapp      # shares host PID namespace
docker run --network=host myapp  # shares host network namespace
docker run -v /:/host myapp      # mounts host root filesystem
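
These flags are easy to miss in a long command line or a deploy script, so a small lint can help. The sketch below is a hypothetical helper (risky_flags is not a Docker command) that inspects the arguments of a docker run invocation and reports the patterns listed above. The list is illustrative, not exhaustive.

```shell
#!/bin/sh
# risky_flags ARGS...
# Print one line for each argument of a docker run command line that
# weakens isolation. Hypothetical lint helper; the list is not complete.
risky_flags() {
  prev=""
  for arg in "$@"; do
    case "$arg" in
      --privileged|--privileged=true)
        echo "privileged mode: $arg" ;;
      --pid=host|--network=host|--net=host|--ipc=host|--userns=host)
        echo "host namespace: $arg" ;;
      host)
        # Space-separated form, e.g. --pid host
        case "$prev" in
          --pid|--network|--net|--ipc|--userns)
            echo "host namespace: $prev $arg" ;;
        esac ;;
      /:*)
        # Bind mount whose source is the host root, e.g. -v /:/host
        case "$prev" in
          -v|--volume) echo "host root mount: $arg" ;;
        esac ;;
    esac
    prev="$arg"
  done
}
```

For example, `risky_flags run --privileged -v /:/host myapp` prints one line for the privileged flag and one for the root mount, while a command with none of these flags prints nothing.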

Principle 5 — Minimal Base Images

Smaller images have fewer packages, fewer CVEs, and a smaller attack surface.

| Base Image | Size | Use When |
|---|---|---|
| scratch | 0 MB | Statically compiled Go binaries |
| alpine | ~5 MB | General purpose |
| distroless | ~20 MB | Production (no shell, no package manager) |
| slim variants | varies | Node, Python apps needing some system libs |

Alpine example:

FROM python:3.12-alpine

WORKDIR /app
COPY requirements.txt .
# Build tools for packages that compile C extensions
RUN apk add --no-cache gcc musl-dev
RUN pip install --no-cache-dir -r requirements.txt

Distroless (no shell = harder to exploit):

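# The builder's Python version should match the interpreter in the runtime image
FROM python:3.11-slim AS builder
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt -t /app/deps

# distroless python3-debian12 ships Debian 12's Python (3.11 at the time of writing)
FROM gcr.io/distroless/python3-debian12
COPY --from=builder /app/deps /app/deps
COPY . /app
ENV PYTHONPATH=/app/deps
# The image's entrypoint is the Python interpreter, so CMD is just the script
CMD ["/app/server.py"]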

[SCREENSHOT]Terminal showing docker images output with the size difference between python:3.12 (~1GB), python:3.12-slim (~130MB), and python:3.12-alpine (~50MB)


Principle 6 — Multi-Stage Builds

Multi-stage builds keep build tools out of the final image — compilers, test frameworks, and dev dependencies don’t belong in production.

# Stage 1: Build
FROM node:20 AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci                  # installs ALL deps including devDependencies
COPY . .
RUN npm run build           # compile TypeScript, bundle, etc.

# Stage 2: Production
FROM node:20-alpine AS production
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev          # only production deps
# Copy only the built output (Dockerfile comments must be on their own line after COPY)
COPY --from=builder /app/dist ./dist

USER node
EXPOSE 3000
CMD ["node", "dist/server.js"]

The final image contains only the Alpine Node runtime, production deps, and compiled output. The build tools never make it in.

[SCREENSHOT]Terminal showing docker build output with two stages completing, followed by docker images showing the final image size is much smaller than a single-stage build


Principle 7 — No Secrets in Images

Never put secrets in Dockerfiles or image layers — they’re permanent in the image history.

# WRONG: ENV values are saved in the image config and show up in docker history and docker inspect
ENV API_KEY=supersecret123
RUN curl -H "Authorization: $API_KEY" https://api.example.com

# WRONG — ARG values appear in docker history too
ARG API_KEY
RUN curl -H "Authorization: $API_KEY" https://api.example.com

Right approach: Use runtime secrets injection (environment variables from Secrets Manager, Vault, or Docker secrets).

# Runtime injection — never baked into the image
docker run -e API_KEY="$(aws secretsmanager get-secret-value ...)" myapp

Check your image history for leaked secrets:

docker history --no-trunc myapp | grep -i "secret\|password\|key\|token"

[SCREENSHOT]Terminal showing docker history --no-trunc output on a clean image with no secrets visible in the layer commands
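
The plain grep above also matches harmless lines that merely contain the word "key" or "token". A slightly narrower filter, sketched below, only reports lines where a credential-looking name is assigned a value. The function name scan_for_secrets and the keyword list are assumptions chosen for illustration; tune the pattern to your own naming conventions.

```shell
#!/bin/sh
# scan_for_secrets
# Read text on stdin (for example the output of
# `docker history --no-trunc IMAGE`) and print the lines that assign a
# value to a name that looks like a credential.
# Exit status is 0 when something was found, 1 when the input looks clean.
scan_for_secrets() {
  grep -iE '(secret|password|passwd|token|api_?key)[a-z0-9_]*='
}
```

Typical use is `docker history --no-trunc myapp | scan_for_secrets`. Because the exit status follows grep, the same line can fail a CI job when a match is found. A clean result does not prove the image is free of secrets; files copied into a layer are not visible in the history at all.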


Docker Bench for Security

Docker Bench for Security is an automated script that checks your Docker host and containers against the CIS Docker Benchmark.

docker run --rm --net host --pid host --userns host --cap-add audit_control \
  -e DOCKER_CONTENT_TRUST=$DOCKER_CONTENT_TRUST \
  -v /etc:/etc:ro \
  -v /lib/systemd/system:/lib/systemd/system:ro \
  -v /usr/bin/containerd:/usr/bin/containerd:ro \
  -v /usr/bin/runc:/usr/bin/runc:ro \
  -v /usr/lib/systemd:/usr/lib/systemd:ro \
  -v /var/lib:/var/lib:ro \
  -v /var/run/docker.sock:/var/run/docker.sock:ro \
  --label docker_bench_security \
  docker/docker-bench-security

[SCREENSHOT]Terminal showing Docker Bench output with PASS (green), WARN (yellow), and INFO lines — showing checks like “Ensure a user for the container has been created” and “Ensure the container’s root filesystem is mounted as read only”

Each check is marked:

  • [PASS] — compliant
  • [WARN] — needs attention
  • [INFO] — informational
  • [NOTE] — not applicable

Work through the [WARN] items and fix them one by one.
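
When working through the findings over several runs, it helps to track the totals. The following sketch counts the result markers in a saved report. It assumes the report is the plain terminal output redirected to a file and that each result line carries one of the four markers listed above; bench_summary is a hypothetical helper name.

```shell
#!/bin/sh
# bench_summary FILE
# Count the result markers in a saved Docker Bench report.
# ANSI colour codes are stripped first so that coloured output captured
# from a terminal is counted as well.
bench_summary() {
  esc="$(printf '\033')"
  sed "s/${esc}\[[0-9;]*m//g" "$1" |
    awk '
      match($0, /\[(PASS|WARN|INFO|NOTE)\]/) {
        # Strip the surrounding brackets to get the marker name
        count[substr($0, RSTART + 1, RLENGTH - 2)]++
      }
      END {
        printf "PASS=%d WARN=%d INFO=%d NOTE=%d\n",
          count["PASS"], count["WARN"], count["INFO"], count["NOTE"]
      }
    '
}
```

Redirect the Docker Bench output to a file such as bench.log, then run `bench_summary bench.log` after each round of fixes and watch the WARN count drop.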


Hardened Dockerfile Template

FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

FROM node:20-alpine AS production

# Create non-root user
RUN addgroup -S appgroup && adduser -S appuser -G appgroup

WORKDIR /app

# Copy only production artifacts
COPY package*.json ./
RUN npm ci --omit=dev
COPY --from=builder /app/dist ./dist

# Set ownership
RUN chown -R appuser:appgroup /app

# Switch to non-root
USER appuser

# Set only the environment the app needs; keep secrets out of ENV
ENV NODE_ENV=production

EXPOSE 3000

# Use exec form (no shell wrapper)
CMD ["node", "dist/server.js"]

Run it with:

docker run \
  --read-only \
  --cap-drop ALL \
  --cap-add NET_BIND_SERVICE \
  --tmpfs /tmp \
  --security-opt no-new-privileges \
  myapp

--security-opt no-new-privileges prevents the process from gaining new privileges via setuid binaries inside the container.
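
Once a container is running, the settings it was actually started with can be read back with docker inspect. The sketch below is one possible check, not an official tool: inspect_hardening and is_hardened are hypothetical names, while the template fields (.Config.User, .HostConfig.ReadonlyRootfs, .HostConfig.Privileged, .HostConfig.CapDrop, .HostConfig.SecurityOpt) are the ones docker inspect reports for a container.

```shell
#!/bin/sh
# inspect_hardening CONTAINER
# Print the runtime settings controlled by the flags above, as recorded
# by the Docker daemon for a container.
inspect_hardening() {
  docker inspect --format \
    'user={{.Config.User}} read_only={{.HostConfig.ReadonlyRootfs}} privileged={{.HostConfig.Privileged}} cap_drop={{.HostConfig.CapDrop}} security_opt={{.HostConfig.SecurityOpt}}' \
    "$1"
}

# is_hardened CONTAINER
# Succeed only when the container has a read-only root filesystem,
# is not privileged, and has an explicit non-root user configured.
is_hardened() {
  settings="$(inspect_hardening "$1")" || return 1
  case "$settings" in
    *"read_only=true privileged=false"*) ;;
    *) return 1 ;;
  esac
  case "$settings" in
    "user= "*|"user=root "*|"user=0 "*|"user=0:"*) return 1 ;;
  esac
  return 0
}
```

Running `inspect_hardening <container-name>` prints the settings on one line for a quick visual check, and `is_hardened <container-name> || echo "not hardened"` turns the same data into a pass/fail result. The check deliberately covers only three of the settings; dropped capabilities and no-new-privileges are printed but left for a human to judge.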


Key Takeaways

  • Non-root user is the single most impactful change — do this in every Dockerfile
  • Read-only filesystem + tmpfs for writable paths limits what an attacker can do post-exploitation
  • Drop ALL capabilities and add back only what’s needed
  • Multi-stage builds keep build tools out of production images
  • Never put secrets in Dockerfiles — they end up in image layers permanently
  • Run Docker Bench to get a full checklist of what needs fixing on your host

This post is licensed under CC BY 4.0 by the author.