Introduction

Building efficient and secure Docker images requires following best practices that reduce image size, improve build times, and minimize security vulnerabilities. This guide covers essential techniques for production-ready containers.

Multi-Stage Builds

The Problem: Bloated Images

Before (single-stage build):

FROM node:18
WORKDIR /app

# Install dependencies
COPY package*.json ./
RUN npm install  # Includes devDependencies

# Copy source
COPY . .

# Build
RUN npm run build

# Runtime includes build tools and dependencies
CMD ["node", "dist/index.js"]

Result: 1.2GB image with unnecessary build tools and dependencies.

Solution: Multi-Stage Build

# Stage 1: Build
FROM node:18 AS builder
WORKDIR /app

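COPY package*.json ./
# Install all dependencies; the build step usually needs devDependencies
RUN npm ci

COPY . .
# Build, then drop devDependencies so only runtime packages are copied later
RUN npm run build && npm prune --omit=dev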

# Stage 2: Runtime
FROM node:18-alpine
WORKDIR /app

# Copy only necessary files from builder
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/package.json ./

USER node
CMD ["node", "dist/index.js"]

Result: 180MB image with only runtime dependencies.

Real-World Examples

Go Application

# Build stage
FROM golang:1.21 AS builder
WORKDIR /app

# Cache dependencies
COPY go.mod go.sum ./
RUN go mod download

# Build
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -o main .

# Runtime stage
FROM alpine:3.18
RUN apk --no-cache add ca-certificates

WORKDIR /root/
COPY --from=builder /app/main .

EXPOSE 8080
CMD ["./main"]

Size comparison:

  • Single stage (golang:1.21): 1.2GB
  • Multi-stage (alpine): 15MB
  • Reduction: about 98.8%

Python Application

# Build stage
FROM python:3.11 AS builder
WORKDIR /app

# Install dependencies into a separate prefix so they are easy to copy
COPY requirements.txt .
RUN pip install --no-cache-dir --prefix=/install -r requirements.txt

# Runtime stage
FROM python:3.11-slim
WORKDIR /app

# Copy dependencies from builder into the interpreter's default prefix,
# where any user (including nobody) can read them
COPY --from=builder /install /usr/local
COPY . .

USER nobody
CMD ["python", "app.py"]

Java Application

# Build stage
FROM maven:3.9-eclipse-temurin-17 AS builder
WORKDIR /app

# Cache dependencies
COPY pom.xml .
RUN mvn dependency:go-offline

# Build
COPY src ./src
RUN mvn clean package -DskipTests

# Runtime stage
FROM eclipse-temurin:17-jre-alpine
WORKDIR /app

COPY --from=builder /app/target/myapp.jar .

EXPOSE 8080
ENTRYPOINT ["java", "-jar", "myapp.jar"]

Advanced Multi-Stage Patterns

Testing Stage

# Dependencies stage
FROM node:18 AS deps
WORKDIR /app
COPY package*.json ./
RUN npm ci

# Test stage
FROM deps AS test
COPY . .
RUN npm run lint
RUN npm run test
RUN npm run security-audit

# Build stage
FROM deps AS builder
COPY . .
# Build, then remove devDependencies from node_modules
RUN npm run build && npm prune --omit=dev

# Production stage
FROM node:18-alpine AS production
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
CMD ["node", "dist/index.js"]

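Build with testing (BuildKit skips stages the target does not depend on, so the production build alone does not run the tests):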

docker build --target test -t myapp:test .
docker build --target production -t myapp:latest .
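
The two commands above can be chained so that the production image is only built when the test target succeeds. The following is a minimal sketch, not a definitive build script: the function name build_tested_image and the DOCKER override variable are illustrative assumptions, and the stage names test and production are taken from the Dockerfile above.

```shell
#!/bin/sh
# Build the test target first; build production only if the tests pass.
# DOCKER can be overridden (for example DOCKER=echo) for a dry run.
DOCKER=${DOCKER:-docker}

build_tested_image() {
    name=$1
    context=${2:-.}
    "$DOCKER" build --target test -t "${name}:test" "$context" &&
        "$DOCKER" build --target production -t "${name}:latest" "$context"
}
```

Calling build_tested_image myapp . from the project root would run both builds in order; setting DOCKER=echo first prints the commands instead of executing them.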

Image Optimization

1. Use Minimal Base Images

Size comparison (approximate; exact sizes vary by tag and architecture):

# General-purpose distro: about 78MB
FROM ubuntu:22.04

# Minimal: 78MB
FROM debian:bookworm-slim

# Alpine: 7MB
FROM alpine:3.18

# Distroless: 2MB
FROM gcr.io/distroless/static

When to use each:

Alpine:

  • Small applications
  • Need package manager
  • OK with musl libc

FROM alpine:3.18
RUN apk add --no-cache ca-certificates

Distroless:

  • Maximum security
  • No shell or package manager needed
  • Production workloads

FROM gcr.io/distroless/static-debian11
COPY --from=builder /app/binary /
CMD ["/binary"]

Slim variants:

  • Need specific tools
  • Debian compatibility required

FROM python:3.11-slim

2. Layer Optimization

Bad (inefficient layers):

FROM node:18
WORKDIR /app

RUN apt-get update
RUN apt-get install -y git
RUN apt-get install -y curl
RUN apt-get clean

COPY package.json .
RUN npm install
COPY . .

Good (optimized layers):

FROM node:18
WORKDIR /app

# Combine related commands
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
        git \
        curl && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

# Copy package.json first (cache optimization)
COPY package*.json ./
RUN npm ci --omit=dev

# Copy source last (changes most frequently)
COPY . .

3. .dockerignore

Create .dockerignore:

# .dockerignore
node_modules
npm-debug.log
.git
.gitignore
README.md
.env
.env.local
.vscode
.idea
*.md
tests
__pycache__
*.pyc
.pytest_cache
.coverage
dist-test
.dockerignore
Dockerfile
docker-compose.yml

Impact:

# Without .dockerignore: the build context is about 500MB
COPY . .

# With .dockerignore: the build context shrinks to about 10MB
COPY . .
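
The effect of exclusions on context size can be estimated locally before building. The sketch below is an approximation under stated assumptions: it relies on tar's --exclude patterns, which are not identical to .dockerignore syntax, and the function name context_bytes is hypothetical.

```shell
#!/bin/sh
# Estimate the size in bytes of a directory packed as a tar stream,
# optionally skipping the given patterns (passed to tar --exclude).
context_bytes() {
    dir=$1
    shift
    # Rewrite each remaining argument into a --exclude=PATTERN flag.
    n=$#
    while [ "$n" -gt 0 ]; do
        set -- "$@" "--exclude=$1"
        shift
        n=$((n - 1))
    done
    tar -cf - -C "$dir" "$@" . | wc -c | tr -d ' '
}
```

Comparing context_bytes . with context_bytes . node_modules .git gives a rough before and after figure for the two largest entries in the .dockerignore above.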

4. Build Cache Optimization

Leverage build cache:

FROM python:3.11-slim

WORKDIR /app

# 1. Copy only dependency files first
COPY requirements.txt .

# 2. Install dependencies (cached if requirements.txt unchanged)
RUN pip install --no-cache-dir -r requirements.txt

# 3. Copy source code last (changes frequently)
COPY . .

CMD ["python", "app.py"]

BuildKit cache mounts:

# syntax=docker/dockerfile:1

FROM golang:1.21 AS builder

WORKDIR /app

COPY go.mod go.sum ./

# Use cache mount for Go modules
RUN --mount=type=cache,target=/go/pkg/mod \
    go mod download

COPY . .

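# Use cache mounts for modules and the build cache.
# CGO_ENABLED=0 yields a static binary that runs on Alpine (musl libc).
RUN --mount=type=cache,target=/go/pkg/mod \
    --mount=type=cache,target=/root/.cache/go-build \
    CGO_ENABLED=0 go build -o main .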

FROM alpine:3.18
COPY --from=builder /app/main .
CMD ["./main"]

Enable BuildKit (already the default builder since Docker Engine 23.0; only needed on older versions):

export DOCKER_BUILDKIT=1
docker build -t myapp .

5. Minimize Layers

Too many layers:

RUN apt-get update
RUN apt-get install -y git
RUN apt-get install -y curl
RUN apt-get clean
RUN rm -rf /var/lib/apt/lists/*

Optimized:

RUN apt-get update && \
    apt-get install -y --no-install-recommends git curl && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

6. Remove Unnecessary Files

# Install dependencies and clean up in same layer
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
        build-essential \
        python3-dev && \
    pip install -r requirements.txt && \
    apt-get purge -y build-essential python3-dev && \
    apt-get autoremove -y && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

Security Best Practices

1. Run as Non-Root User

Bad (runs as root):

FROM node:18
WORKDIR /app
COPY . .
# No USER instruction, so the process runs as root
CMD ["node", "app.js"]

Good (runs as non-root):

FROM node:18
WORKDIR /app
COPY . .

# Create non-root user
RUN groupadd -r appuser && useradd -r -g appuser appuser
RUN chown -R appuser:appuser /app

USER appuser
CMD ["node", "app.js"]

Using existing user (alpine):

FROM alpine:3.18

# Use nobody user
USER nobody
CMD ["/app/binary"]

Node.js with built-in user:

FROM node:18-alpine
WORKDIR /app
COPY --chown=node:node . .
USER node
CMD ["node", "app.js"]

2. Scan for Vulnerabilities

Using Trivy:

# Install Trivy
brew install aquasecurity/trivy/trivy

# Scan image
trivy image myapp:latest

# Scan with severity filter
trivy image --severity HIGH,CRITICAL myapp:latest

# Fail build on high severity
trivy image --exit-code 1 --severity HIGH,CRITICAL myapp:latest

Integrate in CI/CD:

# .github/workflows/docker.yml
name: Docker Build

on: [push]

jobs:
  build:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      security-events: write  # required to upload SARIF results
    steps:
      - uses: actions/checkout@v3

      - name: Build image
        run: docker build -t myapp:${{ github.sha }} .

      - name: Scan with Trivy
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: myapp:${{ github.sha }}
          format: 'sarif'
          output: 'trivy-results.sarif'
          severity: 'HIGH,CRITICAL'

      - name: Upload results
        uses: github/codeql-action/upload-sarif@v2
        with:
          sarif_file: 'trivy-results.sarif'

Using Snyk:

# Install Snyk
npm install -g snyk

# Authenticate
snyk auth

# Scan image
snyk container test myapp:latest

# Monitor image
snyk container monitor myapp:latest

3. Use Specific Image Tags

Bad:

# Unpredictable: the tag moves with every release
FROM node:latest
# No tag, so this also resolves to latest
FROM python

Good:

# Specific versions
FROM node:18.17.1-alpine3.18
FROM python:3.11.5-slim-bookworm

Pin with digest:

FROM node:18.17.1-alpine3.18@sha256:abc123...
# Guarantees exact same image
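
The tagging rules above can be checked mechanically. The following is a rough sketch rather than a replacement for a real linter: the function name unpinned_from_lines is hypothetical, and the check assumes each FROM instruction sits on a single line.

```shell
#!/bin/sh
# Print FROM lines whose image has no tag or uses :latest.
# Digest-pinned images, earlier build stages, and "scratch" are accepted.
unpinned_from_lines() {
    awk '
        toupper($1) == "FROM" {
            # Skip optional flags such as --platform=linux/amd64.
            i = 2
            while (i <= NF && $i ~ /^--/) i++
            image = $i

            # Only the last path component can carry the tag; this avoids
            # mistaking a registry port (host:5000/app) for a tag.
            n = split(image, parts, "/")
            last = parts[n]

            pinned = (image in stages) || image == "scratch" || image ~ /@sha256:/
            if (!pinned && (last !~ /:/ || last ~ /:latest$/))
                print FNR ": " $0

            # Remember the stage alias so later "FROM alias" lines pass.
            if (toupper($(i + 1)) == "AS") stages[$(i + 2)] = 1
        }
    ' "$1"
}
```

Running unpinned_from_lines Dockerfile prints each offending line prefixed with its line number and prints nothing when every base image is pinned.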

4. Minimize Attack Surface

Remove shells and tools:

# Distroless (no shell)
FROM gcr.io/distroless/nodejs18-debian11
COPY --from=builder /app/dist /app/dist
CMD ["/app/dist/index.js"]

Read-only filesystem:

FROM alpine:3.18
RUN adduser -D appuser
USER appuser

# Use read-only root filesystem
# Specify writable volumes in docker run:
# docker run --read-only -v /tmp myapp

Drop capabilities:

# docker-compose.yml
services:
  app:
    image: myapp:latest
    cap_drop:
      - ALL
    cap_add:
      - NET_BIND_SERVICE

5. Secrets Management

Bad (secrets in image):

# DON'T DO THIS
ENV API_KEY=abc123
COPY .env .
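
Hard-coded credentials like the ENV line above can often be spotted with a simple pattern search. This is a heuristic sketch only: the function name secret_like_vars and the list of suspicious name fragments (KEY, TOKEN, SECRET, PASSWORD) are assumptions, and a clean result does not prove that an image is free of secrets.

```shell
#!/bin/sh
# List ENV and ARG instructions whose variable name looks like a credential.
# Exit status 0 means at least one suspicious line was found (grep semantics).
secret_like_vars() {
    grep -n -i -E \
        '^[[:space:]]*(ENV|ARG)[[:space:]]+[A-Za-z0-9_]*(KEY|TOKEN|SECRET|PASSWORD)[A-Za-z0-9_]*([=[:space:]]|$)' \
        "$1"
}
```

In a CI job, a check such as if secret_like_vars Dockerfile; then exit 1; fi would stop the pipeline before the image is built.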

Good (runtime secrets):

# Pass at runtime
docker run -e API_KEY="$API_KEY" myapp

# Use Docker secrets
echo "abc123" | docker secret create api_key -
docker service create --secret api_key myapp

Build-time secrets (BuildKit):

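# syntax=docker/dockerfile:1

FROM alpine:3.18

# git is not part of the Alpine base image
RUN apk add --no-cache git

# Use secret mount (the token file is not stored in the image).
# Reset the remote afterwards so the token is not left in .git/config.
RUN --mount=type=secret,id=github_token \
    TOKEN=$(cat /run/secrets/github_token) && \
    git clone https://${TOKEN}@github.com/private/repo.git && \
    git -C repo remote set-url origin https://github.com/private/repo.git

# Build with secret
docker build --secret id=github_token,src=./token.txt -t myapp .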

6. Lint Dockerfiles

Using Hadolint:

# Install
brew install hadolint

# Lint Dockerfile
hadolint Dockerfile

# Ignore specific rules
hadolint --ignore DL3018 Dockerfile

Common issues detected:

  • Missing --no-cache in apk add, or --no-install-recommends in apt-get install
  • Using ADD instead of COPY
  • Using latest tag
  • Running as root
  • Not cleaning package cache

Example output:

Dockerfile:5 DL3018 Pin versions in apk add
Dockerfile:12 DL3059 Multiple consecutive RUN instructions
Dockerfile:20 DL3020 Use COPY instead of ADD
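
When a Dockerfile produces many findings, a per-rule summary helps decide what to fix first. The sketch below assumes the rule code is the second whitespace-separated field of each output line, as in the example above; the function name count_rules is hypothetical.

```shell
#!/bin/sh
# Count hadolint findings per rule code, most frequent first.
# Reads hadolint output on stdin; the rule code is the second field.
count_rules() {
    awk '{ count[$2]++ } END { for (rule in count) print count[rule], rule }' |
        sort -k1,1nr -k2,2
}
```

A typical invocation would be hadolint Dockerfile | count_rules, which prints one line per rule with its number of occurrences.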

CI/CD integration:

# .github/workflows/lint.yml
name: Lint

on: [push]

jobs:
  hadolint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: hadolint/hadolint-action@v3.1.0
        with:
          dockerfile: Dockerfile

Complete Example: Production-Ready Dockerfile

Node.js Application

# syntax=docker/dockerfile:1

# Stage 1: Dependencies
FROM node:18.17.1-alpine3.18 AS deps

WORKDIR /app

# Install security updates
RUN apk upgrade --no-cache

# Copy dependency files
COPY package*.json ./

# Install production dependencies
RUN npm ci --only=production && \
    npm cache clean --force

# Stage 2: Build
FROM node:18.17.1-alpine3.18 AS builder

WORKDIR /app

# Install all dependencies; the build step usually needs devDependencies
COPY package*.json ./
RUN npm ci

# Copy source
COPY . .

# Build application
RUN npm run build

# Stage 3: Test
FROM builder AS test

# Run tests
RUN npm run lint
RUN npm run test
RUN npm audit --audit-level=high

# Stage 4: Production
FROM node:18.17.1-alpine3.18

# Install security updates
RUN apk upgrade --no-cache && \
    apk add --no-cache dumb-init

WORKDIR /app

# Copy built application
COPY --from=builder --chown=node:node /app/dist ./dist
COPY --from=deps --chown=node:node /app/node_modules ./node_modules
COPY --chown=node:node package.json ./

# Use non-root user
USER node

# Health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
    CMD node -e "require('http').get('http://localhost:3000/health', (res) => { process.exit(res.statusCode === 200 ? 0 : 1); });"

# Use dumb-init to handle signals
ENTRYPOINT ["dumb-init", "--"]

# Expose port
EXPOSE 3000

# Start application
CMD ["node", "dist/index.js"]

Python FastAPI Application

# syntax=docker/dockerfile:1

# Build stage
FROM python:3.11.5-slim-bookworm AS builder

WORKDIR /app

# Install system dependencies
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
        gcc \
        python3-dev && \
    rm -rf /var/lib/apt/lists/*

# Install Python dependencies into a separate prefix for easy copying
COPY requirements.txt .
RUN pip install --no-cache-dir --prefix=/install -r requirements.txt

# Runtime stage
FROM python:3.11.5-slim-bookworm

# Install security updates
RUN apt-get update && \
    apt-get upgrade -y && \
    apt-get install -y --no-install-recommends \
        curl && \
    rm -rf /var/lib/apt/lists/*

WORKDIR /app

# Copy Python dependencies into the interpreter's default prefix,
# so the non-root user can import them and run uvicorn
COPY --from=builder /install /usr/local

# Copy application
COPY . .

# Create non-root user
RUN useradd -m -u 1000 appuser && \
    chown -R appuser:appuser /app

USER appuser

# Health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s \
    CMD curl -f http://localhost:8000/health || exit 1

EXPOSE 8000

CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]

Image Size Comparison

Before Optimization

FROM node:18
WORKDIR /app
COPY . .
RUN npm install
RUN npm run build
CMD ["node", "dist/index.js"]

Size: 1.2GB

After Optimization

FROM node:18-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build && npm prune --omit=dev

FROM node:18-alpine
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
USER node
CMD ["node", "dist/index.js"]

Size: 180MB (85% reduction)
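
The reduction figure is plain arithmetic: (before - after) / before. A small sketch of the calculation follows, treating 1.2GB as 1200MB (an assumption about the unit convention); the function name size_reduction is hypothetical.

```shell
#!/bin/sh
# Percentage saved when an image shrinks from $1 to $2 (same unit for both).
size_reduction() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (before - after) / before * 100 }'
}

size_reduction 1200 180   # prints 85.0
```

For real images, the byte sizes can be read with docker image inspect --format '{{.Size}}' and passed to the same function.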

Build Performance Tips

1. Use BuildKit

# Enable BuildKit
export DOCKER_BUILDKIT=1

# Or in /etc/docker/daemon.json (restart the Docker daemon afterwards)
{
  "features": {
    "buildkit": true
  }
}

Benefits:

  • Parallel builds
  • Build cache improvements
  • Secret mounts
  • SSH mounts

2. Cache Mounts

# syntax=docker/dockerfile:1

FROM golang:1.21

WORKDIR /app
COPY . .

# Use cache mounts so modules and compiled packages persist across builds
RUN --mount=type=cache,target=/go/pkg/mod \
    --mount=type=cache,target=/root/.cache/go-build \
    go build -o main .

3. Parallel Stages

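# Shared base stage (assumes a Node.js project)
FROM node:18-alpine AS base
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .

# Independent stages run in parallel when possible
FROM base AS test-unit
RUN npm run test:unit

FROM base AS test-integration
RUN npm run test:integration

FROM base AS lint
RUN npm run lint

# Final stage waits for both test stages; lint is not referenced here,
# so BuildKit only runs it when built with --target lint
FROM base AS final
COPY --from=test-unit /app/coverage ./coverage-unit
COPY --from=test-integration /app/coverage ./coverage-integration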

Monitoring and Debugging

Image Analysis

# Inspect image layers
docker history myapp:latest

# Show layer sizes
docker history myapp:latest --human --format "table {{.Size}}\t{{.CreatedBy}}"

# Use dive for interactive analysis
dive myapp:latest
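
To find the layers worth optimizing, the history output can be sorted by size. The sketch below assumes tab-separated input of the form SIZE, then COMMAND, with no header row, and sizes in the decimal B/kB/MB/GB notation; the function name largest_layers is hypothetical.

```shell
#!/bin/sh
# Read "SIZE<TAB>COMMAND" lines and print them largest first,
# with each size converted to bytes.
largest_layers() {
    awk -F '\t' '
        {
            unit = $1
            sub(/^[0-9.]+/, "", unit)   # keep only the unit suffix
            value = $1 + 0              # numeric prefix of the size
            u = toupper(unit)
            mult = 1
            if (u == "KB") mult = 1000
            else if (u == "MB") mult = 1000 * 1000
            else if (u == "GB") mult = 1000 * 1000 * 1000
            printf "%.0f\t%s\n", value * mult, $2
        }
    ' | sort -rn
}
```

One way to feed it is docker history --no-trunc --format "$(printf '{{.Size}}\t{{.CreatedBy}}')" myapp:latest | largest_layers | head -n 5, which would list the five largest layers.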

Runtime Debugging

# Override entrypoint
docker run -it --entrypoint /bin/sh myapp:latest

# Check running processes
docker exec myapp ps aux

# Check resource usage
docker stats myapp

# Inspect container
docker inspect myapp

Checklist for Production Images

Security:

  • Run as non-root user
  • Scan for vulnerabilities
  • Use specific image tags
  • No secrets in image
  • Minimal base image
  • Security updates installed

Optimization:

  • Multi-stage build
  • .dockerignore configured
  • Minimal layers
  • Build cache optimized
  • Small base image

Reliability:

  • Health check configured
  • Proper signal handling
  • Resource limits set
  • Logging to stdout/stderr

Maintainability:

  • Dockerfile linted
  • Labels added
  • Documentation included
  • Versioned properly
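
Parts of this checklist can be automated. As a rough sketch, the function below checks two items only, a non-root USER and a HEALTHCHECK in the final stage; the name check_dockerfile is hypothetical, and settings inherited from the base image are not detected.

```shell
#!/bin/sh
# Check a Dockerfile for two checklist items in its final stage:
# a non-root USER and a HEALTHCHECK. Prints one line per finding
# and returns non-zero if anything is missing.
check_dockerfile() {
    awk '
        { instr = toupper($1) }
        # A new stage resets what we know about user and health check.
        instr == "FROM"        { user = ""; health = 0 }
        instr == "USER"        { user = $2; sub(/:.*/, "", user) }
        instr == "HEALTHCHECK" { health = (toupper($2) != "NONE") }
        END {
            status = 0
            if (user == "" || user == "root" || user == "0") {
                print "final stage runs as root"
                status = 1
            }
            if (!health) {
                print "no HEALTHCHECK in final stage"
                status = 1
            }
            exit status
        }
    ' "$1"
}
```

Running check_dockerfile Dockerfile in CI alongside the linter and the vulnerability scan would cover the items that are easy to forget.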

Conclusion

Building production-ready Docker images requires attention to:

  1. Size optimization - Multi-stage builds, minimal base images
  2. Security - Non-root users, vulnerability scanning, no secrets
  3. Performance - Build cache, layer optimization, BuildKit
  4. Reliability - Health checks, proper signals, resource limits
  5. Maintainability - Linting, documentation, versioning

Key takeaways:

  • Always use multi-stage builds for compiled languages
  • Pin image versions with specific tags
  • Scan images for vulnerabilities in CI/CD
  • Run as non-root user
  • Optimize layer caching
  • Use .dockerignore

Following these practices results in smaller, faster, and more secure containers ready for production deployment.