Docker Interview Questions

Image Layer Caching and Build Optimization


Your GitHub Actions CI runs `docker build` for a Node.js app 40 times/day. Despite unchanged package.json, npm install is re-run every build (3m45s). Build time: 4m10s. You run `docker build --no-cache` intentionally for reproducibility, so cache never helps. How do you balance reproducibility with layer caching across CI runs?

The issue: `--no-cache` bypasses all caching. Instead of discarding cache, persist BuildKit cache across runs: use `docker buildx build` with `--cache-to type=registry` and `--cache-from type=registry`. In GitHub Actions: `- uses: docker/setup-buildx-action@v2`, then `docker buildx build --cache-from type=registry,ref=myrepo/app:buildcache --cache-to type=registry,ref=myrepo/app:buildcache,mode=max -t myrepo/app:latest .`. This stores cache layers in the registry between runs. For reproducibility without losing cache, add `--pull` so the base image digest is refreshed while dependency layers stay cached: `docker buildx build --pull --cache-from type=registry,ref=myrepo/app:buildcache .`. Expected timings: first build 4m10s (no cache); second build ~45s (cache hit on npm install); third build with a code change ~20s (npm layer cached, only the app layer rebuilt). Verify cache usage with `docker buildx du`.
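A minimal workflow sketch of the approach above, assuming the image lives at `myrepo/app` and the cache ref `buildcache` (both illustrative, as are the secret names):

```yaml
# Sketch: GitHub Actions job persisting BuildKit cache in the registry.
# Image/cache refs and secret names are placeholders.
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: docker/setup-buildx-action@v2
      - uses: docker/login-action@v2
        with:
          username: ${{ secrets.REGISTRY_USER }}
          password: ${{ secrets.REGISTRY_TOKEN }}
      - name: Build with registry-backed cache
        run: |
          docker buildx build \
            --pull \
            --cache-from type=registry,ref=myrepo/app:buildcache \
            --cache-to type=registry,ref=myrepo/app:buildcache,mode=max \
            -t myrepo/app:${{ github.sha }} \
            --push .
```

`mode=max` exports intermediate layers too, which is what makes the npm layer reusable on the next runner.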

Follow-up: Your registry can't store 12 image buildcaches (50GB). How do you prioritize which services cache in registry vs local CI runner disk?

Your Dockerfile: `FROM ubuntu:22.04 (78MB) → RUN apt-get update && apt-get install -y build-essential (450MB) → COPY src . → RUN ./configure && make` (make's targets depend on files in src). You change a single file in src. Docker re-runs the entire make build from scratch (3m00s). Why does this happen, and how do you preserve make's incremental build state?

The issue: `COPY src .` happens before `RUN make`. Any change to src invalidates that layer and every downstream layer—including the make output. Docker has no notion of incremental make; the RUN step starts from a clean filesystem every time. Solution: use BuildKit's `RUN --mount=type=cache` to persist make's .o files and intermediate artifacts across builds. Rewrite: `RUN --mount=type=cache,target=/build/obj ./configure && make`. The cache mount survives between builds, so make recompiles only the changed .c files. Result: single-file change drops from 3m00s to ~15s. Note that cache mounts are not part of the image, so you can't inspect them with `docker exec`; verify instead by comparing build times and checking `docker buildx du` for the cache entry. One CI caveat: `--cache-to` exports layer cache but not `type=cache` mounts—those live only on the builder, so ephemeral CI runners need a persistent buildx builder to benefit from them.
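A minimal Dockerfile sketch of the cache-mount approach, assuming the build happens under `/build` with objects in `/build/obj` (paths are illustrative):

```dockerfile
# syntax=docker/dockerfile:1
# Sketch: persist make's object files in a BuildKit cache mount.
FROM ubuntu:22.04
RUN apt-get update && apt-get install -y build-essential
WORKDIR /build
COPY src ./src
# The cache mount survives across builds, so make sees previous .o files
# and recompiles only what changed. It is NOT stored in the image.
RUN --mount=type=cache,target=/build/obj ./configure && make
```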

Follow-up: Your team edits Makefile rules. Incremental caching gets stale—removing a rule doesn't clean up orphaned .o files. How do you invalidate make cache safely?

Go service: `FROM golang:1.21 (850MB) → WORKDIR /app → COPY go.mod go.sum ./ → RUN go mod download → COPY . . → RUN go build -o /app/bin/service`. First build: 2m30s (downloads 200MB of deps). Code change, rebuild: 1m50s (deps cached, recompiles). But change go.mod and you get a full 2m30s rebuild (all deps re-downloaded). Optimize so adding one dependency doesn't re-download all 200 packages.

The issue: `go mod download` re-runs whenever go.mod changes—correct behavior, but slow, because the layer starts empty and every module is fetched again. Keep the fine-grained split: (1) `COPY go.mod go.sum ./`, (2) `RUN go mod download` (layer is cached while go.mod/go.sum are unchanged), (3) `COPY . .` then `go build`. Then add a BuildKit cache mount over the module cache: `RUN --mount=type=cache,target=/go/pkg/mod go mod download`. The module cache now persists across builds, so when you add one dependency, only that module is fetched and the other 200 are reused from the mount. Result: adding 1 dep takes ~45s (download 1 package) instead of 2m30s (re-download all). Verify: `docker buildx build --progress=plain .` shows whether the `RUN go mod download` step reports CACHED. Caveat: cache mounts are per-RUN, so if `go build` runs in a separate RUN step, mount the same `/go/pkg/mod` cache there too or the compiler won't see the downloaded modules.
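A sketch of the resulting Dockerfile, assuming the hypothetical binary name `service` (the Go build cache mount is an addition beyond what the answer above describes, but follows the same pattern):

```dockerfile
# syntax=docker/dockerfile:1
# Sketch: share the Go module cache across builds via cache mounts.
FROM golang:1.21
WORKDIR /app
COPY go.mod go.sum ./
# Only newly added modules are fetched; the rest come from the mount.
RUN --mount=type=cache,target=/go/pkg/mod go mod download
COPY . .
# Mount the same module cache (plus the compiler cache) at build time,
# since cache mounts are visible only within their own RUN step.
RUN --mount=type=cache,target=/go/pkg/mod \
    --mount=type=cache,target=/root/.cache/go-build \
    go build -o /app/bin/service ./...
```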

Follow-up: Your go.mod and go.sum are 2MB combined (mono-repo with 400+ internal packages). The cache layer hit is useless because the file changes every commit. How do you cache go mod without repo churn?

Python microservice Dockerfile: `FROM python:3.11-slim → COPY requirements.txt . → RUN pip install -r requirements.txt (127 packages, 310MB, takes 2m40s) → COPY . . → RUN pytest → RUN python -m compileall .`. You add a test for a new feature (changes test_new.py). Docker re-installs all 127 packages (2m40s), even though requirements.txt is unchanged. Why the cache miss?

The issue: you copy source files after pip install (correct ordering), so the pip layer must be getting invalidated by an upstream change—and the usual culprit is the base image. Run `docker build --progress=plain .` and check whether the `FROM python:3.11-slim` step pulled a new digest: a fresh patch release of the tag invalidates every downstream layer even though requirements.txt is identical. Solutions: (1) pin the base image by digest—`FROM python:3.11-slim@sha256:abc123...` instead of the tag—so upstream patch pushes don't silently invalidate your cache. (2) Use BuildKit registry cache: `docker buildx build --cache-from type=registry,ref=myrepo/app:buildcache .` to reuse layers from previous builds. (3) Cache pip downloads with a mount: `RUN --mount=type=cache,target=/root/.cache/pip pip install -r requirements.txt`, so even a forced reinstall skips re-downloading wheels. Result: with base image and requirements.txt unchanged, the pip layer is a cache hit (~5s). Test: `docker build -t app1 .`, add a comment to a test file, `docker build -t app2 .`; the second build should skip the pip step entirely.
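A sketch combining fixes (1) and (3); the `sha256:abc123...` digest is a placeholder—pin to the real digest reported by `docker images --digests`:

```dockerfile
# syntax=docker/dockerfile:1
# Sketch: digest-pinned base image plus a pip cache mount.
FROM python:3.11-slim@sha256:abc123...
WORKDIR /app
COPY requirements.txt .
# Wheel downloads persist in the cache mount across builds.
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt
COPY . .
RUN pytest
```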

Follow-up: Your security team mandates updating base image weekly for patches. How do you batch requirement-rebuilds across all services without hammering PyPI?

Your monorepo: services A, B, C share one Dockerfile template. Each service's build context is 500MB (the full monorepo, though each service uses only 10MB of its own code). The Docker client sends the full 500MB context to the daemon on every build: 500MB × 3 services × 10 builds/day = 15GB/day over the network. How do you minimize build context transfer?

Use `.dockerignore` to exclude irrelevant files (.git, other services' node_modules, unused documentation). Example: service-a/.dockerignore excludes `service-b/`, `service-c/`, `.git`, and `node_modules/`, cutting the context from 500MB to 50MB. Build context transfer: 50MB × 3 × 10 = 1.5GB/day. Go further: copy only what the service needs (`COPY service-a/ /app/` instead of `COPY . /app/`), or restructure so each service builds from its own subdirectory with a dedicated Dockerfile. In CI, verify context size before building: `tar -czf context.tar.gz --exclude-from=.dockerignore . && du -h context.tar.gz`. Result: 1.5GB/day → 150MB/day. For local dev, BuildKit's named contexts avoid re-copying: `docker buildx build --build-context repo=/path/to/repo .` and reference it in the Dockerfile via `COPY --from=repo ...`. Verify: `docker buildx build --progress=plain . 2>&1 | grep -i 'transferring context'` shows the bytes actually sent.
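An illustrative service-a/.dockerignore for the layout above (entries are examples; tune to your repo):

```text
# service-a/.dockerignore — everything listed is excluded from the
# build context sent to the daemon.
.git
service-b/
service-c/
**/node_modules
docs/
*.md
```

Note that `.dockerignore` patterns are matched against the context root, so sibling-service directories can be excluded wholesale.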

Follow-up: Your .dockerignore excludes .git (saves 800MB), but the app needs git commit hash for version info. How do you pass git metadata without shipping .git?

Kubernetes cluster: 200 nodes, Kubelet image pull policy = Always. Your service image (500MB) is pulled 200 times in parallel on every deployment. Network I/O: 100GB in 30 seconds. Your node network saturates (1Gbps → 800Mbps sustained). Other pods experience 2-5s latency spikes. Design a caching strategy that prevents re-pulling unchanged images.

Set the image pull policy to `IfNotPresent` in the Pod spec so the kubelet skips the pull when the image is already cached on the node. For rollouts, tag each build uniquely (git commit SHA or semver); never mutate `latest`, since the kubelet caches by image digest and with `IfNotPresent` a re-pushed `latest` would never be re-pulled. Best practice: pull by digest—`image: myrepo/app@sha256:abc123...` instead of `myrepo/app:v1.2.3`—which removes tag-reuse ambiguity entirely. Tune kubelet image garbage collection (`--image-gc-high-threshold=80 --image-gc-low-threshold=40`) so old images are evicted only when disk usage exceeds 80%, keeping the cache warm for recent images. To avoid the thundering herd on deploy, pre-warm nodes with a DaemonSet that pulls the new image ahead of the rollout (tools like kube-fledged automate this), or put a pull-through registry mirror or P2P distribution layer (e.g., Dragonfly) between nodes and the registry. Result: the 100GB parallel pull drops to ~0 for cached nodes, or to a handful of registry pulls that fan out node-to-node. Verify: `kubectl get nodes -o json | jq '.items[].status.images[].names[]' | grep app` shows the cached image references per node.
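A minimal Pod spec sketch showing the digest pin and pull policy together (repository and digest are placeholders):

```yaml
# Sketch: digest-pinned image with IfNotPresent pull policy.
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
    - name: app
      # Pulling by digest makes the reference immutable; IfNotPresent
      # then skips the pull entirely on nodes that already have it.
      image: myrepo/app@sha256:abc123...
      imagePullPolicy: IfNotPresent
```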

Follow-up: Your deployment uses Blue-Green strategy (deploy new version to 100 new nodes, switch traffic, delete old nodes). How do you pre-warm the Blue nodes' image cache to avoid thundering herd on registry?

Your CI system allows conditional builds: if only Dockerfile changes (no code changes), skip app rebuild, reuse old binary. Your Dockerfile has 4 stages; you change stage 3 (test/lint stage, doesn't affect final artifact). Should the full build rebuild? Design a cache strategy for conditional stage builds.

Stage 3 (test) doesn't contribute to the final image if only stage 4 is exported. Use BuildKit's `--target` flag: `docker buildx build --target builder .` builds only up to the "builder" stage—BuildKit skips any stage the target doesn't depend on. In CI, detect what changed: `git diff --name-only HEAD~1` tells you whether only test/lint config moved; if so, rebuild and run only the test stage and skip re-exporting the final image. Implementation: (1) build with `--target builder` (fast, reuses dependency layers); (2) run tests against that stage (e.g., `--target test`); (3) if tests pass and app code is unchanged, reuse the cached builder artifact; (4) build the final stage only when app code changed. Result: a lint-config update costs ~30s (test stage only) instead of 3m (full rebuild). Preserve the builder stage across CI runs with registry-backed cache (`--cache-to`/`--cache-from` with `mode=max`, so intermediate stages are exported too). Verify: `docker buildx build --target builder --progress=plain . 2>&1 | grep -i cached` shows stage hits/misses.
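A sketch of a 4-stage layout where `--target` selects how far to build; stage names, commands, and the distroless runtime base are illustrative:

```dockerfile
# syntax=docker/dockerfile:1
FROM golang:1.21 AS deps
WORKDIR /app
COPY go.mod go.sum ./
RUN --mount=type=cache,target=/go/pkg/mod go mod download

FROM deps AS builder
COPY . .
RUN --mount=type=cache,target=/go/pkg/mod go build -o /out/app ./...

FROM builder AS test
# Lint/test stage: depends on builder but is not part of the final image,
# so `--target final` never executes it.
RUN go vet ./...

FROM gcr.io/distroless/base AS final
COPY --from=builder /out/app /app
ENTRYPOINT ["/app"]
```

`docker buildx build --target test .` validates without exporting the runtime image; `--target final` skips the test stage entirely.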

Follow-up: Your test stage needs app binary from builder stage. If builder cache expires, test stage can't run independently. How do you handle orphaned stage dependencies in cache?

Registry mirror: Your team operates a private Docker registry (Harbor) with 800 images cached. Daily, you mirror public images (node:latest, golang:latest, ubuntu:latest). Public image publishers tag as "latest" → you auto-mirror. But latest is re-pushed daily (different digest, same tag). Each re-mirror invalidates all dependent images' build cache in your registry. Result: team builds see cache misses daily. Design a cache-stable mirroring strategy.

The issue: `latest` is unstable—same tag, different digest—so every re-mirror makes dependent services rebuild from scratch. Solutions: (1) Pin base images by digest in Dockerfiles: `FROM node@sha256:abc123... # latest as of 2025-01-15`, and update digests on a schedule (quarterly, not daily) via script. (2) When mirroring, also apply an immutable tag derived from the content: mirror `node:latest` as both `node:latest` and `node:20.12.0-sha256-abc123` in Harbor, and have Dockerfiles reference the immutable tag. (3) Use Harbor tag retention: mirror once, tag it `myrepo/node-mirror:20-stable`, keep only the latest 3 versions, and reference `FROM myrepo/node-mirror:20-stable`—a tag you control that moves only when you deliberately re-point it, so no surprise cache misses. Note that layers are already content-addressed, so a re-mirrored base that shares layers with the old one is cheap to pull—but build cache keys include the base digest, so any digest change still invalidates downstream layers; pinning is the real fix. Result: a daily `latest` re-mirror no longer invalidates builds, because Dockerfiles never reference the moving tag. Verify: `docker buildx imagetools inspect myrepo/node-mirror:20-stable` shows the digest; after a mirror run it should change only if you intentionally re-pointed the tag, and builds keep hitting cache.
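A Dockerfile fragment contrasting the two cache-stable references (digests and mirror names are placeholders for your Harbor instance):

```dockerfile
# Sketch: two cache-stable ways to reference a mirrored base image.

# Option 1: pin the public image by digest; a scheduled script updates
# this line deliberately rather than the mirror moving it daily.
FROM node@sha256:abc123...   # node:latest as of 2025-01-15

# Option 2: reference an immutable tag in your Harbor mirror that only
# moves when you re-point it on purpose.
# FROM myrepo/node-mirror:20-stable
```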

Follow-up: You mirror 50 public images daily. Building digest-mapping at scale is slow. How do you batch digest lookups to avoid API rate-limits on Docker Hub?
