Your docker-compose.yml has depends_on configured correctly, but your app container starts before PostgreSQL is accepting connections. The app crashes with "Connection refused" and exits. Docker Compose doesn't restart it because the service "started." The entire stack fails. How do you ensure the app waits for the database to be truly ready?
depends_on only guarantees container startup order, not readiness: the postgres container can be running before the database server inside it is accepting connections. Solutions:

(1) Use a wait script in the app's entrypoint that polls the database until it responds. With the dockerize tool: `command: sh -c "dockerize -wait tcp://postgres:5432 -timeout 20s && app.sh"`.
(2) Implement application-level retry logic: on startup, the app tries to connect with exponential backoff, retrying up to N times before giving up.
(3) Use `depends_on` with `condition: service_healthy`: `depends_on: postgres: condition: service_healthy`. This requires the postgres service to define a health check that passes.
(4) Add that health check to PostgreSQL: `healthcheck: test: ["CMD-SHELL", "pg_isready -U user -d db"]` with `interval: 5s`, `timeout: 5s`, `retries: 5`.

For option (2), wrap database initialization in a retry loop that sleeps `2**i` seconds on attempt i (note `**`, not `^`, which is XOR in Python). This way the app doesn't crash if the database isn't ready yet; it connects once the database starts accepting connections.
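Options (3) and (4) combine into a compose file like this sketch (the image tag, credentials, and service names are placeholder assumptions):

```yaml
services:
  postgres:
    image: postgres:16
    environment:
      POSTGRES_USER: user
      POSTGRES_PASSWORD: pass
      POSTGRES_DB: db
    healthcheck:
      # Passes only once the server accepts connections.
      test: ["CMD-SHELL", "pg_isready -U user -d db"]
      interval: 5s
      timeout: 5s
      retries: 5
  app:
    build: .
    depends_on:
      postgres:
        condition: service_healthy  # blocks app start until the check passes
```

With this in place, `docker-compose up` holds the app container back until the postgres health check reports healthy.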
Follow-up: What's the difference between a container starting and a service being ready? How do you detect true readiness?
Your docker-compose stack has 5 services: web, api, postgres, redis, and elasticsearch. You've implemented depends_on with condition: service_healthy on all of them. But Elasticsearch takes 15 seconds to start and report healthy. Meanwhile, your entire stack is waiting. The startup time is now 45 seconds for a local dev stack. How do you optimize startup sequence without removing dependencies?
Over-strict dependency ordering serializes startup. Optimize by:

(1) Identify which dependencies are actually critical. Does the web service need Elasticsearch at startup? Probably not; it's nice to have. Does it need Redis immediately? Maybe not. Does it need the API? No, they're peers. Restructure so web and api start in parallel, connect to postgres if needed, and fail gracefully when Redis or Elasticsearch are missing.
(2) Use soft dependencies: drop `condition: service_healthy` for non-critical services and implement application-level fallback instead: if Redis is missing, use an in-memory cache; if Elasticsearch is missing, fall back to database search.
(3) Parallelize: services with no interdependency start concurrently, which Docker Compose does automatically when you don't add `depends_on` between them.
(4) Use `depends_on` only for hard requirements: api depends on postgres (hard); web and api don't depend on each other, so they run in parallel.
(5) For services that are slow to become healthy (Elasticsearch, Kafka), skip `condition: service_healthy` and handle it in the application: if Elasticsearch is unavailable, log a warning and continue; connect when it becomes ready.

Resulting docker-compose layout: web and api start immediately and wait only for postgres (via `condition: service_healthy`); redis and elasticsearch start in the background without blocking anything. Startup drops from about 45 seconds to about 15.
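Option (2)'s soft-dependency fallback can be sketched like this (a minimal illustration; the TCP probe and `InMemoryCache` class are stand-ins, not a real Redis client API):

```python
import socket

class InMemoryCache:
    """Process-local fallback cache used when Redis is unreachable."""
    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value

def redis_reachable(host="redis", port=6379, timeout=0.2):
    """Cheap TCP probe; a real check would also send PING via a client."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# At startup: prefer Redis, but degrade instead of blocking or crashing.
use_fallback = not redis_reachable()
# In a real app: construct a Redis-backed cache when use_fallback is False.
cache = InMemoryCache()
cache.set("user:1", "alice")
```

The key design choice is that the probe uses a short timeout and never raises, so a missing Redis costs a fraction of a second at startup instead of blocking the whole stack.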
Follow-up: How do you handle circular dependencies in docker-compose? Is it even possible?
In production, you can't use docker-compose, so you're using Kubernetes. You've defined a Deployment for your app and a StatefulSet for PostgreSQL. Your app starts immediately and crashes with "Database not ready." depends_on doesn't exist in Kubernetes. How do you enforce startup ordering in Kubernetes?
Kubernetes has no built-in depends_on. Solutions:

(1) Use init containers, which run to completion before the main app container starts: `initContainers: - name: wait-for-db, image: busybox, command: ['sh', '-c', 'until nc -z postgres 5432; do sleep 1; done']`. This blocks the pod until PostgreSQL is reachable.
(2) Use a Job to initialize the database before deploying the app: run migrations as a Job, wait for it to complete, then deploy the app.
(3) Define readiness probes: the kubelet won't mark the pod ready, and it won't receive Service traffic, until the probe passes.
(4) Use Helm, Kustomize, or your CI pipeline to order deployments: deploy postgres first, wait for its rollout, then deploy the app.
(5) Implement retry logic in the app: on startup it connects with exponential backoff; while the database is unreachable it fails its readiness probe and receives no traffic. (Note that a failing readiness probe does not restart the pod; only a failing liveness probe triggers a restart.) This is simpler than explicit ordering.

Best practice: combine an init container (quick sanity check) with application retry logic (graceful degradation). This is more resilient than strict ordering: if PostgreSQL goes down later, the app doesn't crash; it retries and reconnects.
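A minimal Deployment sketch for option (1); the names, labels, image tags, and port are placeholder assumptions:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: app
  template:
    metadata:
      labels:
        app: app
    spec:
      initContainers:
        - name: wait-for-db
          image: busybox:1.36
          # Runs to completion before the app container starts:
          # loops until the postgres Service accepts TCP connections.
          command: ['sh', '-c', 'until nc -z postgres 5432; do sleep 1; done']
      containers:
        - name: app
          image: my-app:latest
```

Keep the init container's check cheap (a TCP probe), and leave deeper readiness verification to the app's own retry logic and readiness probe.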
Follow-up: Why doesn't Kubernetes have a depends_on feature? What's the design philosophy?
Your docker-compose database setup includes an initialization script: volumes: - ./init.sql:/docker-entrypoint-initdb.d/init.sql. The app container starts before the init script finishes running. The database is empty and the app crashes. You add depends_on but it still happens intermittently. Why doesn't depends_on guarantee the init script runs first?
depends_on ensures container startup order, not script completion. The PostgreSQL entrypoint runs init scripts against a temporary server before starting the real one, and a plain `pg_isready` health check run inside the container can pass before those scripts finish, so postgres may report "healthy" while the schema is still being created. Fix:

(1) Make the health check validate that the init script's schema exists, not merely that the server answers. Note that the often-suggested `SELECT COUNT(*) FROM information_schema.tables` succeeds on any running server; instead query a table the init script creates, e.g. `test: ['CMD-SHELL', 'psql -U user -d db -c "SELECT 1 FROM users LIMIT 1;"']` with `interval: 2s`, `timeout: 5s`, `retries: 5`. This exits non-zero until the `users` table exists.
(2) Use `depends_on` with `condition: service_healthy` on the app: `app: depends_on: postgres: condition: service_healthy`.
(3) If the check must also verify that the table is populated, test the row count explicitly (e.g. compare `psql -tAc "SELECT COUNT(*) FROM users"` against zero), because `SELECT ... LIMIT 1` still exits 0 on an empty table.
(4) Increase retries (10 or 15) if the init script is slow.
(5) As a fallback, have the app run its own migrations when expected tables are missing.

This way the postgres service is marked healthy only after initialization is actually complete.
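Putting fixes (1) and (2) together, a compose sketch assuming the init script creates a `users` table (user, password, and database names are placeholders):

```yaml
services:
  postgres:
    image: postgres:16
    environment:
      POSTGRES_USER: user
      POSTGRES_PASSWORD: pass
      POSTGRES_DB: db
    volumes:
      - ./init.sql:/docker-entrypoint-initdb.d/init.sql
    healthcheck:
      # Exits non-zero until init.sql has created the users table,
      # unlike pg_isready, which can pass during initialization.
      test: ["CMD-SHELL", "psql -U user -d db -c 'SELECT 1 FROM users LIMIT 1;'"]
      interval: 2s
      timeout: 5s
      retries: 10
  app:
    build: .
    depends_on:
      postgres:
        condition: service_healthy
```

The health check doubles as documentation of what "ready" means for this service: not "the server is up" but "the schema the app needs exists".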
Follow-up: How do you debug docker-compose startup order issues? What tools can you use?
You have a microservices stack: api-v1 and api-v2 should run in parallel. Both depend on postgres. But your docker-compose file has depends_on: api-v1 depends on api-v2, so they start sequentially. This is wrong and slow. How do you refactor to express true dependencies and allow parallelism?
Incorrect depends_on introduces false ordering and serializes startup. Refactor:

(1) Remove the artificial dependency: api-v1 and api-v2 don't depend on each other, so neither should list the other.
(2) Each should depend only on postgres: `api-v1: depends_on: postgres: condition: service_healthy`, and likewise for api-v2.
(3) Docker Compose can then start api-v1 and api-v2 concurrently once postgres is healthy.
(4) If they do have a real dependency (e.g. api-v2 calls api-v1's API), express it explicitly on api-v2 alongside the postgres dependency, but implement graceful fallback: if api-v1 isn't ready, api-v2 should retry rather than crash.
(5) Use depends_on to express direct dependencies only; don't create chains. If A depends on C and B depends on C, don't also make B depend on A.

Audit the dependency graph with `docker-compose config | grep -A 10 depends_on`. Apply topological-sorting thinking: in the corrected graph the only ordering constraint is postgres before both APIs, so everything downstream of postgres starts in parallel, which is what reduces startup time.
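The refactored file might look like this sketch (build paths and image tag are placeholder assumptions):

```yaml
services:
  postgres:
    image: postgres:16
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U user -d db"]
      interval: 2s
      timeout: 5s
      retries: 5
  api-v1:
    build: ./api-v1
    depends_on:
      postgres:
        condition: service_healthy
  api-v2:
    build: ./api-v2
    depends_on:
      postgres:
        condition: service_healthy   # note: no dependency on api-v1
```

Because neither API lists the other, Compose starts both the moment postgres reports healthy.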
Follow-up: How do you represent complex multi-level dependencies in docker-compose? Is there a better way than depends_on?
Your app needs to connect to a remote RDS database (not a local PostgreSQL container). You can't add depends_on for an external service that docker-compose doesn't manage. Your app crashes on startup because RDS isn't reachable. How do you handle external dependencies in docker-compose?
docker-compose can only order services it creates; external dependencies require different handling. Solutions:

(1) Implement application-level retry logic: the app tries to connect to RDS with exponential backoff and exits non-zero if it ultimately fails. With a restart policy on the service (`restart: on-failure` or `unless-stopped`), Docker Compose restarts it until RDS is reachable; without a restart policy, Compose will not restart an exited container.
(2) Pass external endpoints via environment variables: `services: app: environment: - DATABASE_URL=postgres://user:pass@rds-endpoint:5432/db`. At startup, the app verifies it can reach this endpoint.
(3) Expose a health endpoint inside the app that checks RDS connectivity, and point the container's Docker health check at it.
(4) Use an entrypoint wait (or a sidecar/init step) to validate external connectivity before starting the app: `until pg_isready -h "$RDS_ENDPOINT" -p 5432; do sleep 1; done`, then exec the app.
(5) For local dev, spin up a local PostgreSQL in Compose; point at external RDS only in production via override files: `docker-compose -f docker-compose.yml -f docker-compose.prod.yml up`. This lets you test dependency ordering locally without relying on external services during development.
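Option (1)'s backoff loop can be sketched generically (a hedged illustration: `connect` stands in for whatever opens the real RDS connection, and the flaky stand-in below only simulates a slow-starting database):

```python
import time

def connect_with_backoff(connect, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call `connect` with exponential backoff (1s, 2s, 4s, ...).

    `connect` is any zero-argument callable that raises on failure,
    e.g. one that opens a connection using the DATABASE_URL env var.
    Re-raises the last error once all attempts are exhausted.
    """
    for i in range(attempts):
        try:
            return connect()
        except Exception:
            if i == attempts - 1:
                raise
            sleep(base_delay * 2 ** i)

# Stand-in that fails twice, then succeeds, to exercise the loop.
state = {"calls": 0}

def flaky_connect():
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("RDS not reachable yet")
    return "connection"

conn = connect_with_backoff(flaky_connect, sleep=lambda s: None)
```

Injecting `sleep` as a parameter keeps the function testable without real delays; production code just uses the default `time.sleep`.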
Follow-up: How do you manage secrets for external services (RDS credentials) in docker-compose safely?
You're using docker-compose for integration tests. Your test suite needs the app, postgres, and redis to be ready before running tests. Currently, tests start immediately after docker-compose up completes, but services aren't fully initialized yet. Tests fail intermittently due to timing issues. Design a robust test startup sequence.
Test orchestration requires tighter coordination than production. Implement:

(1) Proper health checks for all services, including the app itself (`condition: service_healthy` only works on services that define one): postgres: `test: ['CMD-SHELL', 'pg_isready -U user']`; redis: `test: ['CMD-SHELL', 'redis-cli ping']`; both with `interval: 2s`, `timeout: 5s`, `retries: 5`.
(2) Run the tests as a docker-compose service that gates on every dependency: `test: build: .`, `command: npm test`, `depends_on:` app, postgres, and redis each with `condition: service_healthy`, plus `volumes: - ./test-results:/app/test-results`. (A separate busybox "wait" service is possible but redundant once the test service declares these dependencies itself.)
(3) Start the stack and propagate the test result: `docker-compose up --exit-code-from test`. The `--exit-code-from` flag implies `--abort-on-container-exit` and returns the test container's exit code, which is what CI should check.
(4) Capture test output: `docker-compose logs test`.

This enforces strict ordering: postgres and redis become healthy first, app waits for them, and tests wait for all three, eliminating flaky tests caused by timing.
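The full test stack might look like this sketch (the app's `/health` endpoint, port, and `npm test` command are placeholder assumptions about your application):

```yaml
services:
  postgres:
    image: postgres:16
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U user"]
      interval: 2s
      timeout: 5s
      retries: 5
  redis:
    image: redis:7
    healthcheck:
      test: ["CMD-SHELL", "redis-cli ping"]
      interval: 2s
      timeout: 5s
      retries: 5
  app:
    build: .
    healthcheck:
      # Assumes the app exposes a health endpoint on port 8080.
      test: ["CMD-SHELL", "wget -qO- http://localhost:8080/health || exit 1"]
      interval: 2s
      timeout: 5s
      retries: 10
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy
  test:
    build: .
    command: npm test
    volumes:
      - ./test-results:/app/test-results
    depends_on:
      app:
        condition: service_healthy
```

Run it with `docker-compose up --exit-code-from test` so the whole stack tears down when tests finish and CI sees the test container's exit code.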
Follow-up: How do you handle test cleanup and data isolation between test runs with docker-compose?
You're migrating from docker-compose to Kubernetes. Your compose file has complex depends_on logic with multiple condition: service_healthy checks. Kubernetes doesn't have an equivalent feature. How do you replicate this behavior in Kubernetes without relying on orchestration features?
Kubernetes doesn't support depends_on, so rely on application resilience and Kubernetes primitives:

(1) Use init containers for blocking waits: `initContainers: - name: wait-postgres, image: busybox, command: ['sh', '-c', 'until nc -z postgres 5432; do echo waiting; sleep 2; done']`.
(2) Use readiness probes to signal when your app can receive traffic: `readinessProbe: httpGet: path: /ready, port: 8080, initialDelaySeconds: 10, periodSeconds: 5`.
(3) Implement health checks in your app that detect unready dependencies: if the database isn't reachable, return 503 from /ready.
(4) Use Pod readiness for implicit ordering: a Pod isn't added to Service endpoints until its readiness probe passes, so if the app's readiness depends on postgres connectivity, the app receives no traffic until postgres is up.
(5) For initialization that must happen once per cluster (like database migrations), use a Job: `apiVersion: batch/v1, kind: Job` with a container running `./migrate.sh`, executed before deploying the app.
(6) Deploy postgres as a StatefulSet for stable identity and storage. Note that a StatefulSet doesn't inherently start before a Deployment; the init-container wait in the app (1) is what actually enforces the ordering.

This approach is more distributed and resilient than compose's strict ordering: it relies on applications handling transient failures gracefully rather than on startup sequence.
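The dependency-aware readiness handler from (3) reduces to a small function (a sketch; `check_db` stands in for whatever verifies connectivity, e.g. running `SELECT 1`, and the framework wiring around it is omitted):

```python
def readiness_status(check_db):
    """Compute the HTTP status a /ready endpoint should return.

    `check_db` is any zero-argument callable that raises when the
    database is unreachable. While this returns 503, Kubernetes keeps
    the pod out of Service endpoints; it does not restart the pod
    (restarts are the liveness probe's job).
    """
    try:
        check_db()
        return 200
    except Exception:
        return 503

def db_down():
    raise ConnectionError("database not reachable")

ready = readiness_status(lambda: None)  # dependency reachable
not_ready = readiness_status(db_down)   # dependency down
```

Keeping the status logic separate from the HTTP framework makes the readiness rules unit-testable without a cluster.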
Follow-up: What's the best practice for handling database migrations in Kubernetes? Should they run in a Job or in init containers?