MongoDB Interview Questions

Connection Pooling and Driver Internals


Your Node.js application creates a new MongoDB client connection for each request: `const client = new MongoClient(uri); const result = await client.db("mydb").collection("items").find().toArray(); await client.close();`. This works, but as load increases to 1000 req/sec you see "ECONNREFUSED" errors, and MongoDB logs show "too many open connections". After checking, you have 10,000 TCP connections open. Your MongoDB server has `maxIncomingConnections: 65536` (plenty). Why are connections being exhausted?

Creating a new client per request is a connection-pooling anti-pattern. Each MongoClient maintains its own internal connection pool. When you create 1000 clients for 1000 concurrent requests, you also create 1000 separate pools. Even if each pool holds a minimum of 1 connection, that's 1000 connections of pure pooling overhead. Real queries add more—the total reaches 10K connections.

The OS also imposes limits: the per-process file descriptor limit (`ulimit -n`) on Linux defaults to 1024. Even if MongoDB allows 65K connections, your application process can only open 1024 file descriptors—and each TCP connection consumes one.

Fix: Use connection pooling correctly. Create ONE MongoClient at application startup and reuse it for all requests: `const client = new MongoClient(uri, {maxPoolSize: 100}); await client.connect(); // reuse for all requests`

Configuration: `maxPoolSize` (default 100) caps connections per pool. With 1000 concurrent requests and a pool of 100, the driver queues checkout requests, and connections are reused as queries complete. A reasonable rule of thumb: `maxPoolSize = (total_concurrent_requests / num_application_instances) * 1.5`. With 1000 concurrent requests across 10 app instances, that's 100 per instance; set maxPoolSize to 150 for headroom.
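The sizing rule above can be checked with a small helper (the function name and headroom factor are illustrative, not part of any driver API):

```javascript
// Rule of thumb from above:
// maxPoolSize = (total concurrent requests / app instances) * headroom
function poolSizePerInstance(totalConcurrent, numInstances, headroom = 1.5) {
  return Math.ceil((totalConcurrent / numInstances) * headroom);
}

// 1000 concurrent requests across 10 instances -> 150 per instance
console.log(poolSizePerInstance(1000, 10)); // 150
```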

Verify: Monitor connection count with `db.serverStatus().connections`. It should show on the order of 150 connections (not 10K) under load. Check application logs for connection reuse metrics.

Follow-up: If you have 50 application instances and want to support 10K concurrent requests with latency <100ms, how would you size the connection pools?

Your MongoDB driver (Node.js) has `maxPoolSize: 100` configured. Under load, you see slow queries (2-3 seconds) even though individual query explain shows <100ms. Application logs show "connection pool exhausted" messages. You increase `maxPoolSize` to 500 but see diminishing returns: latency drops to 2 seconds, then plateaus. Increasing further to 1000 doesn't help. Why doesn't increasing pool size improve latency?

Pool exhaustion is a symptom, not the root cause. If increasing pool size doesn't improve latency, the bottleneck has shifted. Common causes: (1) MongoDB server is CPU-saturated: even with plenty of connections, the server can only process a fixed number of queries per second; more connections don't help if it is already maxed out. Check `db.serverStatus().opcounters` to see queries/sec; (2) Lock contention: connections are available but queries compete for document locks, so a bigger pool just queues more waiting requests; (3) Slow queries are hogging the pool: if some queries take 10 seconds (with a 30-second timeout), each holds a connection for 10 seconds. With a pool of 1000, 100 slow queries can hold connections simultaneously; (4) Network latency: at a 100ms round-trip to MongoDB, each connection can complete at most ~10 queries/sec, so even 1000 connections cap out around 10K queries/sec total.
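Point (4) is worth making concrete: when every query costs one network round-trip, a single connection completes at most 1/RTT queries per second, so total throughput is capped regardless of pool size. A quick sanity check (illustrative helper):

```javascript
// Throughput ceiling for a pool where each query costs one
// round-trip of rttMs milliseconds on its connection.
function maxQueriesPerSecond(connections, rttMs) {
  return connections * (1000 / rttMs);
}

// 1000 connections at 100ms round-trip -> 10,000 queries/sec ceiling
console.log(maxQueriesPerSecond(1000, 100)); // 10000
```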

Investigation: (1) Check `db.currentOp()` to see active operations—are they actually running or waiting? If waiting, that's lock contention; (2) Run `db.serverStatus()` to see CPU usage, operation latencies; (3) Profile slow queries: `db.setProfilingLevel(1, {slowms: 100})` to log all queries >100ms, then analyze; (4) Monitor network latency from app to MongoDB.

Fix depends on the bottleneck: CPU saturation -> scale out with sharding; lock contention -> optimize queries/indexes; network latency -> move the application closer to the database or reduce round-trips per operation.

Follow-up: Design a system to automatically detect connection pool exhaustion and alert/scale before it impacts users.

Your MongoDB client uses connection pooling with `minPoolSize: 10` and `maxPoolSize: 100`. During off-peak hours (10 PM), active connections drop to 2 (most are idle). During peak hours (10 AM), connections jump to 95. You're billed per connection hour by MongoDB Cloud Atlas. The 85 idle connections at night waste ~$500/month. How would you reduce idle connection cost while maintaining performance?

Idle connections in a pool cost money (each connection holds memory on the server even when unused). Solutions: (1) Lower `minPoolSize`: set `minPoolSize: 1` to cut the baseline from 10 connections to 1 during off-peak. At startup the pool maintains only 1 connection; when traffic spikes, it grows to max. Trade-off: the first request after an idle period may need to create a new connection (slight latency spike on that request); (2) Connection timeout: set `maxIdleTimeMS: 60000` so a connection expires after 60 seconds of idleness. The pool automatically closes idle connections after the timeout; when traffic returns, new connections are created as needed; (3) Use serverless/on-demand pricing: AWS Lambda or serverless database options bill per request, not per connection. If switching is possible, this can be more cost-efficient for variable load; (4) Scale down at night: orchestration (Kubernetes, Terraform) can reduce replicas during off-peak and scale back up at peak (requires infrastructure investment).

Recommended: Combine the minPoolSize reduction with the idle timeout. Set `minPoolSize: 1, maxPoolSize: 100, maxIdleTimeMS: 30000`. The pool maintains 1 connection at baseline, grows on demand to 100 during peak, and closes idle connections after 30s. Cost reduction: 85 idle connections * $5/month per connection (Atlas pricing) = $425/month savings.
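As a client-construction sketch (assumes the official `mongodb` Node.js driver; the option names match its documented pool settings, but treat the values as starting points, not a definitive configuration):

```javascript
const { MongoClient } = require("mongodb");

const uri = process.env.MONGODB_URI;

// Keep a minimal baseline, grow on demand during peak, and reap
// connections idle for more than 30s so off-peak count stays low.
const client = new MongoClient(uri, {
  minPoolSize: 1,       // baseline during off-peak
  maxPoolSize: 100,     // ceiling during peak
  maxIdleTimeMS: 30000, // close connections idle for >30s
});
```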

Verify: Monitor `db.serverStatus().connections.current` and `connections.idle` during off-peak and peak. Confirm `current` drops toward the configured minimum (1-2 connections) during off-peak after the timeout expires.

Follow-up: Design a cost optimization system for MongoDB Cloud Atlas that automatically adjusts pool size based on load and time-of-day patterns.

Your application uses async/await with MongoDB driver. You write: `const result = await collection.find().toArray();` which fetches all results at once. For a query returning 1M documents (~1GB), this causes: (1) Entire result loaded into application memory; (2) Long GC pauses (stop-the-world garbage collection for 1GB heap growth); (3) Long latency (waits for all 1M docs to be serialized/transmitted over network). How would you optimize large result sets?

Fetching 1M documents at once is inefficient. Cursor streaming is the answer: `await collection.find().forEach(doc => process(doc))` processes docs as they arrive rather than loading them all into memory. Alternatively, iterate the cursor with a tuned batch size: `for await (const doc of collection.find().batchSize(10000)) { process(doc); }`

Memory efficiency: streaming keeps at most one batch of documents in memory at a time (set via `batchSize`) instead of all 1M. With a batch size of 10K and ~1KB documents (1GB / 1M docs), memory usage is ~10MB per batch, not 1GB total.
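The constant-memory behavior can be demonstrated without a database at all; this sketch fakes a batched cursor with a plain async generator (simulated data source, not the MongoDB driver API):

```javascript
// Simulated cursor: materializes one batch at a time, so at most
// batchSize documents are resident in memory, like driver streaming.
async function* simulatedCursor(totalDocs, batchSize) {
  for (let start = 0; start < totalDocs; start += batchSize) {
    const batch = [];
    for (let i = start; i < Math.min(start + batchSize, totalDocs); i++) {
      batch.push({ _id: i });
    }
    yield* batch; // hand out one doc at a time from the current batch
  }
}

// Process 100K docs while only ever holding one 1K-doc batch.
async function countDocs() {
  let count = 0;
  for await (const doc of simulatedCursor(100000, 1000)) {
    if (doc._id >= 0) count++; // stand-in for real per-doc processing
  }
  return count;
}
```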

Network efficiency: documents arrive in TCP packets. Streaming allows application to start processing while subsequent docs are still in-flight over network. Total latency = time to receive first batch + processing time, not time to receive all docs first.

Database efficiency: instead of materializing all 1M docs on the server and sending them at once, the server returns results in batches (each fetched with a getMore). This prevents the server from buffering the entire result set and lets the connection return to the pool between batches.

Example (Node.js): `const cursor = collection.find(); while (await cursor.hasNext()) { const doc = await cursor.next(); process(doc); }` or `cursor.stream().on("data", doc => process(doc)).on("end", () => done());`. This streams 1M docs in constant memory.

Alternative: paginate instead of streaming: `find().limit(1000).skip(0)`, then `skip(1000)` for the next page, and so on. Less efficient than streaming (the server re-traverses skipped results on every page) but simpler to implement, and it supports a "Page 5 of 100" UX.
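The skip/limit arithmetic for that pagination style, as a small helper (the function name is illustrative):

```javascript
// skip/limit values for a 1-indexed page number, as used in
// find().limit(pageSize).skip(offset) pagination.
function pageParams(page, pageSize) {
  return { limit: pageSize, skip: (page - 1) * pageSize };
}

// Page 5 at 1000 docs/page -> find().limit(1000).skip(4000)
console.log(pageParams(5, 1000)); // { limit: 1000, skip: 4000 }
```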

Follow-up: Design a cursor management system for large result sets that handles network interruptions and allows resume-from-position.

You implement a health check for your MongoDB connection pool: every 60 seconds, you run a simple query to confirm the connection is alive. However, you notice that during the health check, legitimate client queries see increased latency (20% slower). The health check query is `db.adminCommand({ping: 1})`, which should be near-instant. Why does a concurrent health check impact other queries?

A connection pool shares a fixed set of connections among all queries; each in-flight operation checks out one connection exclusively. If you have 100 connections in the pool and 100 queries running, every connection is busy. Health checks, even fast ones, compete for those same connections: each check borrows one from the pool, briefly shrinking what's available to application queries. Under high load, health checks cause "connection starvation"—legitimate queries wait for connections.

Additionally, health checks add overhead: (1) Thread/task scheduling: the health check task needs CPU cycles; (2) Network round-trip: even a ping costs ~1ms of network latency; (3) Lock contention on the connection pool: acquiring a connection requires a (brief) lock, and health checks add to that contention.

Solutions: (1) Use connection monitoring instead of polling: MongoDB driver has built-in connection monitoring (events). Monitor "connectionClosed" events instead of active polling; (2) Run health checks on dedicated connection: create a separate connection pool (size 1) for health checks, don't share with application queries. Health check uses dedicated connection, doesn't compete; (3) Reduce health check frequency: instead of every 60s, run every 300s or on-demand when pool is idle; (4) Batch health checks with application queries: if you're running queries anyway, don't need separate health check (queries prove connection is alive); (5) Use server-side monitoring: instead of client pinging server, have MongoDB server monitor itself and expose metrics that client periodically polls.

Recommended: Use driver's built-in connection monitoring. Most MongoDB drivers (Node.js, Python, Java) emit events like "connectionCheckedOut" and "connectionCheckedIn"—monitor these instead of active health checks.
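A passive-monitoring sketch: a counter you would wire to the driver's `connectionCheckedOut` / `connectionCheckedIn` events (those event names exist in the Node.js driver; the class, threshold, and alert logic here are illustrative):

```javascript
// Tracks connections currently checked out and flags pool pressure
// when utilization crosses a threshold; no ping queries needed.
class PoolMonitor {
  constructor(maxPoolSize, alertThreshold = 0.9) {
    this.maxPoolSize = maxPoolSize;
    this.alertThreshold = alertThreshold;
    this.checkedOut = 0;
  }
  onCheckedOut() { this.checkedOut++; }
  onCheckedIn() { this.checkedOut--; }
  underPressure() {
    return this.checkedOut / this.maxPoolSize >= this.alertThreshold;
  }
}

// Wiring sketch:
//   client.on("connectionCheckedOut", () => monitor.onCheckedOut());
//   client.on("connectionCheckedIn", () => monitor.onCheckedIn());
```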

Follow-up: Design a health checking system for a connection pool that detects server unavailability within 5 seconds without impacting application performance.

Your MongoDB replica set has 3 nodes. Your driver is configured for `readPreference: "secondary"`. However, you notice that about 2% of queries are routed to PRIMARY even though secondary is available. Meanwhile, application sees occasional "no suitable servers" errors when all secondaries are temporarily down (e.g., during maintenance). How does driver server selection work, and why aren't secondaries always used?

The driver uses a server selection algorithm: (1) Discover servers by querying the topology (via the `hello` command, formerly `isMaster`); (2) For readPreference "secondary", filter to only SECONDARY nodes; (3) If no secondaries are available, some drivers fall back to PRIMARY; (4) If no suitable servers exist, throw a "no suitable servers" error after a timeout (e.g., 30s).
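The filtering step can be sketched as plain code (server type strings modeled loosely on driver topology descriptions; this ignores the latency window and load balancing a real driver applies):

```javascript
// Pick a server for a "secondary" read from the known topology.
// strict=false approximates secondaryPreferred: fall back to primary.
function selectForSecondaryRead(servers, strict = true) {
  const secondaries = servers.filter(s => s.type === "RSSecondary");
  if (secondaries.length > 0) return secondaries[0];
  if (!strict) return servers.find(s => s.type === "RSPrimary") || null;
  return null; // caller retries until serverSelectionTimeoutMS expires
}
```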

2% of queries hitting PRIMARY likely because: (1) Secondary staleness: if a secondary lags too far behind, the driver can exclude it (controlled by the read preference's `maxStalenessSeconds` option); (2) Transient topology changes: during a replica set reconfig or network hiccup, the driver temporarily marks a secondary as UNKNOWN and falls back to PRIMARY; (3) Initial discovery: on first connect, the driver may not know about the secondaries yet, sends to PRIMARY, then discovers the secondaries on the next `hello` refresh.

"No suitable servers" during maintenance: when you restart a secondary, it becomes UNKNOWN from the driver's perspective. The driver temporarily sees 1 PRIMARY + 1 UNKNOWN + 1 RECOVERING (the node starting up). With a strict `readPreference: "secondary"`, no SECONDARY is eligible, so the driver throws the error. This is by design (it prevents reads from nodes that may be stale or not yet ready).

Mitigation: (1) Use `readPreference: "secondaryPreferred"` instead (tries a secondary, falls back to PRIMARY if none is available)—more forgiving; (2) Set `serverSelectionTimeoutMS: 5000` and `connectTimeoutMS: 3000` to fail faster when no server is initially available; (3) Implement retry logic in the application: on "no suitable servers", retry after 1 second (the secondary might be back online); (4) Monitor driver logs to see which server is selected for each query and identify patterns.

Follow-up: Design a read preference strategy for a 3-region replica set (primary in us-east, secondaries in us-west and eu-west) that minimizes latency while maintaining consistency.

You configure your MongoDB driver with `serverSelectionTimeoutMS: 30000` (wait up to 30 seconds for a suitable server). Under normal operation, query latency is 50ms. However, when MongoDB is briefly unavailable (network hiccup, 5-second outage), queries hang for 30 seconds before failing. Users experience 30-second timeout instead of immediate failure. How would you design fast failover?

`serverSelectionTimeoutMS` bounds how long each operation waits for a suitable server: if none is found within 30s, the operation fails. During an outage, the driver waits the full 30s before declaring servers unreachable, which creates a slow user experience.

Better design: (1) Use shorter timeout + retry logic: `serverSelectionTimeoutMS: 5000` with exponential backoff retry in application code. If first query fails after 5s, retry immediately (might succeed if server recovered). If second also fails, wait 1s then retry: `setTimeout(() => retry(), 1000)`; (2) Use connection pool health checks to proactively detect failures: configure `heartbeatFrequencyMS: 5000` (check server every 5s). If heartbeat fails, driver immediately marks server as UNKNOWN (doesn't wait for full timeout); (3) Implement circuit breaker pattern: track query failures. After 5 consecutive failures, assume server is down and fail fast without attempting (until recovery detected); (4) Use monitoring webhooks: if you control the MongoDB deployment, emit alerts when server goes down. Application can proactively fail over based on alert, not waiting for timeout.
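Point (3)'s circuit breaker can be sketched as a small state machine (the threshold and cool-down values are illustrative):

```javascript
// Fails fast after `threshold` consecutive failures, then allows a
// probe attempt once `coolDownMs` has elapsed (half-open state).
class CircuitBreaker {
  constructor(threshold = 5, coolDownMs = 1000) {
    this.threshold = threshold;
    this.coolDownMs = coolDownMs;
    this.failures = 0;
    this.openedAt = null; // null means circuit is closed
  }
  canAttempt(now = Date.now()) {
    if (this.openedAt === null) return true;
    return now - this.openedAt >= this.coolDownMs; // half-open probe
  }
  recordSuccess() { this.failures = 0; this.openedAt = null; }
  recordFailure(now = Date.now()) {
    this.failures++;
    if (this.failures >= this.threshold) this.openedAt = now;
  }
}
```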

Recommended: a short serverSelectionTimeoutMS (5s) plus retry with backoff in the application. Code: `async function queryWithRetry() { for (let attempt = 0; attempt < 3; attempt++) { try { return await collection.find().toArray(); } catch (err) { if (attempt < 2) await new Promise(r => setTimeout(r, 2 ** attempt * 1000)); else throw err; } } }`

Follow-up: Design a MongoDB client library that automatically detects server failures and fails fast within 1 second for <95% of queries during outage.

Your application uses connection pooling with TLS/SSL to MongoDB. You set `minPoolSize: 10`, but on startup, the application takes 15 seconds to fully initialize (before handling requests). You measure and find the TLS handshake for each connection takes ~1.5 seconds. With 10 connections * 1.5s = 15s initialization delay. Your customers expect <2 second startup. How would you speed up connection pool initialization?

The TLS handshake for each connection is expensive (cryptographic operations). With 10 sequential connections at 1.5s each, startup takes 15s total. Solutions: (1) Parallelize handshakes: instead of creating connections sequentially, create all 10 in parallel using async operations. With parallel handshakes, total time = max(1.5s) = 1.5s (instead of 15s); (2) Lazy initialization: set `minPoolSize: 0` and only create connections on demand. The first request triggers connection creation (a latency spike), but startup is <100ms. Trade-off: the first queries are slower; (3) Pre-warm the pool in the background: start with `minPoolSize: 0`, serve requests immediately, and create connections asynchronously after the app is ready. By the time real traffic arrives, the pool is populated; (4) Reduce `minPoolSize`: set `minPoolSize: 2` instead of 10. Startup takes 3 seconds (2 connections), and the pool grows to 10 on demand as traffic arrives; (5) Connection reuse from a previous run: with a persistent volume (Docker, Kubernetes) you could try to save pool state, but TLS requires a fresh handshake per connection, so the benefit is limited.

Recommended: Parallelize + lazy initialization. Set `minPoolSize: 0, maxPoolSize: 100`. The app starts in ~100ms without connections. The first request may wait 1.5s for connection creation (TLS handshake); subsequent requests use the pool. Meanwhile, a background task creates more connections in parallel, so after the first few requests the pool is fully seeded.
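A sketch of the parallel warm-up idea, with the TLS handshake simulated by a timer (`createConnection` is a stand-in, not a driver call):

```javascript
// Simulated connection: each "handshake" takes handshakeMs to finish.
function createConnection(id, handshakeMs) {
  return new Promise(resolve =>
    setTimeout(() => resolve({ id, open: true }), handshakeMs));
}

// Open all pool connections concurrently: wall-clock cost is roughly
// one handshake instead of n sequential handshakes.
async function warmPool(n, handshakeMs) {
  return Promise.all(
    Array.from({ length: n }, (_, i) => createConnection(i, handshakeMs))
  );
}
```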

Follow-up: Design a connection pooling strategy for a serverless/FaaS environment where startup time is critical and connections can't be kept warm between invocations.
