You run redis-cli --bigkeys and discover a key "user:profile:12345" is 500MB. The value is a hash with 10M fields, each field ~50 bytes. Operations on this key dominate the SLOWLOG. You need to split it urgently, but it's actively used—splitting without downtime is critical. How do you refactor?
A 500MB hash is a severe problem: operations like HGETALL and HKEYS block Redis for seconds. Refactor strategy: (1) zero-downtime migration: (a) create new sharded keys: user:profile:12345:shard:0, user:profile:12345:shard:1, ... (10 shards * 1M fields each = ~50MB per shard, much smaller). Pick the shard deterministically, e.g. hash(field) % 10, so readers and writers agree. (b) dual-write: the app writes to both the old key and the new shards simultaneously (for new data). (c) backfill: migrate existing data from the old key to the shards using HSCAN + HSET. (d) hybrid reads: the app reads from the new shards first (fast), falling back to the old key on a miss (covers not-yet-migrated data). (e) cleanup: once all data is migrated and nothing writes to the old key, delete it—prefer UNLINK over DEL so freeing 500MB happens in a background thread. (2) batch migration: use HSCAN to iterate the old key without blocking. HSCAN user:profile:12345 0 COUNT 1000 returns roughly 1000 fields at a time (COUNT is a hint, not a guarantee). For each batch, write every field to its shard with HSET (HMSET is deprecated).
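The backfill step (1c) can be sketched as follows. This is a minimal sketch, assuming a redis-py client and the shard-key scheme above; `OLD_KEY`, `NUM_SHARDS`, and the crc32 shard choice are illustrative choices, not fixed requirements.

```python
import zlib

OLD_KEY = "user:profile:12345"   # hypothetical key from the scenario
NUM_SHARDS = 10

def shard_key(field: bytes) -> str:
    """Deterministic shard for a field (crc32 is stable across processes,
    unlike Python's built-in hash())."""
    return f"{OLD_KEY}:shard:{zlib.crc32(field) % NUM_SHARDS}"

def backfill(r, batch=1000):
    """Copy every field of the old hash into its shard, batch by batch.

    `r` is a connected redis-py client; HSCAN yields small batches and
    never blocks the server the way HGETALL on a 10M-field hash would.
    """
    cursor = 0
    while True:
        cursor, fields = r.hscan(OLD_KEY, cursor, count=batch)
        by_shard = {}
        for field, value in fields.items():
            by_shard.setdefault(shard_key(field), {})[field] = value
        pipe = r.pipeline()                      # one round trip per batch
        for key, mapping in by_shard.items():
            pipe.hset(key, mapping=mapping)
        pipe.execute()
        if cursor == 0:                          # scan complete
            break
```

Because dual-writes are already running, re-copying a field the app wrote concurrently is harmless: HSET is idempotent per field, and the app's write wins on the next pass.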
Follow-up: If the 500MB key is frequently modified (thousands of writes/sec) during migration, how would you handle write conflicts?
redis-cli --bigkeys reports 20 keys in the 100-200MB range (sets with millions of members). These are causing memory issues and slow SMEMBERS/SUNION operations. You can't feasibly shard millions of set members (complexity). What's the best remediation?
Large sets (100-200MB with millions of members) are problematic if you perform unions/intersections: SUNION is O(N) in the total number of elements, and SINTER is O(N*M) worst case, where N is the smallest set's cardinality and M is the number of sets. Options: (1) use Redis Cluster with sharding: shard the set data across multiple nodes; each node holds a subset. SUNION must then query all nodes and merge client-side (more complex). (2) accept the size but optimize operations: (a) avoid SMEMBERS (returns the entire set in one reply); use SSCAN to iterate in batches instead. (b) use SCARD to get the size without loading members. (c) use SISMEMBER for membership checks (O(1), no need to fetch the set). (d) avoid SUNION/SINTER if possible. If you need set operations, pre-compute and store the results (e.g., a SUNIONSTORE result refreshed daily). (3) change the data structure: if the set exists only for membership checks (e.g., "is user X in this group"), a Bloom filter (RedisBloom module) answers the same question in a fraction of the memory, at the cost of a small false-positive rate. (4) compress members: if members are long strings, store a fixed-size hash of each member instead of the full value (sacrifices certainty on hash collisions, can cut memory ~10x). (5) archive old data: if the set has stale members, SREM them or archive to external storage (S3). Prevention: (1) measure set sizes periodically: SCARD each large set and alert if > 10M. (2) monitor operations: log SMEMBERS, SUNION, SINTER calls and alert when they target large sets. (3) redesign: evaluate whether a set is the right structure; alternatives include Streams for versioned membership and HyperLogLog when you only need approximate cardinality. For immediate relief: (1) you already have the 100-200MB candidates from --bigkeys; confirm each with MEMORY USAGE keyname. (2) evaluate usage: are these sets read-heavy or write-heavy? (3) archive: if unused, UNLINK to free the memory without blocking. (4) compress: if members are IDs, encode them as integers instead of strings (e.g., 12345 vs "user:12345")—shorter members mean less per-member overhead.
Follow-up: If you archive large sets to S3, how would you keep a smaller cache of hot members in Redis?
Your application creates temporary keys (session data, caching) with a TTL (e.g., SETEX key 3600 value). After a year you have accumulated 10M keys whose TTLs have passed but which still occupy memory—Redis's expiration hasn't reclaimed them yet. Memory is bloated with ghost data. How do you clean up without downtime?
Redis expires keys two ways: lazily (an expired key is deleted when something next touches it) and actively (a background cycle, driven by the hz setting, runs ~10 times per second by default, samples volatile keys, and deletes the expired ones it finds). The cycle backs off when its samples contain few expired keys, so rarely-accessed expired keys can linger—that's the problem here. Note the TTL semantics: TTL returns remaining seconds for a live key, -1 for a key with no expiry set, and -2 for a key that no longer exists. To clean up: (1) make active expiration more aggressive: CONFIG SET hz 50 runs the expiration cycle more often (default is 10; this costs CPU). (2) SCAN and touch: iterate the keyspace with SCAN and issue TTL (or EXISTS) on each key—the access itself triggers lazy deletion of anything already expired. Avoid KEYS, and avoid doing this inside a Lua script: both block Redis for the whole pass, while SCAN works in small non-blocking batches. (3) scope by pattern: if all temp keys match "sess:*", use SCAN 0 MATCH sess:* COUNT 1000 and touch only those. (4) archive before expiry: for critical data, archive to S3 before the TTL fires, then delete from Redis. (5) shorten TTLs: if keys are supposed to last 3600s but stop being useful sooner, use a shorter TTL (e.g., 600s) so less garbage accumulates. Prevention: (1) monitor: INFO stats reports expired_keys (total reaped so far), and INFO keyspace shows per-DB keys vs keys-with-expiry; alert if the gap between them grows. (2) set TTLs that match the actual cache lifetime rather than arbitrary long values. (3) watch memory pressure: a rising evicted_keys in INFO stats means maxmemory eviction is kicking in—a sign cleanup isn't keeping up. MEMORY DOCTOR can also flag memory anomalies.
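The SCAN-and-touch sweep in (2)-(3) can be sketched as below. A minimal sketch assuming a redis-py client; the "sess:*" pattern and batch size are illustrative.

```python
def sweep_expired(r, pattern="sess:*", batch=500):
    """Walk matching keys with SCAN and touch each one with TTL.

    The access itself triggers Redis's lazy expiration, so keys whose
    expiry has already passed are deleted as a side effect. TTL
    returning -2 means the key is gone (in this sweep, typically just
    reaped). SCAN yields small batches and never blocks the server the
    way KEYS would; `r` is assumed to be a redis-py client.
    """
    reaped = 0
    for key in r.scan_iter(match=pattern, count=batch):
        if r.ttl(key) == -2:          # key vanished on access: expired
            reaped += 1
    return reaped
```

Run it during off-peak hours and optionally sleep between batches; the work is spread across many tiny commands, so Redis stays responsive throughout.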
Follow-up: If you need to clean up expired keys faster, what's the trade-off with Redis CPU usage?
You detect a 50GB sorted set key storing leaderboard data (scores for millions of players). Operations like ZRANGE, ZREVRANGE are causing 500ms+ latency. This is a fundamental design issue: a single sorted set with millions of members isn't scalable. How do you refactor without losing rank data?
A 50GB sorted set (millions of members) is problematic because ZRANGE is O(log(N) + M) for M returned elements: fetching large ranges from a huge zset is what hurts, while ZRANK itself stays a cheap O(log N). Refactor options: (1) sharded leaderboards: instead of one global zset, shard by game, region, or time. Each shard holds <100K members, so ZRANGE on a shard is fast. Global rank: approximate it, or compute it by summing per-shard ranks. (2) use Redis Cluster: shard across multiple nodes; each node holds a subset of players. ZRANGE on a node is fast, but global ranking requires aggregation across nodes. (3) hierarchical ranks: keep the frequently-accessed global top 100 in its own small zset; the rest live in regional/game-specific sorted sets and are paginated. (4) time-based: if the leaderboard resets daily or weekly, store only the current window (much smaller) and archive history to S3. (5) approximate ranking: an exact global rank needs ZRANK; a cheap estimate is the player's shard rank scaled by the ZCARD ratio between that shard and the total. (6) pagination: never ZRANGE 0 -1 (everything); return ZRANGE 0 99 (top 100), then paginate: ZRANGE 100 199, 200 299, etc. The client loads pages on demand. Recommendation for your case: (1) test current latency: measure ZRANGE 0 99 (top 100) vs ZRANGE 0 -1 (all). If top-100 is <10ms, pagination alone may be enough. (2) implement sharding by game or region, each shard its own zset; global rank = sum of per-shard ranks (approximate). (3) update app code: instead of redis.zrange(leaderboard, 0, 99), fetch the top of each shard—redis.zrange(leaderboard:shard:1, 0, 99), redis.zrange(leaderboard:shard:2, 0, 99), ...—and merge-sort client-side. Implementation: (1) create the sharded keys during the next leaderboard update. (2) backfill with ZSCAN leaderboard 0 COUNT 1000, writing each member to its shard with ZADD (shard chosen by the same rule the app uses, e.g. hash(member) % shards). (3) verify: the sum of ZCARD across shards equals the original ZCARD. (4) test: measure latency before/after sharding. Expected: 10-100x faster range queries.
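The merge-sort read in step (3) can be sketched as follows. A minimal sketch assuming a redis-py client and the `leaderboard:shard:<i>` naming above (an assumption of this example, not a fixed convention).

```python
import heapq

def top_n(r, base="leaderboard", shards=10, n=100):
    """Global top-N from sharded zsets.

    Each shard's top-N is fetched with ZREVRANGE ... WITHSCORES, which
    is O(log S + n) per shard, then merged client-side. The merge is
    exact for the top N: any globally top-N member is necessarily in
    its own shard's top N. `r` is assumed to be a redis-py client.
    """
    pipe = r.pipeline()                  # one round trip for all shards
    for i in range(shards):
        pipe.zrevrange(f"{base}:shard:{i}", 0, n - 1, withscores=True)
    merged = [mv for batch in pipe.execute() for mv in batch]
    return heapq.nlargest(n, merged, key=lambda mv: mv[1])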
Follow-up: If you shard leaderboards and a player can be in multiple games, how would you store and query across shards?
redis-cli --bigkeys finds a 1GB string key storing serialized JSON (entire database snapshot). This is anti-pattern (Redis isn't a DB backup). But the app relies on it (slow queries on primary DB, so snapshot is cached in Redis). How do you eliminate this without breaking the app?
A 1GB string is extremely inefficient: (1) any operation on it (GET, SET, serialization) moves the entire value at once, spiking Redis memory and network. (2) no granular access: you must fetch the whole 1GB to read one field. Refactor: (1) decompose into granular keys: instead of one snapshot key, store individual records, e.g. with RedisJSON: JSON.SET record:123 $ '{"name": "Alice", ...}'. The client then fetches only the records it needs. (2) use Redis as a cache, not a DB dump: the app queries the primary DB and caches each result in Redis with a TTL (cache-aside). Avoid caching the entire DB as one blob. (3) use Redis Streams for change capture: instead of rewriting a 1GB snapshot, XADD each change to a stream so consumers replay deltas rather than re-fetching a full snapshot. To avoid breaking the app, migrate with a dual-read: serve from the granular keys when present, fall back to the old snapshot otherwise, and delete the snapshot once the fallback rate reaches zero.
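The cache-aside pattern from (2) can be sketched like this. A minimal sketch: `db_fetch` stands in for your (unspecified) primary-DB query, `record:<id>` is a hypothetical key scheme, and `r` is assumed to be a redis-py client.

```python
import json

def get_record(r, db_fetch, record_id, ttl=300):
    """Cache-aside for a single record.

    Per-record keys with a TTL replace the monolithic 1GB snapshot:
    hot records stay cached, cold ones expire, and a miss only costs
    one DB query for one record instead of a full snapshot rebuild.
    """
    key = f"record:{record_id}"            # hypothetical key scheme
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)          # cache hit: no DB round trip
    row = db_fetch(record_id)              # cache miss: slow path to DB
    r.setex(key, ttl, json.dumps(row))     # cache this record only
    return row
```

The TTL choice is the tuning knob: it bounds staleness relative to the primary DB, whereas the old snapshot was stale for however long it took to regenerate 1GB.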
Follow-up: If the 1GB snapshot is accessed frequently (100 times/sec), how would you serve it efficiently without keeping it in Redis?
You discover a list key with 100M elements (list:session-queue). Each element is a session ID (50 bytes), total size ~5GB. LPUSH/RPOP are fast (O(1)), and LLEN is O(1) too, but LRANGE over large ranges is expensive. The list is used as a queue. RPOP QPS is high (100K QPS) but can't keep up—the list keeps growing. How do you debug and fix?
A 100M-element list = ~5GB. LLEN is O(1), so the symptoms point elsewhere: (1) producer/consumer imbalance: even at 100K RPOP QPS, consumers can't keep up with producers, so the backlog grows. (2) memory pressure: a 5GB list invites eviction or OOM. (3) replication lag: replicas can't keep up with the write volume (slow network). Debug: (1) LLEN to see current queue depth. In steady state it should be bounded; a growing LLEN means producers outpace consumers. (2) INFO commandstats to compare LPUSH vs RPOP call counts and confirm the imbalance (total_commands_processed alone doesn't break down per command). (3) MEMORY USAGE list:session-queue to verify the actual size. If it's well under 5GB, other large keys may be the real memory problem. To fix: (1) add consumers: if a single consumer is the bottleneck, scale out to multiple RPOP clients. (2) use Redis Streams instead: XADD/XREADGROUP with consumer groups handles fan-out, offset tracking, and backpressure better than a raw list. (3) use blocking pops: BRPOP list:session-queue 1 waits for an element instead of busy-looping. (4) implement priority queues: if some sessions are critical, split into multiple lists (queue:high, queue:low) and drain high-priority first. (5) shed load: if sessions older than an hour are stale, archive them to the DB and trim the list—LTRIM on a bounded window is cheap, whereas LREM is O(N) and far too slow on a 100M-element list. Prevention: (1) monitor queue depth: LLEN every 10 seconds; alert if > 1M elements (growing backlog). (2) measure throughput: alert when LPUSH QPS consistently exceeds RPOP QPS. (3) use BRPOP with a timeout so idle consumers don't spin. Implementation: (1) add consumers: spawn N worker threads, each running BRPOP. (2) test: redis-benchmark -t lpush,rpop with simulated traffic; measure queue depth under load. (3) trial Streams: XADD for producers, XREADGROUP for consumers; measure the improvement in throughput and latency.
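The BRPOP worker loop from the implementation steps can be sketched as below. A minimal sketch assuming a redis-py client; the queue name matches the scenario, and `handle` / `max_items` are illustrative parameters.

```python
def drain(r, queue="list:session-queue", handle=lambda sid: None, max_items=None):
    """One consumer loop over the queue.

    BRPOP blocks for up to 1s instead of busy-polling RPOP, so an idle
    worker costs nothing; run N copies of this loop in separate threads
    or processes to scale consumption. In redis-py, brpop returns a
    (key, value) pair, or None when the timeout elapses with the queue
    empty. `max_items` just bounds the loop for testing/batch runs.
    """
    processed = 0
    while max_items is None or processed < max_items:
        item = r.brpop(queue, timeout=1)
        if item is None:
            break                       # queue stayed empty for 1s
        _key, session_id = item
        handle(session_id)
        processed += 1
    return processed
```

Since each BRPOP atomically hands an element to exactly one blocked client, adding workers needs no coordination: Redis itself is the dispatcher.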
Follow-up: If the 100M list contains duplicate session IDs and you need to ensure unique processing, how would you deduplicate without loading entire list?