AWS Interview Questions

CloudFront Cache Behaviors and Origin Shield


Your CloudFront distribution in front of an ALB origin is being hammered despite caching. CloudWatch shows an origin request rate of 5,000 req/sec, while CloudFront receives 50,000 req/sec. Cache hit ratio is 20%, meaning 80% of requests go to the origin. Your origin is throwing 502 errors. Why is caching so ineffective, and how do you fix it?

Low cache hit ratio (20%) despite high traffic suggests: (1) Varying query strings — if your origin returns different content for the same path based on query string parameters (e.g., `?v=123` vs `?v=124`), CloudFront treats them as different cache keys. Check which cache policy the behavior uses: `aws cloudfront get-distribution-config --id DISTRIBUTION_ID | jq '.DistributionConfig.DefaultCacheBehavior.CachePolicyId'`, then inspect it with `aws cloudfront get-cache-policy --id POLICY_ID`. If it includes `QueryStringsConfig: {QueryStringBehavior: "all"}`, narrow it: use a custom cache policy that includes only the query strings that actually change the response. (2) High-cardinality headers in the cache key — if the cache key includes headers such as User-Agent, every distinct value gets its own cache entry for identical content. Restrict headers in the cache key: `HeadersConfig: {HeaderBehavior: "whitelist", Headers: ["Host"]}` (or `"none"`). (3) Cache TTL too low — CloudFront's default TTL is 86400 seconds (1 day), but if your origin sends `Cache-Control: max-age=0` or `no-cache`, CloudFront doesn't cache the object unless MinTTL is above 0. Override via policy TTLs: `aws cloudfront create-cache-policy --cache-policy-config '{...,"DefaultTTL": 3600, "MaxTTL": 86400}'`. (4) `Pragma: no-cache` or past-dated `Expires` headers — the origin may be sending headers that prevent caching; setting `MinTTL` above 0 forces CloudFront to cache anyway. (5) Cookies or user-specific content — if the origin sets cookies per user and cookies are in the cache key, CloudFront caches per cookie value, creating thousands of cache entries for the same content. Whitelist only essential cookies and exclude user-tracking cookies from the cache key. Fix: (1) Create a cache policy with a minimal cache key — `aws cloudfront create-cache-policy --cache-policy-config file://cache-policy.json` with only the required query strings and headers. (2) Set sensible TTLs — `DefaultTTL: 3600, MaxTTL: 86400, MinTTL: 1`. (3) Enable CloudFront compression — smaller objects mean less origin bandwidth per miss. (4) Enable Origin Shield (see below) to further reduce origin requests.
Testing: check the cache hit ratio in the CloudFront console's cache statistics report — it should be > 80% for static content, > 50% for dynamic content with few query params.
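The `cache-policy.json` mentioned above could look like the sketch below. Field names follow the `create-cache-policy` schema; the policy name and the whitelisted query string `v` are illustrative assumptions:

```json
{
  "Name": "minimal-cache-key",
  "Comment": "Cache key = path + one required query string; no headers, no cookies",
  "MinTTL": 1,
  "DefaultTTL": 3600,
  "MaxTTL": 86400,
  "ParametersInCacheKeyAndForwardedToOrigin": {
    "EnableAcceptEncodingGzip": true,
    "EnableAcceptEncodingBrotli": true,
    "HeadersConfig": { "HeaderBehavior": "none" },
    "CookiesConfig": { "CookieBehavior": "none" },
    "QueryStringsConfig": {
      "QueryStringBehavior": "whitelist",
      "QueryStrings": { "Quantity": 1, "Items": ["v"] }
    }
  }
}
```

Everything excluded from the cache key here is also no longer forwarded to the origin by the cache policy; if the origin still needs some of it, forward it separately with an origin request policy.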

Follow-up: You optimize the cache key (exclude query strings, reduce headers, set TTL to 3600). Cache hit ratio improves to 60%. But the origin is still receiving 20,000 req/sec (out of 50,000 total). Why are 40% of requests still going to the origin?

You enable CloudFront Origin Shield to reduce origin load. Origin Shield is an additional cache layer between CloudFront edges and your origin. After enabling, your origin request rate drops from 5000 req/sec to 500 req/sec (90% reduction). But user-facing latency increases by 50ms. Is the trade-off worth it?

Origin Shield adds a network hop (request goes edge → Origin Shield → origin), which adds ~20-50ms latency depending on distance. The benefit is origin protection and an increased cache hit ratio. Decision: (1) If the origin frequently overloads or is hard to scale (e.g., expensive RDS query, external API call), the latency trade-off is worth it. Origin Shield absorbs request bursts, preventing cascading failures. (2) If the origin is cheap to scale (e.g., a stateless app in an ASG), the 50ms adds up for user experience. Quantify: 50ms × 50M requests/month = 2.5 billion milliseconds ≈ 694 cumulative hours of added latency per month — but the per-request view matters more: if your SLA is < 500ms P99, 50ms might push you over. (3) Cost consideration — Origin Shield costs roughly $0.01 per 10,000 requests (it varies by region), and only requests that reach the Shield layer are billed. For 50M requests/month (~1.7M/day), the upper bound is 50,000,000 / 10,000 × $0.01 = $50/month. Small cost. If the origin is already scaled, Origin Shield is not necessary. If the origin is fragile, it's worth $50/month + 50ms latency. To measure impact: (1) Origin Shield is enabled per origin, so A/B it by splitting traffic across a second distribution (or second origin) with Shield enabled, and monitor edge cache hit ratio before/after. (2) Check origin error rate — if it drops, Shield is helping. (3) Measure user latency with Real User Monitoring (RUM) — compare P99 with/without Shield. If P99 increases > 100ms, consider disabling. Most common scenario: enabling Shield for spiky traffic or during deployments. Permanently enable Shield if the origin is at capacity; disable it if the origin scales easily. Note that toggling Shield is a distribution config update that takes minutes to deploy, so an "enable during peak hours" scheme needs scheduled automation and lead time, or you can enable it only for the origins behind critical endpoints.
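This kind of trade-off arithmetic is easy to get wrong, so it's worth making checkable. A few lines, assuming the $0.01 per 10,000 requests price used above (actual per-region pricing differs):

```python
requests_per_month = 50_000_000
added_latency_ms = 50  # assumed per-request Origin Shield hop cost

# Aggregate added latency across all requests — a fleet-wide figure,
# not what any single user experiences.
total_added_ms = requests_per_month * added_latency_ms
total_added_hours = total_added_ms / 1000 / 3600
print(f"cumulative added latency: {total_added_hours:.0f} hours/month")

# Origin Shield cost at the assumed $0.01 per 10,000 requests.
# Only requests that reach the Shield layer are billed, so treating
# every request as billable gives an upper bound.
price_per_10k = 0.01
monthly_cost = requests_per_month / 10_000 * price_per_10k
print(f"Origin Shield cost: ${monthly_cost:.2f}/month (upper bound)")
```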

Follow-up: You enable Origin Shield permanently. Origin request rate stabilizes at 500 req/sec. But every 12 hours, requests spike to 2000 req/sec for 10 minutes, then drop. What's causing this spike?

Your CloudFront distribution serves both static assets (CSS, JS, images) and dynamic HTML pages. You want to cache assets aggressively (long TTL) but not cache HTML (or cache with short TTL to serve new versions). How do you configure separate caching behaviors?

CloudFront supports multiple cache behaviors, each with its own cache policy, matched against URL path patterns. Configuration: (1) Create cache behaviors in the distribution: (a) For `*.css, *.js, *.png, *.jpg` — use a cache policy with TTL 31536000 (1 year), because assets have versioned filenames (e.g., `app.abc123.js` changes on each deploy). (b) For HTML — use a cache policy with TTL 300 (5 minutes), so users get new versions quickly. (c) For `/api/*` — use a cache policy with TTL 0 (no caching), because API responses are dynamic. CloudFront evaluates behaviors in order; first match wins, and anything unmatched falls through to the default behavior (`*`). Configure with priority: (1) API paths (priority 1): `PathPattern: "/api/*"`, TTL: 0. (2) Static assets (priority 2): `PathPattern: "*.css"` (one behavior per extension), TTL: 31536000. (3) Catch-all HTML: the default behavior (`*`), TTL: 300. Example AWS CLI: `aws cloudfront update-distribution --id DISTRIBUTION_ID --distribution-config file://config.json` with a distribution config containing multiple `CacheBehaviors`. (2) For asset versioning, include a version string in the filename — after a deploy, old files (`app.abc.js`) remain in cache while new files (`app.xyz.js`) are fetched from the origin. This allows aggressive caching with instant updates. (3) For HTML, send `Cache-Control: public, max-age=300` from the origin, or override it in the CloudFront cache policy. (4) Use CloudFront invalidation to force-refresh specific paths: `aws cloudfront create-invalidation --distribution-id DISTRIBUTION_ID --paths "/index.html" "/api/*"`. Invalidation takes minutes to propagate, so it's not ideal for immediate updates. Best practice: versioned static assets (1-year cache) + short-TTL HTML (5 min) + no-cache API + invalidation for urgent updates. This balances performance (assets stay in cache) with freshness (HTML updates quickly).
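The behavior ordering above maps onto the distribution config roughly as follows. This is a fragment, not a complete config; the policy IDs and origin IDs are placeholders, and each item needs the remaining required fields of its schema:

```json
{
  "CacheBehaviors": {
    "Quantity": 2,
    "Items": [
      {
        "PathPattern": "/api/*",
        "TargetOriginId": "api-origin",
        "CachePolicyId": "<caching-disabled-policy-id>",
        "ViewerProtocolPolicy": "redirect-to-https"
      },
      {
        "PathPattern": "*.css",
        "TargetOriginId": "app-origin",
        "CachePolicyId": "<one-year-ttl-policy-id>",
        "ViewerProtocolPolicy": "redirect-to-https"
      }
    ]
  },
  "DefaultCacheBehavior": {
    "TargetOriginId": "app-origin",
    "CachePolicyId": "<five-minute-ttl-policy-id>",
    "ViewerProtocolPolicy": "redirect-to-https"
  }
}
```

Note the catch-all is not a `"*"` entry in `Items` — it is the separate `DefaultCacheBehavior`, which CloudFront consults only after every listed pattern has failed to match.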

Follow-up: You configure separate behaviors for static assets (1-year TTL) and HTML (5-min TTL). During deployment, you invalidate `/*` (all paths). Invalidation takes 10 minutes. Users accessing the site during this window still see old HTML and old CSS. Why?

You use CloudFront with a custom origin pointing to your API server (not an S3 bucket or ALB). Requests are taking 500ms P99 latency when API should respond in 100ms. CloudFront cache hit ratio is 90%, so caching is working. Where's the 400ms coming from?

The 400ms latency is likely in the request path: edge → origin (not through cache). Possible causes: (1) Cache misses — 10% of requests hit the origin, and those misses dominate P99. Network latency from a CloudFront edge to your origin (if the origin is far from that edge) can be 100-200ms round-trip, plus 100ms processing = 200-300ms. If your origin is in us-east-1 but the edge is in ap-southeast-1, latency is high. (2) Origin Shield — if enabled, it adds 20-50ms to every origin fetch (and to edge misses that are then served from the Shield cache). (3) Custom origin headers — CloudFront adds headers like `X-Forwarded-For` and `CloudFront-Forwarded-Proto` which your origin processes; if your API parses these inefficiently, it adds latency. (4) Connection setup — a request may need a new connection to the origin, paying TCP handshake + TLS negotiation (~50ms, more over long distances). CloudFront maintains persistent connections to origins, but a short keep-alive timeout on the origin can force frequent reconnects. Debug: (1) Check edge-to-origin latency — compare RTT to the origin from different regions. Use `curl -w @format.txt` to measure time_connect, time_appconnect, time_pretransfer, time_starttransfer. (2) Check whether responses come from cache — look at the `X-Cache` response header: `Hit from cloudfront` vs `Miss from cloudfront` (or `Error from cloudfront`). (3) Toggle Origin Shield and compare — without Shield (direct edge-to-origin) vs with Shield (edge → Shield → origin). If the Shield hop adds little (< 20ms), edge-to-origin is the bottleneck. (4) AWS Global Accelerator in front of the origin (`aws globalaccelerator create-accelerator --name my-ga --ip-address-type IPV4`) is sometimes suggested, but CloudFront already carries edge-to-origin traffic over the AWS backbone, so gains on this path are limited. Most likely: the edge is far from the origin.
Solutions: (1) Move origin closer to edges (e.g., use API Gateway regional endpoints + CloudFront), (2) use Global Accelerator for faster edge-to-origin, (3) optimize origin response time below 100ms.
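The `format.txt` referenced above is just a curl write-out template; a minimal sketch (all variables are standard `curl -w` write-out variables):

```
time_namelookup:    %{time_namelookup}s
time_connect:       %{time_connect}s
time_appconnect:    %{time_appconnect}s
time_pretransfer:   %{time_pretransfer}s
time_starttransfer: %{time_starttransfer}s
time_total:         %{time_total}s
```

Usage (the hostname is a placeholder): `curl -s -o /dev/null -w @format.txt https://origin.example.com/health`. The gap between `time_appconnect` and `time_connect` isolates TLS cost; `time_starttransfer` minus `time_pretransfer` isolates server processing.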

Follow-up: Your origin is in us-east-1, CloudFront edges in multiple regions. You check `curl` timing from an edge in eu-west-1 to origin: time_starttransfer = 150ms (reasonable for transatlantic latency). But P99 latency is 500ms. This suggests some requests are taking 350ms longer than typical. What could cause such variance?

Your CloudFront distribution serves dynamic content (HTML pages with personalized data). You set up two cache behaviors: (1) /public/* with TTL 3600, (2) /private/* with TTL 0 (no caching). A user requests /private/dashboard (should not cache) and gets a response with `Set-Cookie: sessionId=xyz`. The next user requests the same path and receives the first user's session cookie. Why, and how do you fix?

The issue is that CloudFront cached the response including the Set-Cookie header. If the behavior were truly no-cache this couldn't happen, so something is letting it through: (1) The behavior's MinTTL is 0 but DefaultTTL is above 0, and the origin sends no Cache-Control header — CloudFront then caches for DefaultTTL. (2) The cache policy excludes cookies from the cache key, so every user shares one cache entry — and that entry happens to contain the first user's Set-Cookie. (3) The request matched a different behavior than intended (e.g., the catch-all instead of `/private/*`) due to path-pattern ordering. To fix: (1) Attach a cache policy with `MinTTL: 0, DefaultTTL: 0, MaxTTL: 0` (the managed CachingDisabled policy does exactly this) to `/private/*` — with MaxTTL 0, CloudFront never caches, regardless of origin headers. Pair it with an origin request policy that forwards cookies, since the cache policy no longer does. (2) Have the origin send `Cache-Control: private, no-store` on any response that carries Set-Cookie — defense in depth if the CloudFront config ever regresses. (3) If some `/private/*` responses must be cached, put cookies into the cache key via the cache policy's `CookiesConfig` (e.g., `{CookieBehavior: "whitelist", Cookies: ["sessionId"]}`) so each session gets its own cache entry — preventing leakage, but largely defeating caching for per-user content. For dynamic content with sessions, the safest approach is `Cache-Control: private, no-store` from the origin plus `MinTTL: 0, MaxTTL: 0` in CloudFront; this guarantees no caching. If you need to cache some dynamic content, use cache busting (versioned URLs) or tag-based invalidation rather than relying on Set-Cookie logic.
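If you prefer an explicit custom policy over the managed CachingDisabled one, a sketch of a never-cache policy config (the name is illustrative):

```json
{
  "Name": "never-cache-private",
  "Comment": "All TTLs zero: CloudFront forwards every request to the origin",
  "MinTTL": 0,
  "DefaultTTL": 0,
  "MaxTTL": 0,
  "ParametersInCacheKeyAndForwardedToOrigin": {
    "EnableAcceptEncodingGzip": false,
    "EnableAcceptEncodingBrotli": false,
    "HeadersConfig": { "HeaderBehavior": "none" },
    "CookiesConfig": { "CookieBehavior": "none" },
    "QueryStringsConfig": { "QueryStringBehavior": "none" }
  }
}
```

Since nothing is cached, the cache-key settings here don't matter for correctness — but cookies, headers, and query strings the origin needs must now be forwarded via a separate origin request policy.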

Follow-up: You configure the cache behavior for /private/* with MinTTL: 0, MaxTTL: 0. Testing shows responses are no longer cached (good). But you notice CloudFront is sending more requests to origin for /private/*, even though traffic hasn't increased. Why?

You use CloudFront with an origin request policy that passes specific headers (Authorization, User-Agent) to the origin. One of your headers is 100+ KB (a serialized JWT with nested claims). CloudFront returns a 413 Payload Too Large error. What's happening, and how do you fix it?

The 413 error is returned by CloudFront because the request headers exceed CloudFront's limits — total header size is capped on the order of 20 KB, with per-header limits as well. A 100 KB JWT header blows past that, independent of the body (request bodies are far less constrained than headers). To fix: (1) Move the data out of headers into the request body (if using POST) — body limits are much more generous. (2) Store the JWT claims server-side (DynamoDB or ElastiCache) and pass only a short JWT ID or reference; the origin looks up the ID to fetch the full claims. (3) Reduce the JWT payload — strip unnecessary claims before the client sends the token; a token this large usually means claims that belong in a datastore. (4) Handle auth at the edge with Lambda@Edge — verify the token and forward only a compact identity header to the origin (this requires the client-sent token to fit within CloudFront's request limits in the first place, so it complements, rather than replaces, shrinking the token). Most practical: option (2) — instead of passing the full JWT (100 KB), pass only `X-JWT-ID: abc123` (a few bytes). The origin looks up the ID in DynamoDB to fetch the full claims. This also improves caching, because different users requesting the same path can share cache entries.
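A sketch of the reference-passing pattern. An in-memory dict stands in for the DynamoDB table, and the `X-JWT-ID` header name, helper names, and claim contents are illustrative assumptions:

```python
import hashlib
import json

# In-memory stand-in for the server-side claims store (DynamoDB in the text).
claims_store: dict[str, dict] = {}

def store_claims(claims: dict) -> str:
    """Persist the full claims server-side; return a short opaque reference."""
    blob = json.dumps(claims, sort_keys=True).encode()
    jwt_id = hashlib.sha256(blob).hexdigest()[:16]
    claims_store[jwt_id] = claims
    return jwt_id

def resolve_claims(jwt_id: str) -> dict:
    """Origin side: swap the tiny X-JWT-ID header back into full claims."""
    return claims_store[jwt_id]

# Client side: send only the reference instead of a ~100 KB JWT.
ref = store_claims({"sub": "user-1", "roles": ["admin"], "nested": {"a": 1}})
headers = {"X-JWT-ID": ref}  # a few dozen bytes on the wire

# Origin side: reconstruct the claims from the reference.
claims = resolve_claims(headers["X-JWT-ID"])
```

Deriving the ID from a hash of the claims makes repeated stores of identical claims idempotent; with real DynamoDB you would also set a TTL attribute so stale claim records expire.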

Follow-up: You move large JWT data to DynamoDB and pass only JWT-ID header (10 bytes). Requests are now successful. But origin latency increased by 100ms because origin must now query DynamoDB for each request. How do you avoid the DynamoDB lookup overhead?
