Kubernetes Interview Questions

CNI Networking and Overlay Models


Your cluster runs Flannel VXLAN overlay. You're scaling to 500 nodes across AWS regions. Bandwidth costs are spiraling—$12K/month in inter-region traffic. Your network team says "Why are you encapsulating everything?" You realize Flannel encapsulation is adding 50+ bytes to every packet. Can you switch to Calico/BGP routing without rebuilding the cluster?

Yes, but it requires careful planning. Flannel VXLAN vs Calico BGP represent different architectural philosophies: overlay vs underlay. The migration has network implications.

Phase 1: Understand current state
kubectl get daemonset -n kube-system -o wide | grep flannel
kubectl describe daemonset kube-flannel -n kube-system | grep Image
kubectl exec -n kube-system kube-flannel-xxxxx -- ip route show
kubectl exec -n kube-system kube-flannel-xxxxx -- cat /etc/kube-flannel/net-conf.json
Confirm VXLAN is active:
ssh node-1 ip link show | grep vxlan
ssh node-1 bridge fdb show | head -10
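If you're scripting this check across 500 nodes, the classification reduces to a string match on the `ip link` output. A minimal sketch; the function name and the sample output strings are illustrative, and `flannel.1` is Flannel's default VXLAN device name:

```shell
#!/bin/sh
# Classify a node's datapath from `ssh <node> ip link show` output.
# flannel.1 is Flannel's default VXLAN interface; adjust if yours differs.
detect_overlay() {
  case "$1" in
    *flannel.1*|*vxlan*) echo "overlay" ;;
    *) echo "underlay-or-unknown" ;;
  esac
}

# Illustrative sample outputs:
detect_overlay "4: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450"   # overlay
detect_overlay "2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500"        # underlay-or-unknown
```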

Phase 2: Plan Calico migration
Option A: Rolling replacement (preferred for large clusters)
- Install Calico components alongside Flannel (Calico as policy controller, Flannel continues routing)
- Drain nodes one by one, uninstall Flannel, install Calico CNI
- Validate pod-to-pod connectivity after each node

Option B: Create a new cluster, migrate via service mesh (safest but expensive)

Step 1: Install Calico operator and resources in monitoring mode:
kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.28.0/manifests/tigera-operator.yaml
kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.28.0/manifests/custom-resources.yaml

Step 2: Configure Calico to coexist (CNI chaining):
apiVersion: projectcalico.org/v1alpha1
kind: CNIConfiguration
metadata:
  name: default
spec:
  containerRuntime: containerd
  cniSchemaVersion: 1.0
Verify coexistence:
kubectl get daemonset -n calico-system
kubectl get pods -n calico-system -o wide

Step 3: Drain and migrate nodes (one per hour to monitor impact):
kubectl drain node-1 --ignore-daemonsets --delete-emptydir-data
ssh node-1 sudo systemctl stop kubelet
ssh node-1 'sudo rm -rf /var/lib/cni/flannel /etc/cni/net.d/*flannel*'
ssh node-1 sudo systemctl start kubelet
# Wait for Calico to initialize
sleep 30
kubectl describe node node-1 | grep -E 'Ready|network'
Test pod connectivity:
kubectl run test-pod-1 --image=alpine -it --rm -- ping test-pod-2.default.svc.cluster.local

Phase 3: Switch Calico to BGP (underlay) mode
Edit BGP configuration:
kubectl apply -f - <<EOF
apiVersion: projectcalico.org/v3
kind: BGPConfiguration
metadata:
  name: default
spec:
  logSeverityScreen: Info
  nodeToNodeMeshEnabled: false   # example: disable full mesh when peering with routers
  asNumber: 64512                # example ASN; use your own
EOF
Configure BGP peers (your network routers):
kubectl apply -f - <<EOF
apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
  name: rack1-tor
spec:
  peerIP: 10.0.0.1     # example router address; one BGPPeer per upstream router
  asNumber: 64512      # example; must match the router's ASN
EOF

Verify BGP peering:
kubectl exec -n calico-system calico-node-xxxxx -- calicoctl node status
Expected output: "Calico process is running" and "BGP status: up"

Phase 4: Monitor and validate
Compare bandwidth before/after:
ssh node-1 sar -n DEV 1 5 | grep -E 'eth0|vxlan'
# Check packet overhead reduction
ping -c 100 -s 1472 pod-ip   # Max payload before fragmentation at MTU 1500
Expected: inter-node bandwidth drops a few percent for MTU-sized packets (the eliminated 50-byte VXLAN header is ~3% of a 1500-byte frame) and considerably more for small-packet traffic.
Cost savings: a 10% reduction on the $12K/month inter-region spend ≈ $1.2K/month.
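The savings depend heavily on packet-size mix, since the 50-byte header is a fixed cost per packet. A quick sanity check with integer shell arithmetic (the packet sizes are illustrative):

```shell
#!/bin/sh
# Encapsulation overhead as a percentage of on-wire bytes:
# 50 extra bytes per packet, regardless of payload size.
overhead_pct() {
  # $1: inner packet size in bytes
  echo $(( 50 * 100 / ($1 + 50) ))
}

overhead_pct 1450   # MTU-sized packets: 3%
overhead_pct 200    # small RPC-style packets: 20%
```

Small-packet workloads (DNS, gRPC pings, health checks) see the biggest relative win from dropping encapsulation.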

Rollback plan: Keep Flannel manifests in GitOps repo with version tag. If issues occur:
git checkout flannel-v0.21.0
kubectl apply -f flannel-daemonset.yaml
kubectl drain node-1 --ignore-daemonsets
ssh node-1 'sudo systemctl stop kubelet && sudo rm -rf /var/lib/cni/calico && sudo systemctl start kubelet'

Follow-up: BGP requires your infrastructure team to configure the routers. What happens if they refuse? Design a hybrid approach that reduces costs without requiring router changes.

You've just switched from Flannel to Cilium. Pod-to-pod connectivity works fine, but now kube-proxy is gone and some services are broken. Your NodePort services don't respond. What happened and how do you debug?

Cilium replaces kube-proxy entirely, but the replacement isn't automatic. Cilium needs specific configuration to handle LoadBalancer services and NodePorts. If services are broken, you likely have a Cilium service proxy misconfiguration or a mismatch between Cilium's eBPF and your service topology.

Debug flow:

1. Verify Cilium replaced kube-proxy:
kubectl get pods -n kube-system -l k8s-app=kube-proxy   # Should return nothing
Verify Cilium agents are running:
kubectl get daemonset -n cilium
kubectl get pods -n cilium -o wide

2. Check Cilium configuration for services:
kubectl get configmap cilium-config -n cilium -o yaml | grep -E 'kube-proxy|service-proxy-name|bpf-map-dynamic-size-ratio'
Ensure key configs are set:
bpf-map-dynamic-size-ratio: "0.25"   # eBPF map sizing
enable-host-port: "true"             # For hostPort support
enable-node-port: "true"             # For NodePort services

3. Test NodePort directly:
kubectl get svc | grep NodePort
kubectl exec -it debug-pod -- curl http://node-ip:node-port
# If it fails, the service proxy isn't working

4. Check Cilium service map:
kubectl exec -n cilium cilium-xxxxx -- cilium service list
kubectl exec -n cilium cilium-xxxxx -- cilium service get 1234
If the service isn't listed, it wasn't programmed into eBPF.

5. Inspect eBPF maps directly:
kubectl exec -n cilium cilium-xxxxx -- bpftool map show
kubectl exec -n cilium cilium-xxxxx -- bpftool map dump name cilium_lb_services_v4 | head -20
Verify your service IP is in the map.

6. Check Cilium logs for errors:
kubectl logs -n cilium -l k8s-app=cilium --tail=100 | grep -i 'service\|lb\|proxy'
Common error: "failed to create service entry" or "eBPF map full"

Fix: If eBPF map is full, increase map sizes
kubectl set env daemonset/cilium -n cilium BPF_MAP_DYNAMIC_SIZE_RATIO=0.5
kubectl rollout restart daemonset/cilium -n cilium
sleep 30

Restart Cilium agents:
kubectl rollout restart daemonset/cilium -n cilium
# Monitor for completion
kubectl rollout status daemonset/cilium -n cilium

7. Validate NodePort again:
kubectl exec -it debug-pod -- curl http://node-ip:node-port -v
# Should succeed now

Prevention: When migrating from kube-proxy to Cilium, always:
1. Run a connectivity check first: cilium connectivity test
2. Enable Cilium's monitoring: hubble observe --verdict DROPPED
3. Test all service types (ClusterIP, NodePort, LoadBalancer) in audit mode before cutover
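Point 3 can be scripted so the exact same probes run before and after cutover. A sketch; the labels, URLs, and ports are placeholders for your own services, and the probe command is passed in so you can swap curl for anything else:

```shell
#!/bin/sh
# Run one probe per service type and print "ok <label>" or "FAIL <label>".
# Usage: check_svc <label> <probe command...>
check_svc() {
  label=$1; shift
  if "$@" >/dev/null 2>&1; then
    echo "ok $label"
  else
    echo "FAIL $label"
  fi
}

# Placeholder endpoints; substitute real service DNS names and node IPs:
check_svc clusterip curl -fsS --max-time 3 http://my-svc.default.svc.cluster.local
check_svc nodeport  curl -fsS --max-time 3 http://node-ip:30080
check_svc lb        curl -fsS --max-time 3 http://lb-external-ip
```

Run it from a debug pod inside the cluster and diff the output between the kube-proxy and Cilium configurations.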

Follow-up: How would you handle session affinity (sticky sessions) without kube-proxy? Design a solution that works for gRPC and WebSocket traffic.

You're running Calico on a 100-node cluster. Monitoring shows high CPU on calico-node pods and slow pod startup times (45 seconds vs normal 5 seconds). The calico-node pods are consuming 800m CPU each. What's the bottleneck and how do you investigate?

High CPU in calico-node typically indicates: policy reconciliation storms, BGP churn, or eBPF map contention. Pod startup slowness suggests the CNI plugin is blocking on IP allocation or policy programming.

Debug sequence:

1. Correlate CPU spike with events:
kubectl top pods -n calico-system -l k8s-app=calico-node --containers
kubectl describe pod calico-node-xxxxx -n calico-system | grep -A 10 Events
Check if spikes correlate with pod deployments, node additions, or policy updates.

2. Check BGP stability:
kubectl exec -n calico-system calico-node-xxxxx -- calicoctl node status
# Expected: BGP status up for each peer
# If peers show "down" or change frequently, BGP is thrashing
Monitor BGP peering flaps:
kubectl logs -n calico-system -l k8s-app=calico-node | grep -E 'bgp.*state|Peer.*Up|Peer.*Down' | tail -50
High volume of Up/Down events = peering instability.
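"High volume" is easier to judge as a number per log window. A sketch that counts transitions in log text; the log line format matched here is an assumption about your calico-node output, so adjust the pattern to what your logs actually print:

```shell
#!/bin/sh
# Count BGP peer state transitions in the log text passed as $1.
# More than a handful per minute usually means peering instability.
flap_count() {
  echo "$1" | grep -cE 'Peer.*(Up|Down)'
}

logs="Peer 10.0.0.1 Up
Peer 10.0.0.1 Down
Peer 10.0.0.1 Up"
flap_count "$logs"   # 3
```

Feed it a time-bounded slice, e.g. `flap_count "$(kubectl logs ... --since=1m)"`, and alert above a threshold.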

3. Check policy reconciliation load:
kubectl exec -n calico-system calico-node-xxxxx -- calicoctl get policies | wc -l
# Count number of policies
calicoctl get networkpolicies | wc -l
# Count NetworkPolicies
If you have 1000+ policies, reconciliation becomes expensive.
Profile policy processing:
kubectl logs -n calico-system -l k8s-app=calico-node --tail=500 | grep -E 'Reconcile|ProcessUpdate' | wc -l

4. Monitor IP allocation performance:
kubectl describe daemonset calico-node -n calico-system | grep -A 5 "Limits\|Requests"
# Check memory and CPU limits
Run a deployment spike and measure pod startup time:
# Launch 10 pods and time how long creation takes end-to-end
time for i in $(seq 1 10); do
  kubectl run test-$i --image=alpine --restart=Never --overrides='{"spec":{"terminationGracePeriodSeconds":0}}'
done
kubectl get pods -o wide | grep test- | wc -l

5. Check eBPF map usage:
ssh node-1 sudo bpftool map show | grep -E 'cali|felix'
ssh node-1 sudo bpftool map dump name cali_v4 | wc -l
If maps are at 95%+ capacity, Calico can't program new routes efficiently.
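The 95% threshold is simple arithmetic on the entry count versus `max_entries` from `bpftool map show`; the counts below are illustrative, not from a real node:

```shell
#!/bin/sh
# Fill ratio of an eBPF map, given current entry count and max_entries.
fill_pct() {
  echo $(( $1 * 100 / $2 ))
}

fill_pct 61000 65536   # 93 -> nearing the point where route programming stalls
fill_pct 8000 65536    # 12 -> healthy headroom
```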

Common fixes:

Fix 1: Increase calico-node resource limits
kubectl set resources daemonset calico-node -n calico-system --limits=cpu=1,memory=512Mi --requests=cpu=500m,memory=256Mi

Fix 2: Reduce policy churn by batching updates
kubectl patch configmap calico-config -n calico-system --type merge -p '{"data":{"policy_update_batch_size":"50"}}'

Fix 3: Enable Felix CPU profiling to identify exact hotspots
kubectl set env daemonset/calico-node -n calico-system FELIX_CPUPROFILINGFILE=/tmp/felix.prof
kubectl rollout restart daemonset/calico-node -n calico-system
sleep 120
kubectl exec -n calico-system calico-node-xxxxx -- cat /tmp/felix.prof > felix.prof
go tool pprof felix.prof   # "top10" shows the hottest functions

Fix 4: Split policies into smaller, more specific rules
Instead of:
spec:
  podSelector: {}       # Matches all pods
  ingress:
  - from:
    - podSelector: {}   # 1000 rules evaluated per pod
Use labeled tiers:
spec:
  podSelector:
    matchLabels:
      tier: api         # Narrower scope, fewer rules to evaluate

Follow-up: How do you scale a single Calico deployment to handle 1000+ nodes? At what point do you need to switch architectures?

Your cluster spans 3 availability zones in the same region. You're using Flannel with VXLAN overlay. Pod A in AZ1 pings Pod B in AZ3—latency is 35ms instead of expected 2-3ms. Network engineers say the underlay is fine. Why is the overlay adding so much latency and how do you fix it?

VXLAN overlay encapsulation can introduce latency through multiple mechanisms: increased packet size causing fragmentation, MTU mismatches, or additional processing in the VXLAN tunnel endpoints.

Investigate latency source:

1. Verify underlay latency is good:
ssh node-az1 ping -c 100 node-az3-ip | grep avg # Expected: 1-3ms

2. Check pod-to-pod latency in detail:
kubectl run latency-test-az1 --image=nicolaka/netshoot -n default \
  --overrides='{"spec":{"nodeSelector":{"zone":"us-east-1a"}}}' -- sleep 3600
kubectl run latency-test-az3 --image=nicolaka/netshoot -n default \
  --overrides='{"spec":{"nodeSelector":{"zone":"us-east-1c"}}}' -- sleep 3600
kubectl exec -it latency-test-az1 -- bash
# Inside the pod (use the AZ3 pod's IP from `kubectl get pod -o wide`):
for i in {1..100}; do ping -c 1 <az3-pod-ip>; done | tee latency.txt
grep time= latency.txt | awk -F'time=' '{print $2}' | awk '{print $1}' | sort -n | tail -1
Isolate latency: measure pod-to-node, node-to-node, node-to-pod to find bottleneck.

3. Check MTU and fragmentation:
kubectl exec latency-test-az1 -- ping -c 3 -M do -s 1472 <az3-pod-ip>
# "Frag needed and DF set" means the path MTU is below 1500
Check current MTU on nodes:
ssh node-az1 ip link show | grep mtu
VXLAN adds 50-byte overhead (14 + 20 + 8 + 8 = 50). If physical MTU is 1500, VXLAN MTU should be 1450.
ssh node-az1 ip link set dev flannel.1 mtu 1450
Verify on Flannel config:
kubectl get daemonset kube-flannel -n kube-system -o yaml | grep -A 5 args
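The MTU arithmetic generalizes to any underlay MTU; a one-line check (9001 is the common AWS jumbo-frame MTU, included as an assumption about your instances):

```shell
#!/bin/sh
# VXLAN budget: outer Ethernet 14 + IP 20 + UDP 8 + VXLAN 8 = 50 bytes.
vxlan_mtu() {
  echo $(( $1 - 50 ))
}

vxlan_mtu 1500   # 1450, the flannel.1 setting used above
vxlan_mtu 9001   # 8951, if the underlay supports jumbo frames
```

If the underlay does support jumbo frames, raising both MTUs eliminates most of the fragmentation risk without any CNI change.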

4. Enable Flannel DirectRouting for nodes on the same subnet:
kubectl get configmap kube-flannel-cfg -n kube-system -o yaml | grep -E 'Backend|Type|DirectRouting'
If DirectRouting is disabled, enable it:
kubectl edit configmap kube-flannel-cfg -n kube-system
# Edit net-conf.json:
#   "Backend": {
#     "Type": "vxlan",
#     "DirectRouting": true    # Skip encapsulation when nodes share an L2 segment
#   }

5. Measure VXLAN processing overhead:
ssh node-az1 ethtool -S eth0 | grep -E 'rx_csum|tx_csum|rx_packets|tx_packets'
# Compare before/after enabling hardware offload
Enable TSO (TCP Segmentation Offload) and GSO (Generic Segmentation Offload) if supported:
ssh node-az1 ethtool -K eth0 tso on gso on

6. Check if cross-AZ traffic is being unnecessarily routed through a NAT/gateway:
traceroute latency-test-az3-ip
# Verify direct node-to-node path, not through a gateway

7. Consider switching to Calico BGP (underlay) if latency is critical
With BGP, packets aren't encapsulated—they're routed directly by the underlay network. Latency drops to underlay baseline (1-3ms).

Quick fix ranking by impact:
1. Enable DirectRouting (immediate, ~5ms reduction)
2. Fix MTU mismatch (immediate, if fragmentation is happening)
3. Enable hardware offload (2-3ms reduction)
4. Migrate to BGP/underlay (5-10ms reduction, but requires architecture change)

Follow-up: Your latency-sensitive trading application needs sub-1ms pod-to-pod latency. Which CNI would you choose and why? Design the network architecture.

You're choosing between Calico, Cilium, and Flannel for a new production cluster. Your requirements: 300 nodes, multi-region, policy enforcement, load balancing, and cost control. You have 2 weeks to decide and 4 weeks to deploy. Which do you pick and why? Walk through your evaluation criteria.

Evaluation framework (score each on scale 1-10):

Criterion 1: Operational Complexity
Flannel: 9/10 (simple, fewer moving parts)
Calico: 6/10 (more config, especially for BGP)
Cilium: 3/10 (eBPF learning curve, requires kernel expertise)
Winner: Flannel if you want low ops burden; Cilium if you're willing to invest.

Criterion 2: Policy Enforcement & Observability
Flannel: 3/10 (no native policies, requires Calico overlay)
Calico: 8/10 (rich policy language, but limited east-west flow observability)
Cilium: 10/10 (Hubble provides packet-level flow visibility, L7 policies)
Winner: Cilium for security/compliance; Calico for policy-heavy workloads.

Criterion 3: Multi-Region Support
Flannel: 5/10 (VXLAN works, but high bandwidth cost across regions)
Calico: 9/10 (BGP with route reflection, designed for multi-region)
Cilium: 7/10 (Cilium Mesh exists, still maturing)
Winner: Calico for cost-efficient multi-region.

Criterion 4: Load Balancing (replacing kube-proxy)
Flannel: 0/10 (requires kube-proxy)
Calico: 0/10 (requires kube-proxy)
Cilium: 9/10 (eBPF-based, replaces kube-proxy, supports session affinity)
Winner: Cilium if you want modern service proxy.

Criterion 5: Resource Overhead
Flannel: 9/10 (20-50m CPU, 100-200Mi memory per node)
Calico: 6/10 (150-300m CPU, 300-500Mi memory)
Cilium: 4/10 (500-800m CPU, 1Gi memory, but it also replaces kube-proxy)
Winner: Flannel for resource-constrained clusters; Cilium is competitive once you credit the kube-proxy savings.
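Per-node figures matter in aggregate: a rough whole-cluster view at 300 nodes, using midpoints of the CPU ranges above (the midpoint values are my assumption, not vendor numbers):

```shell
#!/bin/sh
# Whole-cluster CNI CPU cost in cores: per-node millicores * nodes / 1000.
cluster_cpu_cores() {
  # $1: per-node CPU in millicores, $2: node count
  echo $(( $1 * $2 / 1000 ))
}

cluster_cpu_cores 35 300    # Flannel midpoint: ~10 cores
cluster_cpu_cores 225 300   # Calico midpoint: ~67 cores
cluster_cpu_cores 650 300   # Cilium midpoint: ~195 cores
```

At cloud on-demand pricing, that ~185-core gap between Flannel and Cilium is itself a line item worth weighing against Cilium's feature wins.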

Criterion 6: Community & Production Maturity
Flannel: 9/10 (stable for years)
Calico: 10/10 (widely deployed, mature)
Cilium: 8/10 (growing adoption, still some instability reports)
Winner: Calico; Flannel is safe; Cilium is modern.

For a 300-node multi-region cluster, I'd recommend:
Primary choice: Calico (BGP mode) if cost and stability are priorities
Alternative: Cilium if you need advanced observability and want kube-proxy replacement
Skip: Flannel for multi-region (high bandwidth costs)

Decision logic:
IF (policy_enforcement == "critical" AND observability == "high") THEN Cilium
ELSE IF (multi_region == TRUE AND cost == "critical") THEN Calico
ELSE IF (simplicity == "priority") THEN Flannel (but single-region only)
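The decision tree above, expressed as a runnable function; the yes/no inputs and branch order mirror the pseudocode exactly, nothing more:

```shell
#!/bin/sh
# Mirror of the IF/ELSE decision logic; all inputs are "yes"/"no" strings.
pick_cni() {
  # $1: policy enforcement critical AND observability high
  # $2: multi-region, $3: cost critical
  if [ "$1" = "yes" ]; then
    echo "Cilium"
  elif [ "$2" = "yes" ] && [ "$3" = "yes" ]; then
    echo "Calico"
  else
    echo "Flannel (single-region only)"
  fi
}

pick_cni no yes yes   # the 300-node multi-region case above -> Calico
```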

Deployment timeline for Calico:
Week 1: Lab testing (3 nodes, 2 regions)
Week 2: BGP peer config with network team, policy design
Week 3: Staging deployment (50 nodes, shadow traffic)
Week 4: Production rollout (rolling 50 nodes/week)

Risk mitigation:
- Keep kube-proxy as fallback (don't remove immediately)
- Test policy updates in canary namespace first
- Monitor BGP stability during first 2 weeks
- Maintain Flannel manifests for rollback

Follow-up: Your cluster has mixed workloads: latency-sensitive services (1ms requirement) and batch jobs (cost-optimized). Can you use different CNIs for different workload types? Design this hybrid architecture.

You've deployed Cilium with eBPF on a cluster running older Linux kernels (4.9). Services work sometimes. You're seeing random packet drops and sporadic connection resets. Cilium logs show "bpf verifier error." What's happening and how do you recover?

eBPF programs require specific Linux kernel features. Older kernels (pre-5.x) have incomplete eBPF support, missing helpers, and verifier limitations. This causes runtime failures and unpredictable packet loss.

Diagnosis:

1. Check kernel version on nodes:
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.nodeInfo.kernelVersion}{"\n"}{end}'
If you see 4.9.x or 4.14.x, you've identified the problem.

2. Verify eBPF verifier errors:
kubectl logs -n cilium -l k8s-app=cilium | grep -i 'verifier'
# Look for messages like "invalid mem access" or "unreachable insn"

3. Check eBPF program load status:
kubectl exec -n cilium cilium-xxxxx -- cilium bpf list
# Check if programs loaded successfully
If programs show "FAILED", eBPF isn't actually active.

4. Confirm kernel capabilities:
ssh node-1 'cat /boot/config-$(uname -r)' | grep -E 'CONFIG_BPF|CONFIG_HAVE_EBPF_JIT|CONFIG_BPF_EVENTS'
# Should all be =y
On older kernels, many of these will be missing or =m (module).

Recovery options:

Option A: Run Cilium in "legacy" mode without eBPF (immediate, but loses performance benefits)
kubectl set env daemonset/cilium -n cilium CILIUM_DEVICE=eth0 CILIUM_DISABLE_EBPF=true
# Or via helm:
helm upgrade cilium cilium/cilium \
  --set ebpf.enabled=false \
  --set kubeProxyReplacement=partial
kubectl rollout restart daemonset/cilium -n cilium
This falls back to iptables-based packet processing (kube-proxy-like). Performance degradation: ~10-15% throughput loss.

Option B: Update nodes to newer kernel (2-3 hours downtime per node)
ssh node-1 'sudo apt-get update && sudo apt-get install -y linux-image-5.15'
ssh node-1 sudo reboot
# Wait for node to rejoin cluster
kubectl wait --for=condition=Ready node/node-1 --timeout=5m
After kernel upgrade, restart Cilium:
kubectl set env daemonset/cilium -n cilium CILIUM_DISABLE_EBPF=false
kubectl rollout restart daemonset/cilium -n cilium

Option C: Replace with Calico (safer, but requires CNI switch)
kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.28.0/manifests/tigera-operator.yaml
# See the Flannel-to-Calico migration question above for the full steps

Immediate fix (stabilize cluster):
1. Enable Cilium legacy mode NOW to restore stability
2. Plan kernel upgrades for this weekend
3. Test eBPF mode in staging with new kernels
4. Gradually migrate nodes: drain node → upgrade kernel → rejoin → enable eBPF

Prevention for future:

- Document minimum kernel requirement (5.10+) in runbook
- Add pre-flight check: cilium preflight kernel-check
- Include kernel version in node provisioning script:
#!/bin/bash
MIN_KERNEL_VERSION="5.10"
CURRENT=$(uname -r | cut -d. -f1,2)
# Compare versions with sort -V; integer -lt can't handle "5.10" vs "4.9"
if [ "$(printf '%s\n%s\n' "$MIN_KERNEL_VERSION" "$CURRENT" | sort -V | head -1)" != "$MIN_KERNEL_VERSION" ]; then
  echo "ERROR: Kernel too old for Cilium eBPF"
  exit 1
fi

Follow-up: You're pinned to old kernel due to legacy workload dependencies. How would you run Cilium alongside a kernel that doesn't support eBPF? Design a workaround.

You have a legacy application that requires IP spoofing capability (custom network stacks, real-time packet shaping). Your CNI plugin (Calico) normally prevents this for security. How do you safely enable IP spoofing for specific pods while keeping default deny for others?

IP spoofing is a privileged capability. CNIs block it by default via reverse-path filtering (rp_filter) and network namespacing. To allow selective spoofing, you need to bypass the CNI's restrictions at the pod level while maintaining cluster security.

Approach: Use pod security policies + custom eBPF rules + network namespace overrides.

Step 1: Create a SecurityPolicy for spoofing-enabled pods
apiVersion: v1
kind: Namespace
metadata:
  name: legacy-network-apps
  labels:
    require-spoofing: "true"
---
apiVersion: policy/v1beta1
kind: PodSecurityPolicy      # cluster-scoped; no namespace field
metadata:
  name: allow-spoof
spec:
  privileged: false
  allowPrivilegeEscalation: true
  allowedCapabilities:
  - NET_RAW                  # Required for IP spoofing
  - NET_ADMIN
  seLinux:
    rule: RunAsAny
  runAsUser:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: 'MustRunAs'
    ranges:
    - min: 1
      max: 65535

Step 2: Create RBAC to restrict which pods can get NET_RAW
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: spoof-pod-creator
  namespace: legacy-network-apps
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["create", "get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: spoof-pod-binding
  namespace: legacy-network-apps
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: spoof-pod-creator
subjects:
- kind: ServiceAccount
  name: spoof-app
  namespace: legacy-network-apps

Step 3: Configure reverse-path filter bypass for these pods
Use an init container to disable rp_filter inside the pod's network namespace:
apiVersion: v1
kind: Pod
metadata:
  name: legacy-app
  namespace: legacy-network-apps
  annotations:
    requires-spoofing: "true"
spec:
  serviceAccountName: spoof-app
  initContainers:
  - name: disable-rp-filter
    image: busybox:latest
    command:
    - /bin/sh
    - -c
    - |
      sysctl -w net.ipv4.conf.all.rp_filter=0
      sysctl -w net.ipv4.conf.default.rp_filter=0
    securityContext:
      privileged: true
  containers:
  - name: app
    image: your-legacy-app:latest
    securityContext:
      capabilities:
        add:
        - NET_RAW
        - NET_ADMIN
      runAsUser: 1000

Step 4: Network policy: Isolate spoofing pods
Even with spoofing enabled, restrict their network access to prevent lateral movement:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: isolate-spoof-pod
  namespace: legacy-network-apps
spec:
  podSelector:
    matchLabels:
      app: legacy-app
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          name: monitoring
    ports:
    - protocol: TCP
      port: 9090
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          name: external-networks
    ports:
    - protocol: UDP
      port: 53               # DNS only
  - to:
    - ipBlock:
        cidr: 10.20.0.0/16   # Specific destination for packet shaping

Step 5: Verify spoofing capability
kubectl exec legacy-app -- cat /proc/sys/net/ipv4/conf/all/rp_filter
# Should return: 0 (disabled)
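rp_filter is tri-state, so the verification should distinguish strict from loose filtering rather than just checking for zero. A small interpreter for the /proc value (meanings per the kernel's ip-sysctl documentation; the function name is mine):

```shell
#!/bin/sh
# Map /proc/sys/net/ipv4/conf/*/rp_filter values to their meaning.
rp_mode() {
  case "$1" in
    0) echo "off (spoofed sources not dropped)" ;;
    1) echo "strict reverse-path check" ;;
    2) echo "loose reverse-path check" ;;
    *) echo "unknown" ;;
  esac
}

rp_mode 0   # what the init container above should have produced
rp_mode 1   # the usual distro default
```

Note that the effective setting is the maximum of `conf/all` and the per-interface value, which is why the init container sets both `all` and `default`.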
Test IP spoofing:
kubectl exec -it legacy-app -- scapy
>>> from scapy.all import IP, ICMP, send
>>> send(IP(src="192.168.1.100", dst="10.0.0.1")/ICMP())
# Packets should leave with the spoofed source IP

Step 6: Monitoring & Alerting
Log spoofing activity for compliance:
kubectl exec legacy-app -- sh -c 'tcpdump -i eth0 -l "not src host $(hostname -i)"' | tee /var/log/spoofed-packets.log
Alert if spoofing pod sends traffic to unexpected destinations:
- alert: UnauthorizedSpoofedTraffic
  expr: rate(egress_packets{pod_label_requires_spoofing="true",destination_namespace!="external-networks"}[5m]) > 100
  annotations:
    summary: "Spoofing pod {{ $labels.pod_name }} sending traffic outside approved range"

Security audit trail:
kubectl get events -n legacy-network-apps | grep -E 'NET_RAW|privileged'
# API-server audit logs (path depends on your audit policy configuration):
grep 'legacy-network-apps' /var/log/kubernetes/audit.log | jq '.requestObject.spec.securityContext'

Follow-up: How would you monitor for unauthorized IP spoofing attempts across your cluster? Design a detection system that flags suspect network activity.

You've deployed Cilium in a cluster with thousands of pods. After a week, you notice pod-to-pod DNS queries are failing intermittently (1-2% of requests). The issue is DNS resolution timing out. Your infrastructure team says "Network is fine, check your CNI." Cilium's DNS proxy might be the culprit. How do you debug and fix this?

Cilium includes a DNS proxy for security and observability. If it's misconfigured or overloaded, DNS queries timeout and pods can't reach services by hostname.

Diagnosis:
1. Verify DNS is failing: kubectl run debug-pod --image=nicolaka/netshoot -it --rm -- nslookup kubernetes.default # Try multiple times: some succeed, some timeout?

2. Check Cilium DNS proxy status:
kubectl exec -n cilium cilium-xxxxx -- cilium config | grep -i dns
Should show: dns-proxy-enabled=true (or similar)

3. Monitor DNS query metrics:
kubectl port-forward -n cilium svc/cilium-agent 6543:6543 &
curl http://localhost:6543/metrics | grep -i dns | head -20
Look for: cilium_dns_queries_total, cilium_dns_failures_total

4. Check Cilium logs for DNS errors:
kubectl logs -n cilium -l k8s-app=cilium --tail=100 | grep -iE 'dns|proxy'

Root causes (most common):
1. DNS proxy is CPU-saturated (high load, too few resources)
2. Upstream DNS server (kube-dns/CoreDNS) is slow
3. DNS query caching is misconfigured
4. DNS proxy is overloaded with too many queries

Fix 1: Increase Cilium DNS proxy resources:
kubectl set resources daemonset/cilium -n cilium \
  --limits=cpu=1000m,memory=1Gi \
  --requests=cpu=500m,memory=512Mi

Fix 2: Enable DNS query caching:
kubectl exec -n cilium cilium-xxxxx -- cilium config \
  --set dns-cache-enabled=true \
  --set dns-cache-min-ttl=300 \
  --set dns-cache-max-ttl=86400

Fix 3: Monitor upstream DNS performance:
kubectl run dns-perf-test --image=alpine -it --rm -- \
  time nslookup kubernetes.default
# Measure query time; should be <100ms

If upstream is slow, scale CoreDNS:
kubectl scale deployment coredns -n kube-system --replicas=3
kubectl get deployment coredns -n kube-system
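To confirm the 1-2% failure rate, and to re-measure after each fix, sample the resolver repeatedly. A sketch; the probe command is passed in, so substitute whatever resolver check your debug image provides:

```shell
#!/bin/sh
# Run a resolver probe N times and report the failure fraction.
# Usage: dns_fail_rate <attempts> <probe command...>
dns_fail_rate() {
  n=$1; shift
  fails=0; i=0
  while [ "$i" -lt "$n" ]; do
    "$@" >/dev/null 2>&1 || fails=$((fails + 1))
    i=$((i + 1))
  done
  echo "$fails/$n failed"
}

# In-cluster usage (placeholder probe; run from a debug pod):
dns_fail_rate 100 nslookup kubernetes.default
```

A healthy cluster should report 0/100; repeating the run before and after each fix turns "intermittent" into a trendable number.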

Prevention:
- alert: DNSQueryLatency
  expr: histogram_quantile(0.95, cilium_dns_query_latency_seconds_bucket) > 0.5
  for: 5m
  annotations:
    summary: "DNS queries p95 latency > 500ms"
- alert: DNSProxyErrors
  expr: rate(cilium_dns_failures_total[5m]) > 0.01
  annotations:
    summary: "DNS proxy error rate {{ $value }}/sec"

Follow-up: How would you troubleshoot DNS resolution if you suspect the problem is with the application's DNS client (retry behavior, timeout settings) vs. the CNI?