You're provisioning compute for a high-performance database workload and need to choose between c6i.8xlarge (Intel, 3.5 GHz, $1.36/hr on-demand) and c7g.8xlarge (Graviton3, 3.0 GHz, $1.18/hr on-demand). Both are on the Nitro platform. What are the tradeoffs?
Both are Nitro instances, but they differ in CPU architecture and performance characteristics:
(1) c6i.8xlarge (3rd Gen Intel Xeon) — higher clock speed (3.5 vs 3.0 GHz), higher power consumption, and full compatibility with legacy software that's x86-optimized.
(2) c7g.8xlarge (Graviton3) — lower clock speed, but a newer microarchitecture with better per-cycle efficiency, lower power draw (ARM-based), and a ~13% lower on-demand price.
For database workloads:
(1) If you run MySQL/PostgreSQL compiled for ARM with proper optimization flags, c7g is the better value (lower cost, similar performance).
(2) If you run Oracle or another proprietary DB with x86-specific licensing, c6i might be contractually required.
(3) For encryption-heavy workloads, c6i has Intel AES-NI, but Graviton3 also ships Arm cryptographic extensions.
(4) Memory — c6i uses DDR4 while c7g uses DDR5 with higher bandwidth, but the Nitro architecture abstracts much of this, so real-world impact depends on the workload.
Recommendation: Benchmark both on your actual workload (e.g., TPC-C for OLTP databases). For most new deployments, choose Graviton (c7g/r7g/m7g) — AWS is investing heavily in Graviton, it's cheaper, and performance is equal or better. Use c6i if you have binary-compatibility or licensing constraints.
To provision: `aws ec2 run-instances --image-id ami-xxxxx --instance-type c7g.8xlarge --key-name my-key --count 1`. Monitor with CloudWatch `CPUUtilization`, `NetworkIn`/`NetworkOut`, and `DiskReadOps`. If CPU-bound query performance is 5-10% worse on c7g, scale up one instance size; the cost savings usually compensate.
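The "benchmark both on your actual workload" step can be sketched as a small timing harness. This is an illustrative sketch, not a TPC-C driver: `sample_query` is a hypothetical stand-in for your real query workload, and you would run the script on each candidate instance and compare the printed medians.

```python
import statistics
import time

def benchmark(workload, runs=5):
    """Time a callable several times and return the median seconds per run."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        workload()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)

def speedup(baseline_s, candidate_s):
    """Candidate's speedup relative to the baseline (>1.0 means faster)."""
    return baseline_s / candidate_s

def sample_query():
    # Hypothetical CPU-bound stand-in; replace with your real query path.
    sum(i * i for i in range(100_000))

median = benchmark(sample_query)
print(f"median run time: {median:.4f}s")
```

Using the median rather than the mean keeps one cold-cache or GC-interrupted run from skewing the comparison.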
Follow-up: You benchmark both, and c7g shows 8% worse query latency than c6i. The GHz difference explains 3%, but where's the remaining 5%?
Your application uses m5.2xlarge instances (Nitro, previous gen Intel). AWS is recommending you upgrade to m7i.2xlarge (Nitro, latest gen Intel). What concrete improvements can you expect, and is the migration worth the downtime?
Nitro platform improvements from m5 to m7i:
(1) CPU generation — m5 uses 1st/2nd Gen Xeon Scalable (Skylake/Cascade Lake, 2017-2019); m7i uses 4th Gen Xeon Scalable (Sapphire Rapids, 2023). Microarchitecture improvements give roughly 15-20% per-core performance gain.
(2) Memory — m5 has DDR4; m7i has DDR5, which is faster and higher bandwidth. This benefits memory-intensive workloads.
(3) I/O throughput — m7i sizes offer higher EBS throughput than their m5 counterparts.
(4) Network — m7i supports higher network bandwidth per size than m5.
For most applications the gains are roughly: compute-bound workloads 10-20%, memory-bound 5-15%, I/O-bound 10-25%.
Migration strategy:
(1) Launch a test m7i.2xlarge with the same config as the m5 and run your workload benchmarks.
(2) Compare metrics: CPU%, memory%, EBS latency, network throughput.
(3) If the performance gain exceeds ~10%, migrate. Note that m7i's on-demand rate is slightly higher than m5's (m5.2xlarge is $0.384/hr vs roughly $0.40/hr for m7i.2xlarge in us-east-1), so the win is price-performance, not a lower hourly bill.
(4) If downtime is acceptable, terminate the m5 and run the m7i. If you need zero downtime, use a load balancer — spin up m7i capacity in a new ASG, shift traffic gradually, then terminate the m5s.
For a 1-2 instance environment, the migration effort pays back quickly if you were about to scale out anyway; for large fleets, the per-core gains clearly justify it. To change the instance type in an ASG, create a new launch template version and point the group at it: `aws ec2 create-launch-template-version --launch-template-id lt-xxxxx --source-version 1 --launch-template-data '{"InstanceType":"m7i.2xlarge"}'`, then `aws autoscaling update-auto-scaling-group --auto-scaling-group-name my-asg --launch-template LaunchTemplateId=lt-xxxxx,Version='$Latest'`.
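The migrate-or-not decision reduces to cost per unit of delivered throughput. A minimal sketch: the hourly rates and requests/sec below are placeholder assumptions — substitute your region's current pricing and your own benchmark numbers.

```python
def cost_per_unit_throughput(hourly_rate, throughput_rps):
    """Dollars per hour per request/sec of benchmarked throughput (lower is better)."""
    return hourly_rate / throughput_rps

# Placeholder figures for illustration only.
m5 = cost_per_unit_throughput(0.384, 1000)   # assumed m5.2xlarge benchmark result
m7i = cost_per_unit_throughput(0.40, 1150)   # assumed ~15% faster at a higher rate

improvement = (m5 - m7i) / m5
print(f"price-performance improvement: {improvement:.1%}")
```

Even when the newer generation's hourly rate is higher, it can still win on this metric; if `improvement` comes out negative for your workload, the migration isn't paying for itself.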
Follow-up: You migrate to m7i.2xlarge, and performance improves 12%, but latency variance increases (P99 latency increases by 15%). Why?
Your EC2 instances experience unpredictable performance degradation during the day. In some hours, CPU-bound workloads run at full clock speed; in others, they're throttled to 2.5 GHz. You're using on-demand m6i instances. What's causing this?
On-demand instances can experience CPU frequency reduction for several reasons:
(1) Thermal limits — if the physical server's CPU reaches its thermal threshold (typically ~90°C), clock speed is reduced for all instances on that server to cool it down. This is rare on well-maintained AWS infrastructure but can happen during heat waves or when data center HVAC is under stress.
(2) CPU credit exhaustion — applies only to burstable T-series instances, where throttling is by design. m6i is not burstable, so this doesn't apply.
(3) Host-level resource contention — the hypervisor won't throttle your m6i's cores, but shared L3 cache and memory bandwidth on the host can degrade effective performance when neighboring instances are CPU-intensive.
(4) Power limits — if the physical server approaches its power budget, CPUs may be clocked down to reduce draw.
Diagnosis:
(1) Check the CloudWatch `CPUUtilization` metric — if the workload is maxed but utilization isn't hitting 100%, something is limiting the cores.
(2) `aws ec2 describe-instances --instance-ids i-xxxxx` shows `CpuOptions` (core and thread counts) — useful for confirming topology, though it won't reveal throttling.
(3) Inside the instance, run `grep "cpu MHz" /proc/cpuinfo` to see current core frequencies. If cores show 2.5 GHz under load against a 3.5 GHz rated clock, frequency reduction is active.
(4) Check `/var/log/messages` or the system journal for thermal or power warnings.
Fix:
(1) If thermally limited, stop and start the instance — it will usually land on a different host.
(2) If power-limited, reduce CPU load or move to a larger instance.
(3) If you suspect shared-bandwidth contention, use Dedicated Hosts to isolate yourself from other customers.
In most cases this is temporary and AWS resolves it; escalate to AWS Support with evidence of throttling (frequency samples over time) if it persists.
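The `/proc/cpuinfo` check in step (3) is easy to automate so you can sample frequencies over time and attach the evidence to a support case. A minimal sketch — the rated clock and the 85% flag threshold are assumptions you'd tune for your instance type:

```python
RATED_MHZ = 3500.0      # assumed rated clock for the instance type
THROTTLE_RATIO = 0.85   # flag cores running below 85% of rated clock

def throttled_cores(cpuinfo_text, rated_mhz=RATED_MHZ, ratio=THROTTLE_RATIO):
    """Return the frequencies of cores running well below the rated clock."""
    freqs = []
    for line in cpuinfo_text.splitlines():
        if line.startswith("cpu MHz"):
            freqs.append(float(line.split(":")[1]))
    return [f for f in freqs if f < rated_mhz * ratio]

# Sample /proc/cpuinfo excerpt; in practice read the real file under load.
sample = """cpu MHz\t\t: 3499.998
cpu MHz\t\t: 2500.012
cpu MHz\t\t: 3500.001
"""
print(throttled_cores(sample))  # flags the 2.5 GHz core
```

Run it from cron every minute and log the output; a pattern of flagged cores correlated with time of day is far more persuasive to support than a single snapshot.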
Follow-up: You check /proc/cpuinfo and all cores show the rated frequency. But CloudWatch shows CPU% is lower than expected for your workload. What could be the cause?
You need to choose between r6i.2xlarge (memory-optimized, Intel, 1 vCPU : 2.1 GiB RAM) and r7i.2xlarge (memory-optimized, Intel, 1 vCPU : 1.8 GiB RAM) for an in-memory cache workload. The r7i is 10% cheaper. Why does the r7i have less RAM per vCPU, and which should you pick?
The RAM-per-vCPU ratio differs because:
(1) r6i and r7i are different product lines with different design targets — r6i is tuned for a consistent memory-per-CPU ratio, r7i for cost and power efficiency.
(2) r7i uses DDR5 memory (newer, higher bandwidth), so fewer DIMMs are needed to hit a given bandwidth target; r6i uses DDR4, which needs more DIMMs for the same bandwidth.
(3) Graviton-based instances (r7g) have different ratios again — AWS calibrates RAM ratios against the memory subsystem's actual performance, not just capacity.
For a cache workload, the question is: do you need all 614 GiB on r6i.2xlarge, or would r7i.2xlarge's 524 GiB suffice?
(1) If your cached set fits in 524 GiB, r7i saves 10%.
(2) If you need the full 614 GiB, stay on r6i.
(3) Consider memory bandwidth — DDR5 on r7i is faster, so per-GB throughput is better. For Memcached/Redis, throughput often matters more than raw capacity.
Benchmark: measure cache hit rate and latency on r7i. If there's no regression, switch. If your working set genuinely needs 600+ GiB, r6i is necessary.
Decision: start with r7i for new deployments. If you hit the memory limit, move up to r7i.4xlarge rather than back to r6i.2xlarge — per GB, the larger r7i is usually cheaper.
Command: `aws ec2 run-instances --instance-type r7i.2xlarge --image-id ami-xxxxx`. Monitor memory via the CloudWatch agent (it publishes `mem_used_percent`; EC2 has no built-in memory metric). If usage is consistently >80%, you'll start evicting or thrashing; if <50%, you're over-provisioned and should downsize.
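The sizing rule at the end (>80% risks thrashing, <50% means over-provisioned) can be captured in a few lines. A sketch, using the scenario's hypothetical instance sizes — the thresholds mirror the guidance above and should be tuned to your eviction tolerance:

```python
def sizing_verdict(working_set_gib, instance_gib):
    """Classify an instance size against a cache working set."""
    utilization = working_set_gib / instance_gib
    if utilization > 0.80:
        return "too small: expect eviction/thrashing"
    if utilization < 0.50:
        return "over-provisioned: consider downsizing"
    return "good fit"

# Hypothetical working sets against the scenario's r7i.2xlarge (524 GiB).
print(sizing_verdict(400, 524))
print(sizing_verdict(600, 524))
```

For Redis specifically, leave extra headroom beyond this check for copy-on-write during BGSAVE/replication forks.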
Follow-up: You switch to r7i.2xlarge and see lower cache hit rate than r6i despite the same working set size. Why?
Your application is latency-sensitive (< 100ms P99). You're running on c5.2xlarge instances. AWS recommends upgrading to c6i or c7i, both on Nitro. Which Nitro generation should you choose for the lowest latency?
For latency-sensitive workloads, the choice depends on:
(1) CPU architecture — c6i (3rd Gen Xeon, Ice Lake) vs c7i (4th Gen Xeon, Sapphire Rapids). The newer generation brings microarchitecture improvements that matter for latency: better branch prediction, lower cache-miss penalties, faster memory access.
(2) Nitro generation — both are on Nitro, but c7i runs on newer Nitro hardware with better interrupt handling and lower context-switch overhead.
(3) Clock speed — c6i turbos up to 3.5 GHz, c7i up to 3.2 GHz. For latency, microarchitecture matters more than raw GHz.
(4) Cache — c7i has more L3 cache per core, reducing average memory-access latency.
Recommendation: for a <100ms P99 target, choose c7i. The newer microarchitecture reduces cache misses and context-switch overhead, which directly improves tail latency. Note that c7i's on-demand rate is ~5% higher than c6i's, so weigh that against the gain. Coming from c5, the jump to c7i can plausibly yield 25-30% latency improvement: roughly ~10% from Nitro/platform improvements and ~15% from two CPU generations.
Test: (1) Spin up c7i.2xlarge alongside c5.2xlarge with identical config. (2) Run a load test and capture the latency distribution (P50, P95, P99). (3) If P99 improves >20%, migrate fully.
Command: `aws ec2 run-instances --instance-type c7i.2xlarge --image-id ami-xxxxx`. Monitor with CloudWatch `TargetResponseTime` (from the ALB) or application-side latency tracking. If you're already on c6i, upgrading to c7i is marginal (5-10% improvement) unless you're specifically fighting tail latency.
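Step (2) of the test plan needs percentiles from raw latency samples. A minimal nearest-rank sketch (production load-test tools compute this for you; the sample list is illustrative):

```python
def percentile(samples, p):
    """Nearest-rank percentile of a list of latency samples (milliseconds)."""
    ordered = sorted(samples)
    k = max(0, int(round(p / 100 * len(ordered))) - 1)
    return ordered[k]

# Illustrative latencies (ms) with a couple of tail outliers.
samples = [12, 15, 14, 13, 95, 16, 14, 13, 15, 120]
for p in (50, 95, 99):
    print(f"P{p}: {percentile(samples, p)} ms")
```

Note how two outliers in ten samples dominate P95/P99 while barely moving P50 — which is exactly why the answer above compares distributions, not averages, when judging a migration.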
Follow-up: You test c7i vs c5, and P99 improves only 8%, not 25-30%. You check utilization — CPU is 85%, memory is 40%. What's limiting the latency improvement?
Your application runs on x1e.2xlarge instances (Intel, high-memory). It's a real-time analytics system consuming 500+ GB of data. You're seeing memory pressure and hitting swap. AWS suggests Graviton2-based memory-optimized instances (x2gd). Should you switch? What's the catch?
Graviton2-based x2gd instances offer:
(1) Far lower cost per GiB of memory than x1e — to hold your 500+ GB working set you'd want x2gd.8xlarge (512 GiB) or larger, versus x1e.2xlarge's 244 GiB, which explains the swapping.
(2) Substantially lower cost than x1e for comparable memory.
(3) Better power efficiency, plus local NVMe SSD storage (the "d").
However, the catch:
(1) Graviton2 is ARM-based, not x86. If your analytics tool is x86-only or has binary incompatibilities, it won't run.
(2) The ARM ecosystem is newer and less mature than Intel Xeon's; some edge-case workloads may surface unexpected issues.
(3) You must recompile/rebuild your application or use ARM-native containers.
Decision:
(1) Check whether your analytics stack supports ARM — most modern tools (Python, Java, Node.js, PostgreSQL, Elasticsearch) ship ARM builds.
(2) If yes, x2gd is a clear win: enough memory to stop swapping, at lower cost.
(3) If no, stay on x1e or consider the older x1 family (cheaper per GiB than x1e).
To migrate:
(1) Build an ARM-native Docker image of your application: `docker build --platform linux/arm64 -t my-app:arm64 .`.
(2) Test on a smaller x2gd size first.
(3) If stable, scale to x2gd.8xlarge or larger.
(4) If you hit performance issues, profile CPU/memory and compare against x1e.
Command: `aws ec2 run-instances --instance-type x2gd.8xlarge --image-id ami-xxxxx-arm64` (the AMI must be an arm64 build). Benchmark: run the same analytics query on both and measure time and memory usage. Eliminating swap pressure alone should make queries noticeably faster.
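One practical gotcha in the build step: if you build the arm64 image on an x86 machine, it runs under emulation and any benchmark there is meaningless. A small sketch that resolves the current host's architecture to an image-tag suffix (the `my-app` tag name is hypothetical):

```python
import platform

def target_suffix():
    """Map the build host's CPU architecture to a container image tag suffix."""
    machine = platform.machine().lower()
    if machine in ("arm64", "aarch64"):
        return "arm64"
    if machine in ("x86_64", "amd64"):
        return "amd64"
    raise RuntimeError(f"unexpected architecture: {machine}")

print(f"my-app:{target_suffix()}")
```

Building and smoke-testing natively on a small Graviton instance avoids both the emulation penalty and surprises from architecture-specific dependencies.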
Follow-up: You test x2gd.8xlarge with your ARM-compiled analytics app. Performance is 25% faster than x1e.2xlarge, but the application uses 550 GiB of the 512 GiB available, hitting swap. Why, given that x2gd has more memory?
Your company has strict compliance requirements: all instances must run on single-tenant hardware (no hypervisor oversubscription). You're considering using Dedicated Hosts (c6i Dedicated Host) vs Dedicated Instances (c6i.2xlarge dedicated). Both are Nitro. What's the difference, and which fits compliance?
Dedicated Hosts vs Dedicated Instances:
(1) Dedicated Hosts — you rent an entire physical server, and only your instances run on it. You see and control instance placement on the host, get host-level visibility, and can license software per-socket/per-core. This fits compliance regimes that require placement control and auditability of the underlying hardware.
(2) Dedicated Instances — your instances run on hardware dedicated to your AWS account, so no other customer's instances ever share it, but AWS controls placement: other Dedicated Instances from your own account may land on the same host, and you get no host-level visibility or affinity control.
For strict compliance (e.g., healthcare, finance) demanding host visibility and placement control, Dedicated Hosts are the right choice. For typical isolation requirements (e.g., SOC 2), Dedicated Instances suffice and are cheaper.
Performance: both are Nitro with identical CPU/memory/network; hypervisor overhead is the same.
Cost: Dedicated Hosts bill for the whole host regardless of how much of it you use, while Dedicated Instances bill per instance plus a flat per-region fee — so an under-filled host makes the Host option far more expensive.
Recommendation:
(1) Review your compliance documentation — if it requires host-level visibility, placement affinity, or socket-level licensing, use Dedicated Hosts.
(2) If it says "logically isolated" or "single-tenant hardware," Dedicated Instances are fine.
(3) Dedicated Hosts are also the right tool for BYOL (bring your own license) software licensed per socket or per core.
To provision a Dedicated Host:
(1) `aws ec2 allocate-hosts --instance-family c6i --quantity 1 --availability-zone us-east-1a` (reserves the host).
(2) `aws ec2 run-instances --instance-type c6i.2xlarge --placement HostId=h-0xxxxx --image-id ami-xxxxx` (launches onto it).
Monitor host utilization so you aren't wasting capacity — a single c6i.2xlarge on a full c6i host uses only a small fraction of its vCPUs, and you pay for the whole host either way.
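The host-utilization check above is simple arithmetic over the host's vCPU capacity. A sketch — the 128-vCPU host capacity and the per-size vCPU table are assumptions for illustration; check the capacity reported for your allocated host:

```python
HOST_VCPUS = 128  # assumed capacity of a c6i Dedicated Host

# Assumed vCPU counts per instance size (2xlarge = 8 vCPUs, etc.).
SIZE_VCPUS = {"c6i.2xlarge": 8, "c6i.4xlarge": 16, "c6i.8xlarge": 32}

def host_utilization(instances):
    """Fraction of the host's vCPUs consumed by the launched instances."""
    used = sum(SIZE_VCPUS[size] for size in instances)
    return used / HOST_VCPUS

util = host_utilization(["c6i.2xlarge"])
print(f"host vCPU utilization: {util:.0%}")
```

A single 2xlarge fills only a sliver of the host; packing more instances onto the host (or choosing Dedicated Instances instead) is how you avoid paying for idle capacity.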
Follow-up: You allocate a Dedicated Host for c6i instances. You launch one c6i.2xlarge and monitor it. Host CPU is at 20%, but the instance CPU is at 95%. Why the discrepancy?