
AWS Interview Questions for 2026: 40+ Questions Across EC2, S3, Lambda, VPC, IAM, and the Scenarios Interviewers Actually Ask

AWS interview questions in 2026 split into three buckets: service knowledge (EC2 / S3 / Lambda / VPC), scenario design (how would you architect X), and cost plus security trade-offs. The new-grad pain point is the gap between book knowledge and the production decisions interviewers care about. This guide gives 40+ questions across the buckets, the scenarios you'll actually be asked, and how to compensate when you've never run a production AWS account.

By Alex Chen, Founder, InterviewChamp.AI · Last updated


What AWS interview questions actually test in 2026

AWS interview questions in 2026 test three things in roughly this order: whether you know the major services well enough to talk about them without reading from a slide, whether you can sketch an architecture for a vague prompt without freezing, and whether you reason about cost and security trade-offs when the prompt is ambiguous. The service-knowledge floor is non-negotiable. The scenario-design layer is where most interviews are won or lost. The trade-off layer is the senior signal even at the entry level.

The 2026 hiring environment has tightened. Cloud Engineer, DevOps, and SRE roles are still hiring entry-level talent at a meaningfully higher rate than pure SWE roles. The catch is that hiring managers now expect candidates to have spent real free-tier time, not just read the documentation. A new grad who can describe S3 but has never created a bucket with versioning enabled reads as someone who studied for the interview but didn't do the work. That gap is the single biggest filter at the entry-level AWS interview.

Distribution of question types most new-grad candidates report seeing in their AWS loops:

  • 40-50% service knowledge (EC2, S3, Lambda, VPC, IAM, common follow-ups)
  • 25-35% scenario design (architect this system, sketch this data flow)
  • 10-20% cost and security trade-offs (storage class choice, IAM least privilege, multi-AZ vs single-AZ)
  • 5-10% troubleshooting (here's a 500 error, walk me through how you'd debug it)

The scenario-design slice is the one most underprepared. It also disproportionately determines outcome. A candidate who nails 25 service-knowledge questions but freezes on the scenario prompt usually loses the round. A candidate who fumbles three service-knowledge questions but draws a credible architecture out loud usually advances.

The 3 buckets of AWS interview questions

Before drilling specific questions, sort the surface area into the three buckets. Most candidates spend too much time on bucket one and not enough on buckets two and three.

Bucket 1: Service knowledge. What is S3. What is the difference between EBS and EFS. What does Lambda do. These are the questions that filter out candidates who haven't read the AWS documentation. They're the floor, not the ceiling. Expect 12-15 of them in a typical loop.

Bucket 2: Scenario design. Design a URL shortener. Design a system that processes 10 million log events per hour. Design a photo-sharing app. These are the questions that filter candidates who can recite services from candidates who can actually use them. Expect 1-2 of them in a typical loop, but each one weighs heavier than the bucket-one questions combined.

Bucket 3: Cost and security trade-offs. When would you pick S3 Glacier over S3 Standard. How do you secure a VPC. What does least privilege mean in IAM. These probe whether you reason about non-functional requirements (cost, security, compliance) the way a real engineer does, instead of just picking services from a menu. Expect 3-5 of them in a typical loop.

Honest call here: if you only have a weekend before the AWS round, spend it on bucket two. Sketch 6-8 scenario answers out loud. Bucket-one knowledge is what most candidates over-prepare. Bucket-two scenario fluency is what most candidates wing.

AWS service-knowledge questions (15 Q)

The 15 questions below cover the surface area of an entry-level AWS interview. Memorize the answer outlines. Adapt the language to your own voice. The structure is the load-bearing part.

EC2 interview questions (3 Q)

Q1. What is EC2 and what are the main instance types?

EC2 (Elastic Compute Cloud) is AWS's virtual server offering. You launch instances of various sizes and pay by the second. Instance families group sizes by workload type. General-purpose (M and T families) for balanced workloads. Compute-optimized (C family) for CPU-heavy work. Memory-optimized (R and X families) for in-memory databases and caches. Storage-optimized (I and D families) for high-IOPS local storage. GPU instances (P and G families) for ML training and rendering. The interview signal: knowing the family letters and matching them to workload types. Not memorizing every size in every family.

Q2. What is the difference between On-Demand, Reserved, and Spot instances?

On-Demand is pay-by-the-second, no commitment. Use for unpredictable workloads. Reserved Instances, and the newer, more flexible Savings Plans, give you a discount of up to roughly 72% in exchange for a 1- or 3-year commitment. Use for steady-state production workloads. Spot lets you run on spare EC2 capacity for up to 90% off. You no longer bid for it: you pay the current Spot price, with the catch that AWS can reclaim the instance with 2 minutes' notice. Use for fault-tolerant batch jobs, CI/CD runners, and big-data processing where interruptions are tolerable.
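
To make the pricing trade-off concrete, here is a rough cost sketch. The hourly rate and the discount percentages are illustrative assumptions chosen for round numbers, not current AWS prices, and `monthly_cost` is a hypothetical helper.

```python
HOURS_PER_MONTH = 730  # common approximation for the hours in an average month


def monthly_cost(hourly_rate: float, discount: float = 0.0,
                 hours: float = HOURS_PER_MONTH) -> float:
    """Cost of one instance for `hours` at `hourly_rate`, minus a fractional discount."""
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must be in [0, 1)")
    return round(hourly_rate * (1.0 - discount) * hours, 2)


ON_DEMAND_RATE = 0.10  # hypothetical USD per hour, not a real price

costs = {
    "on_demand": monthly_cost(ON_DEMAND_RATE),                 # no commitment
    "committed": monthly_cost(ON_DEMAND_RATE, discount=0.40),  # assumed 40% commitment discount
    "spot": monthly_cost(ON_DEMAND_RATE, discount=0.70),       # assumed 70% Spot discount
}

for model, cost in costs.items():
    print(f"{model}: ${cost:.2f} per month")
```

In an interview, the arithmetic matters less than the framing: the commitment discount only pays off if the instance really runs all month, and the Spot number only holds if the workload tolerates interruption.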

Q3. What is an AMI?

An AMI (Amazon Machine Image) is the template an EC2 instance launches from. It bundles a snapshot of the root volume (the OS, installed software, and configuration), snapshots of any additional volumes, a block device mapping, and launch permissions. You can launch new instances from an AMI to get an identical starting state. The 2026 best practice is to build AMIs with tools like Packer or EC2 Image Builder, version them, and reference the version in your Launch Template. That's how teams achieve repeatable infrastructure without configuration drift.

S3 interview questions (3 Q)

Q4. What is S3 and what are its main use cases?

S3 (Simple Storage Service) is AWS's object storage. You store files (objects) in buckets, each with a globally unique name. Common use cases: static website hosting, backups, data lake storage, distribution origin for CloudFront, software artifact storage, log archival. The 11-nines durability number (99.999999999%) is the headline. The realistic latency for first-byte reads is 100-200ms, which matters when you're putting S3 directly behind a user-facing endpoint.

Q5. What are the main S3 storage classes?

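Five classes come up most often. S3 Standard for frequently accessed data. S3 Standard-IA (Infrequent Access) for monthly-or-less reads with quick retrieval. S3 One Zone-IA for the same access pattern when you can tolerate data living in a single AZ. S3 Glacier Instant Retrieval for quarterly access. S3 Glacier Deep Archive for compliance archives kept for years. AWS also offers S3 Intelligent-Tiering, which moves objects between access tiers automatically, and S3 Glacier Flexible Retrieval, which accepts minutes-to-hours retrieval in exchange for a lower storage price. The trade-off: cheaper storage costs more per retrieval. The cost math matters when you're storing terabytes. For small data it doesn't.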

Q6. What's the difference between versioning, lifecycle rules, and replication in S3?

Versioning keeps every version of an object so you can restore after an accidental delete or overwrite. Lifecycle rules automatically transition objects between storage classes (e.g., Standard to Glacier after 90 days) or delete them after N days. Replication copies objects to another bucket, either in the same region (SRR) or across regions (CRR). The interview-relevant case: a real production bucket usually has all three. Versioning for safety, lifecycle for cost, replication for disaster recovery.
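
A lifecycle rule is easier to reason about once you see it written down. The sketch below builds a rule as a plain dictionary in the shape the S3 lifecycle configuration API expects (for example, the `LifecycleConfiguration` argument of boto3's `put_bucket_lifecycle_configuration`). The `logs/` prefix, the day counts, and the `build_lifecycle_rules` helper are assumptions for illustration, and nothing here calls AWS.

```python
import json


def build_lifecycle_rules(prefix: str, glacier_after_days: int,
                          expire_after_days: int, noncurrent_days: int) -> dict:
    """Build one lifecycle rule: archive, then expire, and prune old versions."""
    if glacier_after_days >= expire_after_days:
        raise ValueError("objects must transition before they expire")
    return {
        "Rules": [
            {
                "ID": f"archive-{prefix.strip('/') or 'all'}",
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                # Move current versions to Glacier after N days.
                "Transitions": [
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"}
                ],
                # Expire current versions after M days.
                "Expiration": {"Days": expire_after_days},
                # With versioning on, also clean up superseded versions.
                "NoncurrentVersionExpiration": {"NoncurrentDays": noncurrent_days},
            }
        ]
    }


config = build_lifecycle_rules("logs/", glacier_after_days=90,
                               expire_after_days=365, noncurrent_days=30)
print(json.dumps(config, indent=2))
```

The `NoncurrentVersionExpiration` entry is the piece that ties lifecycle to versioning: without it, a versioned bucket keeps every overwritten copy indefinitely and the storage bill grows quietly.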

Lambda interview questions (3 Q)

Q7. What is Lambda and what's the maximum execution time?

Lambda runs your code in response to events without you managing servers. You upload a function, AWS handles the runtime, scaling, and patching. The maximum execution time per invocation is 15 minutes. The function can run in Python, Node.js, Java, .NET, Go, Ruby, or a custom runtime, and it can be packaged as a zip archive or a container image. Pricing is per request plus execution duration, metered in 1 ms increments and scaled by the memory you configure. Memory is configurable from 128 MB to 10,240 MB, and CPU scales with memory.

Q8. What's a cold start and how do you mitigate it?

A cold start happens when AWS spins up a new execution environment to run your function. The runtime initializes, dependencies load, your handler begins. For Python and Node, cold start is typically 100-500ms. For Java and .NET, often 1-3 seconds. Mitigations: keep packages small (every imported library adds initialization time), use Provisioned Concurrency to keep N environments always warm, pick a lighter runtime for latency-sensitive APIs, and avoid putting the function in a VPC unless you need to. A 2019 change to Lambda's VPC networking brought the VPC cold-start penalty way down, but it's still worth measuring.
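
The usual code-level habit that follows from this is to do expensive setup at module scope, so it runs once per cold start instead of on every invocation. The sketch below simulates that locally. `_expensive_setup`, the sleep, and the table name are stand-ins for real work such as creating SDK clients or loading config, not part of any AWS API.

```python
import time

INIT_COUNT = 0  # how many times the expensive setup has run


def _expensive_setup() -> dict:
    """Stand-in for creating SDK clients, reading config, or loading a model."""
    global INIT_COUNT
    INIT_COUNT += 1
    time.sleep(0.05)  # simulate slow initialization
    return {"table_name": "example-table"}  # hypothetical resource name


# Module scope: Lambda runs this once per execution environment (the cold start).
RESOURCES = _expensive_setup()


def handler(event, context):
    """Per-invocation work only. Warm invocations reuse RESOURCES."""
    return {"statusCode": 200, "body": f"using {RESOURCES['table_name']}"}


# Local simulation: one cold start followed by three invocations.
responses = [handler({}, None) for _ in range(3)]
print(f"setup ran {INIT_COUNT} time(s) for {len(responses)} invocations")
```

If the setup call sat inside `handler` instead, every invocation would pay the 50 ms, warm or not.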

Q9. How does Lambda handle concurrency?

Lambda automatically scales to handle concurrent invocations, up to the account's regional concurrency limit (default 1,000, raisable to tens of thousands). Each concurrent invocation gets its own container. If your code holds state in memory between invocations, you're betting on container reuse (warm starts), which is not guaranteed. Use Provisioned Concurrency when you need predictable concurrency for low-latency APIs. Use Reserved Concurrency when you want to cap a specific function's concurrency to protect downstream systems.

VPC interview questions (3 Q)

Q10. What is a VPC and what's a subnet?

A VPC (Virtual Private Cloud) is your own isolated network inside one AWS region. It has a CIDR range (e.g., 10.0.0.0/16). A subnet is a smaller CIDR range within the VPC, tied to one Availability Zone. Subnets are public if they have a route to an Internet Gateway, private if they don't. The typical pattern is to create public subnets for load balancers and bastion hosts, and private subnets for application servers and databases.
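
Subnet math is a common follow-up, and Python's standard `ipaddress` module is enough to sketch it. The example carves one public and one private /24 per Availability Zone out of the 10.0.0.0/16 range mentioned above. The AZ names and the `plan_subnets` helper are assumptions for illustration; the subtraction of five addresses reflects the addresses AWS reserves in every subnet.

```python
import ipaddress

AWS_RESERVED_PER_SUBNET = 5  # AWS reserves the first four and the last address of each subnet


def plan_subnets(vpc_cidr: str, new_prefix: int, azs: list[str]) -> list[dict]:
    """Carve one public and one private subnet per AZ out of the VPC range."""
    vpc = ipaddress.ip_network(vpc_cidr)
    blocks = vpc.subnets(new_prefix=new_prefix)  # generator of equal-size child networks
    plan = []
    for tier in ("public", "private"):
        for az in azs:
            block = next(blocks)
            plan.append({
                "tier": tier,
                "az": az,
                "cidr": str(block),
                "usable_ips": block.num_addresses - AWS_RESERVED_PER_SUBNET,
            })
    return plan


plan = plan_subnets("10.0.0.0/16", 24, ["us-east-1a", "us-east-1b"])
for subnet in plan:
    print(subnet)
```

Note that "public" and "private" here are only labels. What actually makes a subnet public is its route table pointing at an Internet Gateway.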

Q11. What's the difference between a Security Group and a Network ACL?

A Security Group is an instance-level firewall. Stateful (return traffic is allowed automatically), supports allow rules only, attaches to ENIs (so to EC2 instances, RDS, Lambda, etc.). A Network ACL is a subnet-level firewall. Stateless (you must allow return traffic explicitly), supports both allow and deny rules, applies to all traffic crossing the subnet boundary. In practice, almost everyone configures Security Groups and leaves Network ACLs at defaults. Network ACLs are the second line of defense, useful for blocking specific CIDR ranges at the subnet level.

Q12. What's a NAT Gateway and when do you need one?

A NAT Gateway lets instances in private subnets reach the internet for outbound traffic (e.g., to call an external API or download packages) without exposing them to inbound traffic. It lives in a public subnet, gets a public IP, and translates outbound traffic from private subnets. The cost trap: a NAT Gateway bills an hourly fee plus a per-GB fee on all data it processes, which adds up fast. For high-egress workloads, VPC endpoints (for AWS services like S3 and DynamoDB) cut the bill by keeping that traffic inside AWS and off the NAT Gateway.

IAM interview questions (3 Q)

Q13. What's the difference between an IAM user, role, and policy?

An IAM user is a person or application with long-term credentials. An IAM role is a set of permissions assumed temporarily by users, services, or other AWS accounts. An IAM policy is a JSON document defining what actions are allowed on what resources. Roles and users attach policies to gain permissions. The 2026 best practice is to avoid IAM users (especially with long-lived access keys) and grant permissions through roles whenever possible. IAM Identity Center (formerly AWS SSO) handles human access. EC2 instance profiles and Lambda execution roles handle service access.

Q14. What does least privilege mean and how do you enforce it?

Grant the minimum permissions required to do the job. A Lambda function that reads from one S3 bucket should have an IAM role that allows only s3:GetObject on the objects in that specific bucket (arn:aws:s3:::bucket-name/*). Not s3:*. Not Resource "*". Enforcement: write narrow policies from the start, use IAM Access Analyzer to find unused permissions, review and tighten policies quarterly. The hard part is operational. Developers often start with overly permissive policies during development and forget to tighten them before production.
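
Here is what that narrow policy looks like as a policy document, written as a Python dictionary, next to a small check that flags wildcard grants. The bucket name and the `find_wildcards` helper are hypothetical, and the check is a teaching sketch rather than a replacement for IAM Access Analyzer.

```python
BUCKET = "example-reports-bucket"  # hypothetical bucket name

read_only_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            # Object-level actions target object ARNs, hence the trailing /*.
            "Resource": [f"arn:aws:s3:::{BUCKET}/*"],
        }
    ],
}

too_broad_policy = {
    "Version": "2012-10-17",
    "Statement": [{"Effect": "Allow", "Action": "s3:*", "Resource": "*"}],
}


def _as_list(value):
    """IAM allows a single string or a list in most fields; normalize to a list."""
    return value if isinstance(value, list) else [value]


def find_wildcards(policy: dict) -> list[str]:
    """Describe each Allow statement that uses a wildcard action or a bare '*' resource."""
    findings = []
    for i, stmt in enumerate(_as_list(policy["Statement"])):
        if stmt.get("Effect") != "Allow":
            continue
        for action in _as_list(stmt.get("Action", [])):
            if action == "*" or action.endswith(":*"):
                findings.append(f"statement {i}: wildcard action {action}")
        for resource in _as_list(stmt.get("Resource", [])):
            if resource == "*":
                findings.append(f"statement {i}: wildcard resource")
    return findings


print(find_wildcards(read_only_policy))
print(find_wildcards(too_broad_policy))
```

The trailing `/*` on the object ARN is a resource pattern scoped to one bucket, which is why the check only flags a bare `*` resource.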

Q15. What's the difference between an IAM policy and a resource-based policy?

An IAM policy (an identity-based policy) is attached to a user or role and defines what that identity can do. A resource-based policy is attached to a resource (like an S3 bucket or a Lambda function) and defines who can access that resource. The two work together. Within a single account, a request to read an S3 object succeeds if either the identity's IAM policy or the bucket policy allows s3:GetObject and nothing explicitly denies it. Across accounts, both sides must allow it: the caller's IAM policy in its own account and the bucket policy in the owning account. That makes resource-based policies essential for cross-account access, where you don't have an IAM identity in the other account to attach a policy to.
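
The way the two policy types combine can be sketched as a small decision function. This is a deliberately simplified model of AWS's documented evaluation order: it ignores service control policies, permission boundaries, session policies, and service-specific exceptions, and `is_allowed` is a hypothetical name.

```python
VALID_EFFECTS = ("allow", "deny", "none")  # "none" means no statement matched


def is_allowed(identity_effect: str, resource_effect: str, same_account: bool) -> bool:
    """Combine the identity-based and resource-based policy results for one request."""
    effects = (identity_effect, resource_effect)
    if any(e not in VALID_EFFECTS for e in effects):
        raise ValueError(f"effects must be one of {VALID_EFFECTS}")
    # An explicit deny in either policy always wins.
    if "deny" in effects:
        return False
    # Same account: an allow from either policy is enough.
    if same_account:
        return "allow" in effects
    # Cross-account: both the caller's policy and the resource policy must allow.
    return identity_effect == "allow" and resource_effect == "allow"


print(is_allowed("allow", "none", same_account=True))    # identity policy alone, same account
print(is_allowed("allow", "none", same_account=False))   # cross-account without a bucket policy
print(is_allowed("allow", "deny", same_account=True))    # explicit deny wins
```

The default when nothing matches is an implicit deny, which is why both `"none"` inputs return `False` in either mode.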

AWS scenario-based interview questions (10 Q)

Scenario questions are where the entry-level AWS interview gets won or lost. Each question below is an open-ended architecture prompt. The sample answer outline is the bones of a passing response, not a full canned answer. Adapt the language to your own voice.

Scenario 1: Design a URL shortener that handles 10K daily users.

API Gateway in front of Lambda for the write endpoint (POST /shorten) and read endpoint (GET /:code). DynamoDB to store the short-code-to-long-URL mapping with the short code as the partition key. CloudFront in front of the read endpoint to cache popular redirects. Generate short codes by hashing the long URL or by base62-encoding a counter from DynamoDB. State the trade-off: hash is simpler but can collide. Counter requires a hot partition for the counter row but guarantees uniqueness.
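
The counter option is short enough to sketch. The code below base62-encodes an integer into a short code and decodes it back. The alphabet order is an arbitrary choice, and in a real system the integer would come from an atomic counter (for example, an atomic increment on a DynamoDB item), which is assumed here rather than shown.

```python
import string

# 0-9, a-z, A-Z: 62 URL-safe symbols. The ordering is an arbitrary choice.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET)


def encode_base62(n: int) -> str:
    """Turn a non-negative counter value into a short code."""
    if n < 0:
        raise ValueError("counter must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, BASE)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


def decode_base62(code: str) -> int:
    """Inverse of encode_base62."""
    n = 0
    for ch in code:
        n = n * BASE + ALPHABET.index(ch)
    return n


for counter in (0, 61, 62, 1_000_000):
    print(counter, "->", encode_base62(counter))
```

Six base62 characters cover about 56.8 billion codes, so code length is rarely the constraint. The design question the interviewer usually wants to hear about is the counter itself: sequential codes are guessable, and a single counter row concentrates writes.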

Scenario 2: Design a photo-sharing app that supports uploads, thumbnails, and feeds.

S3 for original images (uploaded via presigned URL directly from the browser, never through your backend). Lambda triggered on S3 upload to generate thumbnails. DynamoDB for the user-to-photo metadata table. CloudFront in front of S3 for global image delivery. API Gateway plus Lambda for the feed endpoint, which queries DynamoDB. Cognito for user authentication. Trade-off discussion: presigned URLs save your backend from being a file-upload bottleneck. The thumbnail generation could be sync (in the upload Lambda) or async (via S3 event to a separate Lambda). Async is more scalable.

Scenario 3: Design a system that processes 10 million log events per hour.

10M events per hour is about 2,800 per second. Kinesis Data Streams or Managed Streaming for Kafka (MSK) for the ingestion layer. Lambda or Kinesis Data Firehose to consume from the stream. Firehose for batch-and-load into S3 in Parquet format. Athena or Redshift for ad-hoc querying. OpenSearch for full-text search if needed. Trade-off: Firehose is simpler but introduces 60+ seconds of latency. Streams plus Lambda gives lower latency but more code to maintain.
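
Doing the back-of-envelope math out loud is part of the answer. The sketch below converts the hourly volume into a per-second rate and estimates a minimum shard count for Kinesis Data Streams in provisioned mode, using the per-shard write limits of 1,000 records per second and 1 MB per second. The 1 KiB average event size is an assumption; ask the interviewer for the real number.

```python
import math

EVENTS_PER_HOUR = 10_000_000
AVG_EVENT_BYTES = 1024               # assumption: roughly 1 KiB per log event
SHARD_RECORDS_PER_SEC = 1_000        # per-shard write limit, records
SHARD_BYTES_PER_SEC = 1024 * 1024    # per-shard write limit, treated here as 1 MiB/s

events_per_sec = EVENTS_PER_HOUR / 3600
bytes_per_sec = events_per_sec * AVG_EVENT_BYTES

# Whichever limit binds first sets the minimum shard count.
shards_by_records = math.ceil(events_per_sec / SHARD_RECORDS_PER_SEC)
shards_by_bytes = math.ceil(bytes_per_sec / SHARD_BYTES_PER_SEC)
min_shards = max(shards_by_records, shards_by_bytes)

print(f"{events_per_sec:,.0f} events/s, {bytes_per_sec / 1e6:.2f} MB/s")
print(f"minimum shards at average load: {min_shards}")
```

This is a floor at average load. Log traffic is bursty, so a real answer adds headroom for peaks or uses on-demand capacity mode and says why.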

Scenario 4: Design a real-time chat backend.

API Gateway with WebSocket API for the persistent connection. Lambda to handle connect, disconnect, and message events. DynamoDB to store the connection-ID-to-user mapping (so you can route messages to the right connection). DynamoDB or another store for message history. SNS or DynamoDB Streams for fan-out when one user sends a message to multiple recipients. Trade-off: API Gateway WebSocket has a 10-minute idle timeout and a 2-hour total connection limit, which forces the client to handle reconnects gracefully. For chat with millions of concurrent connections, the AppSync alternative or a custom EC2-based WebSocket server might be cheaper at scale.

Scenario 5: Design a video transcoding pipeline.

S3 for upload bucket. S3 event triggers a Lambda that submits a job to MediaConvert (the managed transcoding service). MediaConvert outputs multiple resolutions to another S3 bucket. DynamoDB to track job status. CloudFront in front of the output bucket for delivery. If MediaConvert is over-engineered for the scale, an EC2 fleet running ffmpeg with SQS for job queueing is the older-school approach. Discuss the cost trade-off: MediaConvert is operationally simpler but more expensive per minute of video.

Scenario 6: Design an e-commerce checkout system.

ALB in front of an Auto Scaling Group of EC2 (or Fargate) running the checkout service. RDS Aurora Multi-AZ for the orders table with strong consistency guarantees. DynamoDB for the cart (it can be eventually consistent and high throughput). SQS for the order-placed event queue, with Lambda consumers handling email confirmation and inventory updates. ElastiCache for session storage. Trade-off discussion: RDS over DynamoDB for orders because the access pattern needs transactional integrity. DynamoDB for cart because the access pattern is single-key read/write at high volume.

Scenario 7: Design a real-time leaderboard for a mobile game.

ElastiCache for Redis with sorted sets for the leaderboard (ZADD to update score, ZREVRANGE to get top N). API Gateway plus Lambda as the write path. DynamoDB or RDS as the durable store for player records. Trade-off discussion: Redis sorted sets are the canonical leaderboard data structure. Doing this on DynamoDB is possible (Global Secondary Index sorted by score) but slower at high write throughput. ElastiCache Redis is the right call here unless cost is the dominant constraint.
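
If the interviewer asks what a sorted set actually does, a tiny in-memory model helps. The class below mimics the semantics of ZADD and ZREVRANGE with a dictionary and a sort. It is an illustration of behavior only: Redis keeps the set ordered on every write, while this sketch re-sorts on every read, and the `Leaderboard` class is a hypothetical stand-in rather than a Redis client.

```python
class Leaderboard:
    """In-memory stand-in for a Redis sorted set keyed by player name."""

    def __init__(self) -> None:
        self._scores: dict[str, float] = {}

    def zadd(self, member: str, score: float) -> None:
        """Insert the member or overwrite its score, like ZADD."""
        self._scores[member] = score

    def zrevrange(self, start: int, stop: int) -> list[tuple[str, float]]:
        """Members from highest to lowest score, inclusive indexes, like ZREVRANGE."""
        ranked = sorted(self._scores.items(),
                        key=lambda kv: (kv[1], kv[0]), reverse=True)
        return ranked[start:stop + 1]

    def top(self, n: int) -> list[tuple[str, float]]:
        return self.zrevrange(0, n - 1)


board = Leaderboard()
board.zadd("alice", 120)
board.zadd("bob", 95)
board.zadd("carol", 150)
board.zadd("alice", 160)  # a new score replaces the old one

print(board.top(2))
```

The overwrite-on-write behavior is the point to call out: a leaderboard stores one current score per player, so the durable store (DynamoDB or RDS) is where history lives.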

Scenario 8: Design an IoT data ingestion pipeline.

IoT Core to receive MQTT messages from devices. Kinesis Data Streams to buffer the data. Lambda to process and enrich. Firehose to land batches in S3 in Parquet. Timestream or DynamoDB for time-series queries. Athena for ad-hoc analysis. Trade-off: Timestream is purpose-built for time-series and meaningfully cheaper than DynamoDB at this access pattern, but DynamoDB is more familiar to most teams. Discuss the choice based on team expertise.

Scenario 9: Design a centralized logging system.

CloudWatch Logs as the default destination. Subscription filters to Firehose to land logs in S3 in Parquet. OpenSearch for searchable indexing of recent logs (7-30 days). Lifecycle rules on the S3 bucket to move older logs to Glacier. Trade-off: OpenSearch is expensive. Many teams skip it and use Athena directly against S3 for older logs, accepting slower query latency. For real-time alerting, CloudWatch metric filters with alarms.

Scenario 10: Design a multi-region disaster recovery setup.

Route 53 with health checks and failover routing as the entry point. Active-passive: primary region serves all traffic, secondary region stands by. Aurora Global Database for the database tier (typically sub-second replication lag). S3 Cross-Region Replication for static assets. Auto Scaling Groups configured in both regions. In the standby region, either keep a minimal fleet running (warm standby) or scale to zero until failover (pilot light). Name which one you mean, because the two have different recovery times and costs. Trade-off discussion: active-passive is cheaper but slower to recover. Active-active is faster recovery but doubles compute cost and complicates write consistency. For a CS-new-grad scenario, active-passive is the right starting answer.

AWS DevOps interview questions (5 Q)

DevOps loops add CI/CD, infrastructure-as-code, and monitoring on top of the service-knowledge floor. Five questions worth rehearsing.

Q16. What's the difference between CloudFormation and Terraform?

CloudFormation is AWS's native infrastructure-as-code tool, using YAML or JSON templates. Terraform is HashiCorp's multi-cloud tool using HCL syntax. Trade-offs in 2026: CloudFormation is tightly integrated with AWS and supports new services first. Terraform has better state management and module ecosystem, plus it works across providers. Most non-AWS-only teams use Terraform. AWS-only teams split roughly 50-50, often based on what the team is used to.

Q17. How would you design a CI/CD pipeline on AWS?

CodePipeline as the orchestrator. CodeCommit (or GitHub) as the source. CodeBuild for the build stage. CodeDeploy for the deployment. Each stage triggers the next. For container workloads, ECR holds the images and ECS or EKS handles the deploy. For serverless, deploy via CloudFormation or SAM templates. The interview signal isn't naming the services. It's discussing the trade-offs: rolling vs blue-green vs canary deploys, manual approval gates between staging and prod, rollback strategy.

Q18. What's the difference between CloudWatch Logs, CloudWatch Metrics, and X-Ray?

CloudWatch Logs ingests log lines from your applications. CloudWatch Metrics tracks numerical time-series (CPU, request count, custom metrics). X-Ray traces requests across distributed services so you can see where latency comes from. The 2026 senior signal: knowing that CloudWatch Logs Insights lets you query logs with a SQL-like syntax, and that you can extract metrics from logs using metric filters. Without that, you end up paying for both Logs and a separate Metrics ingestion of the same data.

Q19. What's an Auto Scaling Group and how does it work with CloudWatch?

An ASG maintains a target number of EC2 instances across one or more AZs. CloudWatch metrics (CPU utilization, custom metrics) trigger scaling policies that add or remove instances. Step scaling adds N instances when a metric crosses a threshold. Target tracking scales to keep a metric at a target value (e.g., keep average CPU at 50%). Predictive scaling forecasts demand and pre-scales. The interview signal: knowing that ASG works for EC2 but not for Lambda (Lambda scales automatically without an ASG concept).
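
Target tracking is easiest to explain with the proportional rule of thumb behind it: if the metric is 50% above target, you need roughly 50% more capacity. The function below is a simplified model of that idea, not AWS's actual algorithm, which also involves CloudWatch alarms, cooldowns, and conservative scale-in. `desired_capacity` and its parameters are hypothetical names.

```python
import math


def desired_capacity(current: int, metric: float, target: float,
                     min_size: int, max_size: int) -> int:
    """Estimate the instance count that would bring `metric` back to `target`."""
    if current < 1 or target <= 0:
        raise ValueError("current must be >= 1 and target must be > 0")
    # Proportional estimate, rounded up so we never under-provision.
    raw = math.ceil(current * metric / target)
    # The ASG never goes outside its configured bounds.
    return max(min_size, min(max_size, raw))


# 4 instances averaging 75% CPU against a 50% target.
print(desired_capacity(current=4, metric=75, target=50, min_size=2, max_size=10))
# Load drops to 20% CPU: scale in, but not below min_size.
print(desired_capacity(current=4, metric=20, target=50, min_size=2, max_size=10))
# A spike that would need more than max_size is clamped.
print(desired_capacity(current=8, metric=95, target=50, min_size=2, max_size=10))
```

The clamp is worth mentioning in the interview: a `max_size` set too low turns a traffic spike into an outage, and one set too high turns a bug into a bill.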

Q20. How do you deploy a new version of a containerized service with zero downtime?

Three main patterns. Rolling deploy: gradually replace old containers with new ones, keeping the load balancer healthy. Simple but old and new code run simultaneously during the rollout. Blue-green: spin up an entire new fleet with the new version, swap the load balancer target group when ready, keep the old fleet for rollback. More resource-intensive but cleaner cutover. Canary: route a small percentage of traffic to the new version, monitor, gradually increase. The safest for risky changes. ECS, EKS, and Lambda all support these patterns natively with different mechanisms.
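
The canary idea, sending a fixed small share of traffic to the new version, can be illustrated with a deterministic hash split. This is a generic sketch of the technique, not how ECS, EKS, or Lambda implement it (they use mechanisms such as weighted target groups and weighted aliases). `routes_to_canary` and the request IDs are hypothetical.

```python
import hashlib


def routes_to_canary(request_id: str, canary_percent: int) -> bool:
    """Deterministically assign a request to the canary based on a hash bucket (0-99)."""
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    return bucket < canary_percent


# Simulate a 10% canary over a batch of synthetic request IDs.
request_ids = [f"req-{i}" for i in range(10_000)]
canary_hits = sum(routes_to_canary(r, 10) for r in request_ids)
print(f"{canary_hits / len(request_ids):.1%} of requests went to the canary")
```

Hashing on a stable key such as a user ID, rather than picking randomly per request, keeps each user on one version for the whole rollout, which makes canary metrics easier to interpret.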

AWS Solutions Architect interview questions (5 Q)

Solutions Architect loops emphasize scenario design, trade-off articulation, and the Well-Architected Framework. Five questions worth rehearsing if you're targeting an SA-Associate path.

Q21. What are the pillars of the Well-Architected Framework?

Six pillars as of late 2025: Operational Excellence (runbooks, observability, automation), Security (least privilege, encryption, defense in depth), Reliability (multi-AZ, recovery objectives, fault isolation), Performance Efficiency (right-sizing, scaling, monitoring), Cost Optimization (instance pricing models, lifecycle policies, unused-resource cleanup), and Sustainability (energy efficiency, region selection). The interview signal isn't reciting all six. It's bringing one or two into a scenario discussion: "I'd pick Aurora over RDS here for the Reliability pillar, but I'd also consider the cost trade-off."

Q22. How do you choose between RDS, Aurora, and DynamoDB?

RDS is AWS's managed relational database service (Postgres, MySQL, MariaDB, SQL Server, Oracle). Aurora is AWS's cloud-native re-architecture of MySQL and Postgres with better performance and scalability. DynamoDB is a managed NoSQL key-value and document store. Decision tree: if the access pattern requires complex SQL with joins, pick RDS or Aurora. If it's key-value or document with predictable single-key access, pick DynamoDB. Aurora over RDS for any new MySQL or Postgres workload unless the team has a strong engine-specific reason to stay on standard RDS. DynamoDB scales to billions of items if your access pattern fits. RDS scales writes vertically with a hard ceiling, and read replicas only help with reads.

Q23. How do you secure data at rest and in transit on AWS?

At rest: enable encryption on every storage service (S3 bucket-level encryption, EBS volume encryption, RDS storage encryption, DynamoDB encryption). Use KMS-managed keys for centralized key management and audit. In transit: TLS everywhere. Force HTTPS on CloudFront and ALB, use VPC endpoints to keep AWS-service traffic off the public internet, use TLS for RDS connections. The Well-Architected Security pillar prioritizes both as defaults, not opt-ins.

Q24. How would you migrate an on-premises application to AWS?

Six R's of migration: Rehost (lift and shift), Replatform (lift and reshape, e.g., move to RDS), Repurchase (switch to a SaaS), Refactor (re-architect for cloud-native), Retire (kill it), Retain (keep on-prem). The interview signal: knowing that rehost is fastest but leaves cost optimization on the table, and refactor is slowest but gets the best long-term economics. For most enterprise migrations, the answer is a portfolio: rehost the time-pressed apps first to clear the data center, then refactor the highest-cost or most-strategic apps over the following 12-24 months.

Q25. How do you design for the AWS Shared Responsibility Model?

AWS is responsible for security OF the cloud (physical data centers, host operating systems, network infrastructure). Customer is responsible for security IN the cloud (your data, your IAM policies, your application code, your guest OS patching on EC2). For managed services (Lambda, S3, DynamoDB), AWS handles more. For self-managed services (EC2, ECS on EC2), you handle more. The interview signal: knowing where the line falls for each service type and what you, as the customer, are still on the hook for.

AWS interview questions for freshers (without production AWS experience)

The hard truth: most CS new grads applying to AWS-adjacent roles in 2026 don't have production AWS experience. Hiring managers know this. They calibrate accordingly. What separates the candidates who advance from the ones who don't isn't years of experience. It's whether the candidate has done the free-tier work to make their answers credible.

Five honest framings that work for new grads at the AWS interview:

Framing 1: "I've worked through the free tier on three small projects."

Best opener for the experience question. Specific, honest, gives the interviewer a hook to dig into. Follow up with one specific detail from one of the projects: "On the URL shortener, I learned the hard way that DynamoDB partition-key design matters because my counter row became a hot partition during testing." That kind of specific failure-and-learning anecdote turns "no production experience" into "has actually used the services."

Framing 2: "I haven't run anything at production scale, but here's what I've thought about."

Use when the interviewer asks about scaling, multi-region, or production operational concerns. Don't pretend. Pivot to the trade-off conversation. "I haven't operated this at scale, but the trade-off I'd think about is X vs Y, and I'd lean toward Y because Z." That answer demonstrates reasoning even when the experience isn't there.

Framing 3: Lean on your portfolio repo.

If you have one GitHub repo where you built a real-ish project using four AWS services, reference it. "I built a small note-taking API using API Gateway, Lambda, DynamoDB, and Cognito. The README walks through the architecture." Interviewers love specifics they can verify. The repo is the credibility anchor.

Framing 4: Talk through someone else's architecture postmortem.

Cite an AWS architecture talk or case study you've read. "I read the recent AWS re:Invent talk on how Netflix uses Spinnaker for deploys, and one thing that stuck with me was the trade-off between rolling and blue-green deploys at their scale." Citing external sources signals that you're engaged with the community, not just memorizing for the interview.

Framing 5: Ask the right clarifying questions.

When the interviewer poses a scenario, don't dive into services. Ask. "Before I sketch this, can I confirm: are we optimizing for cost or latency? Is this a global user base or a single region? Do we have a budget constraint?" Two or three clarifying questions buy you 60 seconds of thinking time and signal that you understand requirements drive architecture, not the other way around. This single behavior changes the interviewer's read of you more than any service-knowledge answer.

The honest reality I'd add as a founder: AWS interviews for entry-level roles in 2026 are friendly to candidates who did the free-tier work. They're brutal to candidates who didn't. The differentiator isn't years of experience. It's three weeks of hands-on time before the interview. Spend that time.

How to prepare for an AWS interview when you've never run prod AWS

A focused three-week prep plan, calibrated for a CS new grad whose AWS is at the "read the landing pages, never built anything" starting point. Adjust if you're further along.

  1. Week 1, day 1: Open a free-tier account and lock it down. Sign up for AWS, enable MFA on the root account, create an IAM user for yourself with admin permissions, stop using the root account. Set up a billing alarm at $5 so you don't get a surprise. This is the boring half-hour that prevents every other hour from going wrong.

  2. Week 1, days 2-5: Hands-on with the five major services. EC2: launch a t2.micro, SSH into it, install nginx, terminate it. S3: create a bucket, upload a file, enable versioning, write a lifecycle rule, delete the bucket. Lambda: write a Python function that returns "hello world", invoke it via API Gateway, then trigger it from an S3 event. VPC: create a VPC with a public and private subnet, launch an EC2 in the private subnet, configure a NAT Gateway for outbound internet. IAM: create a role for your Lambda that allows only read access to one specific S3 bucket. Tear everything down each night.

  3. Week 1, days 6-7: Build a small end-to-end project. Pick something simple. A note-taking API with API Gateway, Lambda, DynamoDB, and Cognito. Document the architecture in a README on GitHub. This becomes your portfolio anchor. The interview answer changes from "I've read about these services" to "I built a small system using these four services and here's what I learned."

  4. Week 2: Study AWS architecture content and one Well-Architected pillar per day. Watch two or three AWS re:Invent talks on systems similar to your target roles. Read the Well-Architected Framework whitepaper, one pillar per day. The vocabulary you absorb here is what makes your scenario answers sound credible. Without it, your answers will read as "I memorized the AWS landing pages."

  5. Week 3, days 15-19: Scenario design drills. Sketch architectures out loud for the 10 scenarios in this guide. Eight minutes per scenario. Whiteboard or shared screen, narrating as you draw. Time yourself and review. Practice the trade-off conversation at the end of each: "I'd start with X. If Y becomes the bottleneck, I'd add Z." That's the senior signal even at the entry level.

  6. Week 3, days 20-21: Two timed mock interviews. First mock: service-knowledge focus, 25 questions across EC2, S3, Lambda, VPC, IAM. Second mock: scenario design, one extended prompt and one cost / security trade-off discussion. Narrate everything out loud. Use a peer, a paid mock-interview service, or an AI-driven tool. The first run feels brutal. By the second, you'll know where the gaps are.

The non-negotiable week is week one. AWS interviewers can tell within 60 seconds whether you've used the console or only read about the services. There's no shortcut for that calibration.

AWS interview format by role type

The same AWS surface area gets tested differently depending on what role you're interviewing for. The breakdown for the five most common AWS-adjacent roles hiring new grads in 2026:

| Role | Service-knowledge depth | Scenario design weight | DevOps tooling | Cost / security focus | Domain extras |
| --- | --- | --- | --- | --- | --- |
| Cloud Engineer | High (VPC, networking, IAM) | High | Medium (CloudFormation, basic CI/CD) | High (least privilege, cost optimization) | Linux, networking, scripting |
| DevOps Engineer | Medium-High | Medium | High (CodePipeline, Terraform, observability) | Medium | CI/CD, Kubernetes, containers |
| SRE | Medium | Medium-High (reliability scenarios) | High (monitoring, alerting, runbooks) | Medium (reliability cost trade-offs) | Linux, Python, on-call mindset |
| Solutions Architect | High | Very High (the entire interview) | Low-Medium | High (Well-Architected pillars) | Pre-sales communication, customer empathy |
| SWE with AWS | Medium | Medium | Low | Low-Medium | Algorithm rounds, system design, language-specific depth |

Two patterns to notice. First, every AWS-adjacent role tests the service-knowledge floor. EC2, S3, Lambda, VPC, IAM are universal. Second, scenario design weights more heavily as the role becomes architecture-oriented. Solutions Architect is almost entirely scenarios. SWE with AWS is mostly algorithms plus some service-knowledge spot-checks.

If you're targeting a Cloud Engineer role at a mid-market employer, lean prep toward networking and IAM. If you're targeting an SRE role, lean toward CloudWatch, X-Ray, and the reliability pillar. The role tier you're targeting changes the prep mix more than most candidates realize.

Common AWS interview mistakes for new grads

The seven most-reported mistakes from new-grad AWS interviews in the 2025-2026 hiring cycle, in rough order of frequency:

Memorizing service descriptions instead of trade-offs. A candidate who can recite "S3 is object storage" but can't explain when to pick S3 vs EFS vs EBS reads as someone who studied for the interview but didn't internalize the material. Interviewers grade reasoning, not recall. Drill the comparison questions, not the definition questions.

Overdesigning every scenario. A new grad asked to design a simple CRUD API who proposes API Gateway plus Lambda plus DynamoDB plus ElastiCache plus CloudFront plus Route 53 plus WAF reads as someone who learned the services from a buzzword list. Start simple. Add complexity only when the requirements demand it. State the simpler version first, then say "if we needed X, I'd add Y."

Naming services without depth. A resume that lists EC2, S3, Lambda, VPC, IAM, RDS, DynamoDB, CloudFront, Route 53, CloudWatch, ECS, EKS, Kinesis, MSK, SNS, SQS, SES, MediaConvert, Athena, Glue, Redshift, EMR, Step Functions, EventBridge, AppSync, Cognito but can't pass a basic question on any of them reads as a buzzword resume. List three or four services and be able to talk about all of them at meaningful depth.

Skipping the free-tier hands-on work. The single most-cited gap in entry-level AWS interview feedback. Candidates who haven't used the console freeze on follow-up questions that require knowing which checkbox does what. Three weeks of free-tier time closes 80% of this gap.

Not knowing the IAM basics. Bombing the "what's a role vs a user vs a policy" question is the fastest way to fail an AWS interview. IAM is the gatekeeper to every other service. Interviewers test it as a litmus check. If you can't explain it, you don't get to the interesting questions.

Forgetting that cost is part of the trade-off conversation. New grads default to "what's the most scalable architecture" and miss that the actual question is usually "what's the right architecture for these constraints." If the workload is 100 requests per minute, the answer is Lambda plus DynamoDB, not ECS plus RDS Multi-AZ. Talking about cost out loud signals seniority.

Confusing the AWS account / region / AZ hierarchy. An account contains regions. Each region contains multiple AZs. A VPC lives in one region and spans AZs within it. S3 buckets are globally named but stored in one region. Lambda functions are regional. RDS is regional with multi-AZ as an option. New grads who don't have this hierarchy clear in their head fumble basic follow-ups. Draw the hierarchy on paper once and the confusion goes away.

One thing I'd add from watching new grads run AWS interviews: don't try to memorize all seven the night before. Pick the two that match your specific gaps (almost always free-tier hands-on and IAM basics) and close those. The other five take care of themselves once the foundation is there.
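
The cost mistake above is easier to avoid once you have done the arithmetic yourself at least once. The sketch below estimates a monthly Lambda bill for the 100-requests-per-minute workload. Everything numeric in it is an assumption for illustration: the 100 ms average duration, the 128 MB memory setting, and the two prices passed in as parameters. Check the current AWS pricing page before quoting figures, and note that the free-tier allowance is ignored.

```python
def lambda_monthly_cost(
    requests_per_minute,
    avg_duration_ms,
    memory_mb,
    price_per_million_requests,
    price_per_gb_second,
    days=30,
):
    """Return (monthly_requests, estimated_monthly_cost) for a Lambda workload."""
    requests = requests_per_minute * 60 * 24 * days
    # Lambda compute is billed in GB-seconds: duration times configured memory.
    gb_seconds = requests * (avg_duration_ms / 1000) * (memory_mb / 1024)
    request_cost = requests / 1_000_000 * price_per_million_requests
    compute_cost = gb_seconds * price_per_gb_second
    return requests, request_cost + compute_cost


# Illustrative inputs only; the prices are placeholders, not quoted rates.
requests, cost = lambda_monthly_cost(
    requests_per_minute=100,
    avg_duration_ms=100,
    memory_mb=128,
    price_per_million_requests=0.20,
    price_per_gb_second=0.0000166667,
)
print(f"{requests:,} requests/month, about ${cost:.2f}/month")
```

Under these assumed inputs the total comes out to a couple of dollars a month. Being able to say that order of magnitude out loud, and contrast it with paying for always-on instances, is what the trade-off conversation sounds like.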

Key terms

EC2 / EBS / EFS
EC2 is virtual servers. EBS is block storage attached to one EC2 instance (like a hard drive). EFS is shared file storage accessible from multiple instances (like a network file share). Use EBS for databases and OS volumes. Use EFS for shared application data across a fleet. EC2 instance store is a third option: ephemeral local SSD, fast but lost when the instance stops or terminates.
VPC / subnet / NAT / IGW
VPC is your isolated network in one region. Subnet is a smaller CIDR range tied to one AZ. IGW (Internet Gateway) connects the VPC to the internet for subnets whose route table points at it. NAT Gateway lets private-subnet resources reach the internet outbound only. The canonical pattern: public subnets with a route to the IGW for load balancers, private subnets with no IGW route for app servers and databases, NAT Gateway in a public subnet for private-subnet outbound traffic.
IAM role vs policy vs user
An IAM user is a person or app with long-term credentials. An IAM role is permissions assumed temporarily, used by people via assume-role and by services via instance profiles or execution roles. An IAM policy is a JSON document defining allowed actions on resources. Roles and users attach policies. The 2026 best practice is roles over users, scoped tightly via least privilege.
S3 storage classes (Standard / IA / Glacier)
S3 Standard for frequently accessed data, expensive to store and cheap to read. S3 Standard-IA for monthly-or-less reads, cheaper storage and a per-GB retrieval fee. S3 Glacier Instant Retrieval for quarterly access. S3 Glacier Deep Archive for compliance archives. Lifecycle rules transition objects automatically based on age. Pick by access pattern, not by data size.
Lambda cold start
The latency when AWS spins up a new container for your function. The runtime initializes, dependencies load, your handler starts. 100-500ms for Python and Node. 1-3 seconds for Java and .NET. Mitigations: lighter runtime, trimmer dependencies, Provisioned Concurrency to keep containers always warm.
CloudFront / Route 53
CloudFront is AWS's global CDN. It caches content at edge locations near users for low-latency delivery. Use it in front of S3 for static assets, in front of an ALB for API caching, and in front of API Gateway for global APIs. Route 53 is AWS's managed DNS. Use it for domain registration, DNS records, health checks, and failover routing. The two often work together: Route 53 points to CloudFront, and CloudFront serves cached responses and fetches misses from your origin services.
Auto Scaling Group / Launch Template
A Launch Template defines how to launch an EC2 instance (AMI, instance type, IAM role, security groups, user data). An Auto Scaling Group uses the template to maintain N healthy instances across AZs. ASG handles the lifecycle: launching new instances when one dies, scaling out on traffic, scaling in when load drops. The 2026 best practice is Launch Templates over the older Launch Configurations.
RDS vs Aurora vs DynamoDB
RDS is managed relational databases (Postgres, MySQL, SQL Server, etc.) with familiar admin patterns. Aurora is AWS's cloud-native re-architecture of MySQL and Postgres with better performance and scaling. DynamoDB is a managed NoSQL key-value and document store. Decision: complex SQL with joins picks RDS or Aurora. Key-value at high volume picks DynamoDB. Aurora over RDS for new workloads when MySQL or Postgres is the right engine.
Well-Architected Framework
AWS's framework for evaluating architectures across six pillars: Operational Excellence, Security, Reliability, Performance Efficiency, Cost Optimization, and Sustainability. The framework is the lens Solutions Architect interviews use to grade scenario answers. Bringing one or two pillars into a trade-off discussion is the senior signal.

About the author: Alex Chen is the founder of InterviewChamp.AI, building AI interview prep for the new-grad CS market and writing about the modern interview gauntlet from the inside.

Frequently asked questions

What AWS interview questions should I expect in 2026?
Expect three buckets. Service knowledge (40-50% of questions) tests whether you understand EC2, S3, Lambda, VPC, and IAM at a level deeper than the AWS landing page. Scenario design (25-35%) tests whether you can sketch an architecture for a vague prompt like 'design a URL shortener that handles 10K daily users.' Cost and security trade-offs (10-20%) test whether you reason about S3 storage classes, IAM least privilege, and the multi-AZ vs single-AZ choice. Most entry-level AWS interviews lean heavier on service knowledge. Solutions Architect and senior loops lean heavier on scenario design.
Do AWS interviewers expect production experience from CS new grads?
No, but they expect you to have spun up a free-tier account, launched at least one EC2 instance, created at least one S3 bucket with non-default settings, and written at least one Lambda function from scratch. Hands-on work in the free tier is the floor. Production experience is a plus, not a requirement. The honest framing if you're a CS new grad: 'I've worked through the AWS free tier and built three small projects to learn the services, but I haven't run anything at production scale.' That answer keeps the interviewer engaged. Pretending you have production experience you don't have gets you caught in 90 seconds.
What's the difference between EC2 and Lambda?
EC2 gives you a virtual server you control. You choose the OS, install software, manage scaling, and pay by the second the instance runs. Lambda runs your code in response to events without you managing any server. You upload the code, AWS handles the runtime and scaling, and you pay only per invocation plus execution time. The interview-relevant decision: EC2 is right when you need persistent state, long-running processes, or specific OS-level control. Lambda is right when the workload is event-driven, bursty, or short-lived. For a CRUD API getting 100 requests per minute, Lambda is cheaper and operationally simpler. For a video transcoding pipeline running 24/7, EC2 with auto-scaling wins on cost.
What are the main S3 storage classes and when do you use each?
Five tiers worth knowing. S3 Standard for frequently accessed data (read multiple times per month). S3 Standard-IA (Infrequent Access) for data read less than once a month but still needed quickly. S3 One Zone-IA for the same access pattern when you don't need multi-AZ durability. S3 Glacier Instant Retrieval for archives accessed quarterly. S3 Glacier Deep Archive for compliance archives kept for years and accessed maybe once. The interview-relevant trade-off is the dual axis of storage cost vs retrieval cost. Standard is expensive to store and cheap to read. Deep Archive is the opposite. Pick by access frequency, not by storage size.
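
"Pick by access frequency" can be written down as a small decision function. The sketch below is only an illustration of the rule of thumb in this answer: the numeric thresholds (12 reads a year, 4 reads a year, 1 read a year) are assumptions chosen to match "monthly", "quarterly", and "maybe once", and the function name and parameters are hypothetical. The returned strings are the storage class identifiers S3 uses in its API.

```python
def pick_storage_class(reads_per_year, needs_multi_az=True):
    """Map an expected access frequency to an S3 storage class identifier.

    Thresholds are illustrative assumptions, not AWS guidance.
    """
    if reads_per_year >= 12:
        # Read at least monthly: cheap reads matter more than cheap storage.
        return "STANDARD"
    if reads_per_year > 4:
        # Less than monthly but more than quarterly: infrequent access tiers.
        return "STANDARD_IA" if needs_multi_az else "ONEZONE_IA"
    if reads_per_year >= 1:
        # Roughly quarterly or yearly access that still needs fast retrieval.
        return "GLACIER_IR"
    # Compliance archives that are almost never read.
    return "DEEP_ARCHIVE"


for reads in (52, 6, 4, 0.2):
    print(reads, pick_storage_class(reads))
```

In an interview the function itself matters less than the habit it encodes: ask how often the data is read before naming a tier.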
What is a VPC and what are subnets, NAT, and IGW?
A VPC (Virtual Private Cloud) is your own isolated network inside AWS. It has a CIDR range (e.g., 10.0.0.0/16) and lives inside one region. Subnets are smaller CIDR ranges within the VPC, each tied to one Availability Zone. Public subnets have a route to an Internet Gateway (IGW). Private subnets don't. A NAT Gateway lets instances in private subnets reach the internet for outbound traffic (e.g., to install packages or call external APIs) without exposing them to inbound traffic. The interview-relevant question is usually 'where does this resource live in a real architecture?' Load balancers in public subnets. Web and app servers in private subnets behind the load balancer. Databases in private subnets. NAT Gateway in a public subnet to give private subnets outbound access.
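
To make the CIDR part of this answer concrete, the sketch below carves the example 10.0.0.0/16 range into /24 subnets using Python's standard `ipaddress` module. The AZ names, the /24 size, and the `plan_subnets` helper are assumptions for illustration; nothing here calls AWS.

```python
import ipaddress


def plan_subnets(vpc_cidr, azs, new_prefix=24):
    """Assign one public and one private subnet per AZ from the VPC range."""
    vpc = ipaddress.ip_network(vpc_cidr)
    blocks = vpc.subnets(new_prefix=new_prefix)  # generator of non-overlapping blocks
    plan = []
    for tier in ("public", "private"):
        for az in azs:
            plan.append({"tier": tier, "az": az, "cidr": str(next(blocks))})
    return plan


# Hypothetical two-AZ layout for the example VPC range.
plan = plan_subnets("10.0.0.0/16", ["us-east-1a", "us-east-1b"])
for subnet in plan:
    print(subnet)
```

What makes a subnet public or private is its route table, not its CIDR block. The plan only reserves address space; the IGW route on the public subnets and the NAT Gateway route on the private ones are separate steps.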
What is an IAM role vs an IAM user vs an IAM policy?
An IAM user represents a person or application with long-term credentials (access key, password). An IAM role is a set of permissions that can be assumed temporarily by a person, an EC2 instance, a Lambda function, or another AWS service. An IAM policy is a JSON document defining what actions are allowed on what resources. Roles and users attach policies to gain permissions. The 2026 best practice is to minimize IAM users (especially with long-lived access keys) and grant permissions through roles instead. EC2 instance profiles attach a role to the instance so the code running on it gets temporary credentials automatically.
What is a Lambda cold start and how do you reduce it?
A cold start is the latency that happens when AWS spins up a new container to run your Lambda function. The runtime initializes, dependencies load, and your handler code starts. For Python and Node, cold start is typically 100-500ms. For Java and .NET, it can hit 1-3 seconds. Four ways to reduce it. Use a lighter runtime (Python or Node over Java for latency-sensitive APIs). Trim dependencies (every imported library adds initialization time). Use Provisioned Concurrency to keep N containers warm at all times. Attach the function to a VPC only when necessary, because VPC-attached Lambdas historically had slower cold starts. As of late 2025, the VPC penalty is much smaller than it used to be but still measurable.
What's the difference between an Application Load Balancer and a Network Load Balancer?
An Application Load Balancer (ALB) operates at Layer 7 (HTTP/HTTPS). It can route based on URL path, host header, or query string. It also handles TLS termination, sticky sessions, and WebSocket. Use ALB for HTTP APIs and web apps. A Network Load Balancer (NLB) operates at Layer 4 (TCP/UDP). It's faster, supports static IP addresses, and handles millions of requests per second. Use NLB for non-HTTP protocols, ultra-low-latency requirements, or when you need a static IP for whitelisting. The interview-trap question is 'which one for gRPC?' The answer is ALB (since 2020, ALB supports gRPC at Layer 7). The older answer 'NLB' is still floating around in old blog posts.
How do you design a highly available architecture on AWS?
Three principles. First, run across multiple Availability Zones (typically two or three). A single AZ failure shouldn't take down your service. Second, eliminate single points of failure at every layer. Multi-AZ RDS for the database, ALB across multiple AZs for the load balancer, EC2 in an Auto Scaling Group across AZs for the application tier. Third, use managed services where you can. They handle the HA story for you. A typical entry-level scenario answer: ALB in front of an Auto Scaling Group of EC2 instances spread across two AZs, with RDS Multi-AZ for the database and S3 for static assets. Add CloudFront in front for global users. Add Route 53 with health checks for DNS failover if you need multi-region.
What is CloudFormation and how is it different from Terraform?
CloudFormation is AWS's native infrastructure-as-code tool. You write a YAML or JSON template describing the resources you want, and AWS provisions them. Terraform is HashiCorp's multi-cloud tool that does the same thing for AWS, GCP, Azure, and dozens of other providers. The trade-offs in 2026: CloudFormation is tightly integrated with AWS (new services often land in CloudFormation before Terraform), and it's free. Terraform has a richer module ecosystem, better state management, and the HCL syntax is less verbose than CloudFormation YAML. Most non-AWS-only teams have moved to Terraform. AWS-only teams split roughly 50-50.
What does 'least privilege' mean in IAM?
Grant the minimum permissions required to do the job, and nothing more. A Lambda function that reads from one S3 bucket should have an IAM role that allows only s3:GetObject on the objects in that specific bucket (the bucket ARN followed by /*). Not s3:*. Not Resource '*'. The interview-relevant follow-up is usually 'why does it matter?' Because if the function is compromised (bad input, dependency exploit, leaked credentials), the blast radius is limited to what the role can do. A least-privilege role can read one bucket. An over-permissive role can wipe your account.
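
A least-privilege policy for the read-one-bucket case is short enough to write from memory. The sketch below builds the policy document as a Python dictionary and prints it as JSON. The bucket name and the `read_only_bucket_policy` helper are hypothetical. The policy grants `s3:GetObject` on the bucket's objects and nothing else.

```python
import json


def read_only_bucket_policy(bucket_name):
    """Build an IAM policy document allowing object reads from one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                # Object-level actions target the objects, hence the /* suffix.
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            }
        ],
    }


# "example-reports-bucket" is a placeholder name.
policy = read_only_bucket_policy("example-reports-bucket")
print(json.dumps(policy, indent=2))
```

If the function also had to list the bucket, that would be a second statement with `s3:ListBucket` on the bucket ARN itself. Knowing which actions apply to the bucket and which to its objects is a common follow-up.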
What's the difference between Auto Scaling Groups and Launch Templates?
A Launch Template defines how to launch an EC2 instance (AMI, instance type, IAM role, security groups, user data). An Auto Scaling Group (ASG) uses a Launch Template to maintain N healthy instances across one or more AZs. ASG handles the lifecycle: launching new instances when one dies, scaling out on traffic spikes, scaling in when load drops. The 2026 best practice is to use Launch Templates over the older Launch Configurations. They support versioning and newer instance features. The interview signal: knowing that the ASG references a Launch Template by version and that you can update the template without disrupting running instances.
How would you design a URL shortener on AWS?
A canonical scenario question. Sketch the data flow: client sends a long URL, system generates a short code (base62 of a counter or hash), stores the mapping, and returns the short URL. On lookup, client hits the short URL, system finds the long URL, returns a 301 redirect. AWS implementation: API Gateway in front of a Lambda function for write and read endpoints, DynamoDB for the URL mapping (short code as the partition key), CloudFront in front for caching popular redirects. Discuss the trade-offs: DynamoDB over RDS because the access pattern is purely key-value lookups at high read volume. Lambda over EC2 because the workload is bursty and per-request. CloudFront because most short URLs follow a long-tail distribution where a few links get most of the clicks.
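
The "base62 of a counter" step is the one part of this answer interviewers sometimes ask you to code. A minimal sketch follows, assuming the counter is a non-negative integer and using a digits, lowercase, uppercase alphabet. The alphabet order and function names are choices made for this example, not a standard.

```python
import string

# 62 symbols: 0-9, a-z, A-Z (ordering is an arbitrary choice for this sketch).
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def encode_base62(n):
    """Turn a non-negative counter value into a short code."""
    if n < 0:
        raise ValueError("counter must be non-negative")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n:
        n, rem = divmod(n, 62)
        chars.append(ALPHABET[rem])
    return "".join(reversed(chars))


def decode_base62(code):
    """Invert encode_base62 to recover the counter value."""
    n = 0
    for ch in code:
        n = n * 62 + ALPHABET.index(ch)
    return n


print(encode_base62(125))       # short code for counter value 125
print(decode_base62("21"))      # back to the integer
```

In the DynamoDB design above, the encoded string would be the partition key. A sequential counter produces predictable codes, which is worth naming as a trade-off against hashing when the interviewer asks about guessable URLs.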
How do I prepare for an AWS interview as a CS new grad?
Three weeks of focused work. Week 1: spin up a free-tier account, launch an EC2 instance, create an S3 bucket with versioning and lifecycle rules, write a Lambda function that triggers on S3 uploads. Get hands-on. Week 2: read one short overview of each major service (EC2, S3, Lambda, VPC, IAM, DynamoDB, RDS, CloudFront, Route 53, CloudWatch). Build a small project that uses three or four of them together. Week 3: drill scenario questions. Read AWS architecture blog posts. Walk through how you'd build common systems (URL shortener, photo-sharing app, chat backend) and what services you'd pick. Run a timed mock interview where you sketch an architecture out loud. The hands-on week one is non-negotiable. AWS interviewers can tell within 60 seconds whether you've used the services or only read about them.
What's the most common AWS interview mistake new grads make?
Memorizing service descriptions instead of understanding the trade-offs. A candidate who can recite 'S3 is object storage for the web' but can't explain when to use S3 vs EFS vs EBS is going to bomb the first follow-up. Interviewers grade reasoning, not recall. The other common mistake is overdesigning. A new grad asked to design a simple CRUD API who proposes API Gateway plus Lambda plus DynamoDB plus ElastiCache plus CloudFront plus Route 53 plus WAF reads as someone who learned the services from a buzzword list. Strong candidates start simple and add complexity only when the requirements demand it.