Design Instagram — System Design Interview Guide
Design Instagram is a system-design interview that asks you to build a photo-and-video sharing platform: 2B+ users post media, follow each other, and scroll a personalized feed. The hard part is media storage, feed generation, and ranking at scale.
By Alex Chen, Founder, InterviewChamp.AI · Last verified
Reported in interviews at
- Meta
- Snapchat
- Twitter/X
- ByteDance
Sourced from Glassdoor, Levels.fyi, and Blind interview reports.
Functional requirements
- Upload a photo or short video with optional caption
- Follow and unfollow other users
- View a personalized feed of posts from followees (a blend of chronological and ranked ordering)
- Like, comment on, and save a post
- View any user's profile grid (their posts)
Non-functional requirements
- Read-heavy: ~100:1 ratio of feed reads to post writes
- Feed-load latency: <500ms p95 from cold start to first row visible
- Availability: 99.99%; eventually consistent feed (new post visible within seconds) is acceptable
- Scale: 2B MAU, ~500M DAU, ~100M posts/day, peak ~3K post writes/sec, ~300K feed reads/sec
Capacity estimation
Public 2024 scale: ~2B monthly active users, ~500M daily active users, ~100M posts/day (photos + reels). Average photo size after server-side compression ~200 KB; average short-form video ~5 MB. Daily media storage, assuming an 80/20 photo/video split: 100M × (0.8 × 200 KB + 0.2 × 5 MB) ≈ 116 TB/day raw (call it ~120 TB), or ~360 TB/day with 3× replication. Yearly: ~130 PB replicated (~44 PB raw). Storing multiple resolution variants (thumbnail, low-res, full-res, WebP/HEIC) multiplies the raw figure by ~3-5×.
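Post metadata is much smaller: 100M posts × ~500 bytes (user_id, caption, timestamp, media_refs, geo) = 50 GB/day. Feed materialization storage: 500M DAU × 800 entries kept per user × ~100 bytes/entry = ~40 TB in the timeline cache. Feed-read QPS: 500M DAU × ~5 feed loads/day / 86,400 ≈ 29K reads/sec on average, or ~90K reads/sec at a 3× peak. The ~300K reads/sec figure in the requirements is the more conservative number to design for: it corresponds to roughly 17 feed requests per user per day, which is plausible once pagination and pull-to-refresh count as separate reads.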
Ingest: peak post QPS is only ~3K/sec, but the media-upload byte rate is the dominant ingest cost: 3K posts/sec × ~1 MB average media ≈ 3 GB/sec of upload bandwidth at peak.
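The estimates above can be re-derived in a few lines, which is a useful habit for catching slips during an interview. This is a minimal sketch that uses only the inputs stated in this section (decimal units, an 80/20 photo/video split, a 3× peak factor); the variable names are illustrative.

```python
# Back-of-envelope capacity numbers from the inputs stated in this section.
KB, MB, GB, TB, PB = 1e3, 1e6, 1e9, 1e12, 1e15
SECONDS_PER_DAY = 86_400

posts_per_day = 100e6
dau = 500e6
photo_share, photo_size = 0.8, 200 * KB
video_share, video_size = 0.2, 5 * MB

# Media storage
avg_media_bytes = photo_share * photo_size + video_share * video_size
raw_per_day = posts_per_day * avg_media_bytes          # ~116 TB
replicated_per_day = 3 * raw_per_day                   # ~348 TB
replicated_per_year = 365 * replicated_per_day         # ~127 PB

# Metadata and timeline cache
metadata_per_day = posts_per_day * 500                 # 50 GB
timeline_cache = dau * 800 * 100                       # 40 TB

# Request rates (5 feed loads per user per day, 3x peak factor)
avg_feed_qps = dau * 5 / SECONDS_PER_DAY               # ~29K/sec
peak_feed_qps = 3 * avg_feed_qps                       # ~87K/sec
avg_post_qps = posts_per_day / SECONDS_PER_DAY         # ~1.2K/sec
peak_post_qps = 3 * avg_post_qps                       # ~3.5K/sec

if __name__ == "__main__":
    print(f"raw media/day:        {raw_per_day / TB:.0f} TB")
    print(f"replicated media/yr:  {replicated_per_year / PB:.0f} PB")
    print(f"metadata/day:         {metadata_per_day / GB:.0f} GB")
    print(f"timeline cache:       {timeline_cache / TB:.0f} TB")
    print(f"feed reads/sec:       {avg_feed_qps:,.0f} avg, {peak_feed_qps:,.0f} peak")
    print(f"post writes/sec:      {avg_post_qps:,.0f} avg, {peak_post_qps:,.0f} peak")
```

Changing a single input (for example, feed loads per user per day) and re-running shows how sensitive the read-QPS estimate is to that one assumption.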
High-level design
Two-plane architecture similar to other social products: control plane for metadata, data plane for media. The control plane has stateless app servers behind an API gateway. A post service writes metadata to a sharded relational store (sharded by user_id). A feed service materializes personalized feeds; a user-graph service holds follow edges. A discovery service powers search and hashtags from an inverted index.
Media upload flow: the client requests a signed upload URL from the upload service, then uploads the original photo or video directly to object storage. A processing pipeline (triggered by an event from the object store) generates multiple variants — thumbnail, low-res, full-res, and for video, multiple bitrates plus a poster frame. Variants land in object storage; the post metadata row is then written with refs to all variants. Media is served from object storage through a CDN edge cache for everyone.
Feed generation uses a hybrid push-pull model. For ordinary users, when they post, a fanout worker pushes the post_id into each follower's precomputed feed cache (sorted set, top ~800 entries, sharded by user_id). For high-fanout users (>500K followers), no push; the feed service merges those at read time from a separate 'celebrity recent posts' cache. Feed reads then become: read the user's precomputed cache, merge with recent celebrity posts they follow, apply the ranking model, return the top N. Rankings are computed from precomputed engagement features + a learned model that scores each candidate; for v1, pure recency works and is the safe answer if time is tight.
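The hybrid push-pull split can be sketched in memory as follows. This is a toy model under stated assumptions: the `FeedService` class and its method names are hypothetical, plain lists stand in for the sharded sorted-set cache, and ranking is pure recency (the v1 fallback mentioned above).

```python
from collections import defaultdict

CELEBRITY_THRESHOLD = 500_000  # followers above which we stop pushing
FEED_CACHE_SIZE = 800          # entries kept per user


class FeedService:
    def __init__(self, celebrity_threshold=CELEBRITY_THRESHOLD,
                 cache_size=FEED_CACHE_SIZE):
        self.celebrity_threshold = celebrity_threshold
        self.cache_size = cache_size
        self.followers = defaultdict(set)        # author -> follower ids
        self.following = defaultdict(set)        # user -> author ids
        self.feed_cache = defaultdict(list)      # user -> [(ts, post_id)], newest first
        self.celebrity_posts = defaultdict(list)  # author -> [(ts, post_id)]

    def follow(self, follower, author):
        self.followers[author].add(follower)
        self.following[follower].add(author)

    def is_celebrity(self, author):
        return len(self.followers[author]) > self.celebrity_threshold

    def publish(self, author, post_id, ts):
        if self.is_celebrity(author):
            # Pull path: one write, merged into feeds at read time.
            self.celebrity_posts[author].append((ts, post_id))
            return
        # Push path: fan the post_id out to every follower's cache.
        for follower in self.followers[author]:
            cache = self.feed_cache[follower]
            cache.append((ts, post_id))
            cache.sort(reverse=True)
            del cache[self.cache_size:]  # keep only the newest entries

    def read_feed(self, user, limit=50):
        candidates = list(self.feed_cache[user])
        for author in self.following[user]:
            if self.is_celebrity(author):
                candidates.extend(self.celebrity_posts[author])
        candidates.sort(reverse=True)  # pure recency; a ranking model would go here
        return [post_id for _, post_id in candidates[:limit]]
```

In a real system the fanout loop would run in asynchronous workers and the celebrity list would be capped to recent posts; the sketch only shows where the write cost moves between the two paths.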
Deep dive — the hard problem
The deep dive splits across two surfaces: media at the byte level, and feed ranking. On media: the upload pipeline must be elastic because creators upload in bursts (concert events, sports moments) that spike 10× normal. Solution: upload service writes the original to object storage immediately and returns success to the client; the heavy work — transcoding, thumbnail generation, content moderation, copyright fingerprinting — happens asynchronously in a queue-fed worker fleet. The post is visible in the author's profile immediately (using a placeholder rendition if needed) and visible in followers' feeds once processing completes. Choose the queue's ordering semantics carefully: per-user FIFO keeps a creator's posts in order, but global ordering isn't needed.
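Per-user FIFO without global ordering is usually achieved by partitioning the queue on user_id. The sketch below is an in-memory stand-in for that idea; `PartitionedQueue` and its methods are hypothetical names, and a production system would use a partitioned log or queue service with one consumer per partition.

```python
import hashlib
from collections import deque


class PartitionedQueue:
    """Jobs for one user always land in the same partition, so they stay in order."""

    def __init__(self, num_partitions: int):
        self.partitions = [deque() for _ in range(num_partitions)]

    def partition_for(self, user_id: str) -> int:
        # Use a stable hash: Python's built-in hash() for str varies per process.
        digest = hashlib.sha256(user_id.encode()).digest()
        return int.from_bytes(digest[:8], "big") % len(self.partitions)

    def enqueue(self, user_id: str, media_id: str) -> None:
        self.partitions[self.partition_for(user_id)].append((user_id, media_id))

    def drain(self, partition: int):
        """A single worker per partition pops jobs in arrival order."""
        queue = self.partitions[partition]
        while queue:
            yield queue.popleft()
```

Partitions can be drained in any order or in parallel; ordering is only guaranteed within a partition, which is exactly the per-creator guarantee the pipeline needs.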
On feed ranking: the hybrid push-pull split (same as Design Twitter) handles celebrity fanout, but Instagram adds a content-ranking layer that a purely chronological feed does not have. The standard pattern: precompute per-post engagement features (likes/sec, comment rate, time-decayed) in a streaming aggregation tier; for each feed request, fetch ~500 candidate posts (precomputed cache + celebrity merge), score them with a lightweight model that takes (post_features, viewer_features) → predicted engagement score, sort, and return the top 50. The model is served from a fast in-memory tier; features for the viewer are cached per session. Discussing the offline-trained / online-served split signals that you understand the ML/serving boundary even if you don't go deep on the model.
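The online scoring step might look like the sketch below. Everything model-specific here is an assumption for illustration: the weights, bias, and 6-hour half-life are made-up stand-ins for an offline-trained model, `author_affinity` is a hypothetical viewer feature, and a logistic function is just one simple choice of lightweight scorer.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    post_id: str
    likes_per_sec: float    # post feature from the streaming aggregation tier
    comment_rate: float     # post feature
    age_hours: float
    author_affinity: float  # hypothetical viewer feature in [0, 1]


# Hypothetical parameters; in practice these come from offline training.
WEIGHTS = {"likes_per_sec": 0.8, "comment_rate": 0.5, "author_affinity": 2.0}
BIAS = -1.0
HALF_LIFE_HOURS = 6.0


def time_decay(age_hours: float, half_life: float = HALF_LIFE_HOURS) -> float:
    """Exponential decay: engagement counts for half as much every half_life hours."""
    return 0.5 ** (age_hours / half_life)


def score(c: Candidate) -> float:
    """Predicted engagement probability from (post_features, viewer_features)."""
    decay = time_decay(c.age_hours)
    z = (BIAS
         + WEIGHTS["likes_per_sec"] * math.log1p(c.likes_per_sec) * decay
         + WEIGHTS["comment_rate"] * math.log1p(c.comment_rate) * decay
         + WEIGHTS["author_affinity"] * c.author_affinity)
    return 1.0 / (1.0 + math.exp(-z))


def rank(candidates: list[Candidate], top_n: int = 50) -> list[Candidate]:
    """Score every candidate and return the top_n, highest score first."""
    return sorted(candidates, key=score, reverse=True)[:top_n]
```

The serving side stays a pure function of precomputed features, so swapping in a retrained model means shipping new parameters rather than changing the request path.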
A third area worth covering is comment threads. Comments can outnumber posts by 10-50×. The standard pattern stores comments in a separate sharded store keyed by post_id (so all comments for a post live on one shard), with a precomputed 'top 3 comments' cache rendered with the post in the feed; the full comment list is loaded on tap. Comments are never fanned out into followers' feeds at write time: because most posts never get expanded, resolving comments at read time avoids roughly a 10× write fanout, and saying so explicitly is worth the sentence.
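A compact sketch of that comment layout follows. The `CommentStore` class is hypothetical, in-memory dicts stand in for database shards, and ranking the preview by like count is an assumption (the guide does not say how the top 3 are chosen).

```python
import hashlib
from collections import defaultdict


class CommentStore:
    def __init__(self, num_shards: int = 8, preview_size: int = 3):
        self.shards = [defaultdict(list) for _ in range(num_shards)]
        self.preview_cache = {}  # post_id -> top comments shown in the feed
        self.preview_size = preview_size

    def shard_for(self, post_id: str) -> int:
        # Keyed by post_id so a whole thread lives on one shard.
        digest = hashlib.sha256(post_id.encode()).digest()
        return int.from_bytes(digest[:8], "big") % len(self.shards)

    def add_comment(self, post_id: str, comment_id: str, text: str, likes: int = 0):
        thread = self.shards[self.shard_for(post_id)][post_id]
        thread.append({"id": comment_id, "text": text, "likes": likes})
        # One small cache update per comment; nothing is written to any feed.
        top = sorted(thread, key=lambda c: c["likes"], reverse=True)
        self.preview_cache[post_id] = top[: self.preview_size]

    def preview(self, post_id: str) -> list:
        """Cheap read used when rendering the post in a feed."""
        return self.preview_cache.get(post_id, [])

    def full_thread(self, post_id: str) -> list:
        """Single-shard read, performed only when the viewer taps to expand."""
        return list(self.shards[self.shard_for(post_id)].get(post_id, []))
```

Each comment costs one shard write plus one preview update, regardless of how many followers the post's author has.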
Common mistakes
- Storing media bytes in the same database as post metadata — wrecks both storage cost and feed-read latency
- Designing fanout-on-write for all users — the celebrity-fanout problem appears within a year of launch
- Conflating the feed-ranking model with infrastructure — interviewers want a clean offline-train / online-serve split
- Forgetting comments are a separate scaling problem and trying to fanout-write them into every viewer's feed
- Missing CDN — serving full-resolution images directly from object storage gives terrible page-load latency
Likely follow-up questions
- How would your design handle a viral video that hits 1B views in 24 hours?
- What changes if you have to add a 'Stories' feature (ephemeral 24-hour posts)?
- How would you implement private accounts where posts are only visible to approved followers?
- How would you support live video broadcasts with 100K concurrent viewers?
- How would you detect and remove a deepfake video uploaded by a malicious user?
Frequently asked questions
- Is Design Instagram easier than Design Twitter?
- Slightly. The follow-graph and fanout pattern is identical, but Instagram has a lower write rate and its reads tolerate slightly more lag. The added complexity is media at scale, which is usually a separate-plane discussion (CDN, encoding pipeline) and easier to reason about than fanout.
- Do I need to design Reels separately for Design Instagram?
- Briefly. Reels is short-form video on the same backend with a different ranking model and a different surface in the app. Most interviewers accept 'reels share the upload+CDN pipeline; ranking has a heavier ML score'.
- How does Meta typically grade Design Instagram?
- Top signals: push-vs-pull tradeoff, separate plane for media, mention of ranking model. Missing celebrity fanout is a hire-reduce signal at E5+. Source: Glassdoor Meta E5 reports 2023–2024.
- Should I cover image moderation in Design Instagram?
- One sentence. 'Async content-moderation worker scans every upload before it's broadcast to followers' is enough — drilling deeper costs time you need for fanout and ranking.