Design Stack Overflow — System Design Interview Guide

Design Stack Overflow is a system-design interview question that asks you to build a Q&A platform: users post technical questions, others answer, the community votes on both, and a reputation system rewards quality contributors. The hard parts are ranking, search, and the reputation engine, all on a read-heavy system serving tens of millions of pageviews per day.

By Alex Chen, Founder, InterviewChamp.AI · Last verified

Reported in interviews at

  • Stack Overflow
  • Meta
  • Google
  • GitHub
  • Atlassian

Sourced from Glassdoor, Levels.fyi, and Blind interview reports.

Functional requirements

  • Post a question with title, body (Markdown), and tags
  • Answer a question; multiple answers per question, each independently voted
  • Upvote/downvote questions and answers; downvotes require minimum reputation
  • Search questions by keyword, tag, and acceptance status
  • Reputation system: gain rep for upvotes received, lose rep for downvotes; reputation unlocks privileges

Non-functional requirements

  • Page load: <300ms p99 for question page (full body + all answers + sidebar)
  • Read-heavy: ~99% of traffic is reads, most of them arriving from search engines; writes are <1% of QPS
  • Scale: ~100M+ monthly users, ~25M+ questions, ~35M+ answers, peak ~50K reads/sec
  • Availability: 99.99% for reads (search-engine traffic is the lifeblood); degraded writes acceptable briefly

Capacity estimation

Stack Overflow public scale (2023-2024 data): ~100M monthly visitors, ~25M+ questions accumulated since 2008, ~35M+ answers, ~150M+ comments. Daily questions: ~6K-8K new per day, or roughly one new question every 11-14 seconds on average. Daily reads: ~50M page views/day ≈ 580 reads/sec on average (call it ~600), with peaks of ~50K reads/sec at developer-active hours (US/EU business hours).

Storage: a question is ~3-5 KB (title + body + metadata); an answer is ~2-3 KB. Total accumulated content: 25M × 4 KB + 35M × 2.5 KB ≈ 190 GB of primary content. Comments add ~50 GB. Vote rows: ~500M cumulative votes × 50 B ≈ 25 GB. Total: ~265 GB, so budget ~300 GB excluding indexes. The search index over question titles + bodies is another ~50-100 GB of inverted index.
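
The arithmetic above is easy to check in a few lines. The sketch below only restates the figures quoted in this section, using decimal units (1 GB = 10^9 bytes); the helper names are made up for illustration. The summed storage comes to roughly 260 GB, which the ~300 GB figure rounds up with some headroom.

```python
# Back-of-envelope check of the capacity figures quoted in this section.
# Decimal units are assumed: 1 KB = 1,000 bytes, 1 GB = 1e9 bytes.
SECONDS_PER_DAY = 86_400


def per_second(events_per_day: float) -> float:
    """Average event rate per second for a daily total."""
    return events_per_day / SECONDS_PER_DAY


def seconds_between(events_per_day: float) -> float:
    """Average gap in seconds between two events."""
    return SECONDS_PER_DAY / events_per_day


def gigabytes(rows: float, bytes_per_row: float) -> float:
    """Raw size of `rows` records of `bytes_per_row` bytes each."""
    return rows * bytes_per_row / 1e9


avg_reads_per_sec = per_second(50e6)        # 50M page views per day
question_gap_busy = seconds_between(8_000)  # 8K new questions per day
question_gap_quiet = seconds_between(6_000) # 6K new questions per day

content_gb = gigabytes(25e6, 4_000) + gigabytes(35e6, 2_500)
comments_gb = 50.0                          # taken directly from the text
votes_gb = gigabytes(500e6, 50)
total_gb = content_gb + comments_gb + votes_gb

if __name__ == "__main__":
    print(f"average reads/sec: {avg_reads_per_sec:.0f}")
    print(f"one question every {question_gap_busy:.1f}-{question_gap_quiet:.1f} s")
    print(f"content {content_gb:.1f} GB, votes {votes_gb:.1f} GB, total {total_gb:.1f} GB")
```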

The shape that matters: this is one of the most read-heavy of the classic design problems. Google search traffic is ~90% of Stack Overflow's pageviews. Every question page must be cacheable, every search must be fast, and writes are a rounding error. Optimize the read path first; the write path is simple by comparison.

High-level design

Five core domains: questions/answers, voting, search, reputation, and the read-cache.

Questions and answers live in a sharded relational store, sharded by question_id (so all answers and comments for one question colocate). Each row is small enough that hot questions fit in cache easily. The schema is denormalized: each question row carries view_count, answer_count, vote_score, and accepted_answer_id, all updated on the relevant write event.

Voting is a high-volume stream relative to other writes. Each vote is (user_id, target_id, target_type, value), written to a vote table with a unique key on (user_id, target_id, target_type), which makes a retried write idempotent. On vote write, the score column of the target question or answer row is adjusted by the vote's value. Reputation changes (+10 for an upvote received, -2 for a downvote received) are computed by a reputation worker that consumes the vote-event stream and updates the reputation of the target post's author, plus the voter's where the rules charge for casting a downvote.
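
A minimal sketch of the idempotent vote write, using Python's built-in sqlite3 module as a stand-in for the sharded relational store. The table layout, the single `posts` table standing in for both question and answer rows, and the `cast_vote` helper are assumptions made for illustration, not a description of Stack Overflow's actual schema.

```python
import sqlite3


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with a posts table and a vote table."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE posts (
            id    INTEGER PRIMARY KEY,
            score INTEGER NOT NULL DEFAULT 0
        );
        CREATE TABLE votes (
            user_id     INTEGER NOT NULL,
            target_id   INTEGER NOT NULL,
            target_type TEXT    NOT NULL,
            value       INTEGER NOT NULL CHECK (value IN (-1, 1)),
            UNIQUE (user_id, target_id, target_type)
        );
        """
    )
    return db


def cast_vote(db: sqlite3.Connection, user_id: int, target_id: int,
              target_type: str, value: int) -> bool:
    """Record a vote once. Returns False if this user already voted on the target."""
    with db:  # one transaction: the vote row and the score change commit together
        cur = db.execute(
            "INSERT OR IGNORE INTO votes (user_id, target_id, target_type, value) "
            "VALUES (?, ?, ?, ?)",
            (user_id, target_id, target_type, value),
        )
        if cur.rowcount == 0:
            return False  # unique key rejected the duplicate; score is untouched
        db.execute(
            "UPDATE posts SET score = score + ? WHERE id = ?",
            (value, target_id),
        )
        return True
```

Because the unique key rejects the second insert, a client retry or a replayed request leaves the score unchanged. Changing or retracting an existing vote would need an extra code path that this sketch leaves out.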

Search: an inverted-index cluster ingests questions and answers via change-data-capture. Search queries combine keyword (over title and body), tag filter, and a relevance score that blends text match + vote score + freshness. Tag filters are denormalized into the inverted index so 'how do I X tagged python' is a single multi-filter query, not a join.
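
One way to picture the blended relevance score is a weighted sum of the three signals. The function below is a sketch under stated assumptions: the weights, the logarithmic damping of votes, and the one-year freshness half-life are all invented for illustration and would be tuned against real click data in practice.

```python
import math


def relevance(text_score: float, vote_score: int, age_days: float,
              w_text: float = 1.0, w_votes: float = 0.3, w_fresh: float = 0.2,
              half_life_days: float = 365.0) -> float:
    """Blend text match, vote score, and freshness into one ranking score.

    text_score: match score from the inverted index (for example BM25).
    vote_score: net votes on the question; may be negative.
    age_days:   days since the question was posted or last active.
    """
    # Damp votes logarithmically so a 5,000-vote question does not drown out
    # text relevance; keep the sign so net-negative posts are pushed down.
    votes = math.copysign(math.log1p(abs(vote_score)), vote_score)
    # Exponential decay: the freshness signal halves every `half_life_days`.
    freshness = 0.5 ** (age_days / half_life_days)
    return w_text * text_score + w_votes * votes + w_fresh * freshness
```

The tag filter is not part of the score: it narrows the candidate set inside the index before any scoring happens.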

Reputation: each user has a reputation score, denormalized into the user row and recomputed on every relevant event (vote, accepted answer, badge earned). Privileges are reputation-gated lookups — 'can this user downvote' = 'reputation >= 125'. The lookup is a single user-row read, cached aggressively.

Read-cache: the dominant infrastructure layer. Question pages are cached as fully rendered HTML or as structured JSON in an in-memory cache, with a sub-second TTL on the hot head and a longer TTL on the cold tail. The cache is keyed by (question_id, view_version), where view_version increments on any edit, so an edited page is never served from a stale entry. Search-engine bot traffic hits this cache for ~99% of question reads.
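
The versioned-key idea can be sketched with a small in-process cache. The class name, the injectable clock, and the TTL values are assumptions for illustration; a production system would use a shared cache service with its own expiry and eviction.

```python
import time
from typing import Callable, Dict, Optional, Tuple

CacheKey = Tuple[int, int]  # (question_id, view_version)


class VersionedPageCache:
    """In-memory page cache keyed by (question_id, view_version) with per-entry TTL."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._entries: Dict[CacheKey, Tuple[float, str]] = {}

    def put(self, question_id: int, view_version: int, page: str,
            ttl_seconds: float) -> None:
        expires_at = self._clock() + ttl_seconds
        self._entries[(question_id, view_version)] = (expires_at, page)

    def get(self, question_id: int, view_version: int) -> Optional[str]:
        key = (question_id, view_version)
        entry = self._entries.get(key)
        if entry is None:
            return None  # never cached, or the version moved on after an edit
        expires_at, page = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: drop it and report a miss
            return None
        return page
```

An edit bumps view_version, so readers immediately look up a new key and miss; the old entry is never read again and simply ages out. No explicit invalidation message is needed.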

Deep dive — the hard problem

Three deep dives: the read-heavy cache architecture, the reputation engine, and search suggestions.

Read-heavy cache: Stack Overflow's traffic shape is extreme — 99% reads, mostly from search-engine referrals, hitting a long-tail of millions of unique URLs. Naïve caching strategies fail because the working set is too large to fit in any single cache node. The standard pattern is multi-tier caching. Tier 1: in-memory cache at the application layer, partitioned by question_id via consistent hashing — each application server holds a slice of the hot questions in process memory. Tier 2: a separate in-memory cache cluster (shared across application servers) holding the warm tail. Tier 3: the database with read-replicas for the cold tail.
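
The tiering and the consistent-hash partitioning can be sketched as follows. Plain dicts stand in for the tier 1 and tier 2 caches, `load_from_db` stands in for a read-replica query, and the node names and virtual-node count are arbitrary; all of these are assumptions for illustration.

```python
import bisect
import hashlib
from typing import Callable, Dict, Iterable, Tuple


class HashRing:
    """Consistent-hash ring mapping a question_id to the server that caches it."""

    def __init__(self, nodes: Iterable[str], vnodes: int = 100) -> None:
        # Each node is placed at `vnodes` pseudo-random points on the ring.
        self._ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    @staticmethod
    def _hash(key: str) -> int:
        return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")

    def node_for(self, question_id: int) -> str:
        # First ring point clockwise from the key's hash, wrapping at the end.
        idx = bisect.bisect(self._points, self._hash(str(question_id)))
        return self._ring[idx % len(self._ring)][1]


def read_question(question_id: int, local: Dict[int, str], shared: Dict[int, str],
                  load_from_db: Callable[[int], str]) -> Tuple[str, str]:
    """Read through the tiers; returns (page, name of the tier that served it)."""
    page = local.get(question_id)
    if page is not None:
        return page, "tier1"
    page = shared.get(question_id)
    if page is not None:
        tier = "tier2"
    else:
        page = load_from_db(question_id)  # cold tail: go to a read replica
        shared[question_id] = page
        tier = "tier3"
    local[question_id] = page  # promote so the next read is a tier 1 hit
    return page, tier
```

The point of the ring is that removing or adding an application server only remaps the questions that lived on that server, so a deploy or a crash does not flush every tier 1 cache at once.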

Question-page rendering is expensive — a question with 30 answers, 100 comments, and a reputation-aware permissions check for each visible vote button costs CPU. The solution is to precompute the rendered HTML for the read-mostly version of the page (logged-out users, search-engine bots), cached at high TTL. Logged-in users get a personalization layer applied client-side or via edge compute — display vote buttons, highlight 'your answer', etc. — without re-rendering the whole page server-side. Mention this 'cacheable shell + personalization overlay' pattern explicitly.
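
As a rough illustration of the overlay half of that pattern: the cached shell is identical for every visitor, and the per-user part reduces to a small payload that the client (or an edge function) applies on top. The data classes and field names below are hypothetical; the 125-reputation downvote threshold is the one quoted earlier in this guide.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

DOWNVOTE_MIN_REPUTATION = 125  # threshold quoted earlier in this guide


@dataclass(frozen=True)
class Answer:
    answer_id: int
    author_id: int


@dataclass(frozen=True)
class Viewer:
    user_id: int
    reputation: int


def personalization_overlay(viewer: Optional[Viewer],
                            answers: List[Answer]) -> Dict[str, object]:
    """Per-user data layered over the shared, cached page shell.

    Anonymous visitors and search-engine bots pass viewer=None and get an
    empty overlay, so they are served the cached shell unchanged.
    """
    if viewer is None:
        return {"can_downvote": False, "own_answer_ids": []}
    return {
        "can_downvote": viewer.reputation >= DOWNVOTE_MIN_REPUTATION,
        "own_answer_ids": [
            a.answer_id for a in answers if a.author_id == viewer.user_id
        ],
    }
```

The overlay costs one user-row read plus a pass over answer IDs, which is far cheaper than re-rendering 30 answers and 100 comments for each logged-in request.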

Reputation engine: reputation must be eventually consistent but accurate to within a few minutes — privileges depend on it. The engine consumes the vote-event stream and updates per-user reputation columns. Idempotency is critical: vote events can be replayed (e.g. during a queue restart) and must not double-credit reputation. The standard pattern: a (user_id, vote_id) deduplication table that the worker checks before applying. Once applied, the vote_id is marked processed and never re-applied.
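
A minimal sketch of that deduplication check, with in-memory structures standing in for the deduplication table and the user rows. The class and method names are hypothetical; the +10 and -2 deltas are the ones quoted earlier in this guide.

```python
from typing import Dict, Set, Tuple

# Reputation change for the post author, keyed by vote value.
REP_DELTA = {1: 10, -1: -2}


class ReputationWorker:
    """Applies vote events to author reputation at most once per (user_id, vote_id)."""

    def __init__(self) -> None:
        self.reputation: Dict[int, int] = {}
        self._processed: Set[Tuple[int, int]] = set()  # stand-in for the dedup table

    def handle(self, author_id: int, vote_id: int, value: int) -> bool:
        """Process one vote event. Returns False if it was already applied."""
        key = (author_id, vote_id)
        if key in self._processed:
            return False  # replayed event: skip, do not double-credit
        self.reputation[author_id] = self.reputation.get(author_id, 0) + REP_DELTA[value]
        self._processed.add(key)
        return True
```

In a real store the dedup insert and the reputation update would need to commit in the same transaction; otherwise a crash between the two steps reintroduces the double-credit (or lost-credit) problem the table is meant to prevent.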

Reputation also has anti-abuse rules: daily reputation gain is capped (e.g. +200/day from votes), preventing a vote-ring from elevating a colluding user to high reputation overnight. The cap is enforced by tracking daily gain per user and rejecting further increments once the cap is hit. Mention the cap; it's a reliable interviewer probe.
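
The cap can be sketched as a per-user, per-day counter. The names below are hypothetical, and two choices are assumptions of this sketch rather than anything stated above: a gain that would cross the cap is clipped to the remaining allowance instead of rejected outright, and reputation losses pass through uncapped.

```python
from collections import defaultdict
from datetime import date
from typing import DefaultDict, Tuple

DAILY_VOTE_REP_CAP = 200  # example cap quoted in the text


class DailyRepCap:
    """Tracks reputation gained from votes per user per day and enforces a cap."""

    def __init__(self, cap: int = DAILY_VOTE_REP_CAP) -> None:
        self.cap = cap
        self._gained: DefaultDict[Tuple[int, date], int] = defaultdict(int)

    def grant(self, user_id: int, day: date, requested: int) -> int:
        """Return how much of `requested` may actually be credited today."""
        if requested <= 0:
            return requested  # losses are not capped in this sketch
        remaining = self.cap - self._gained[(user_id, day)]
        granted = max(0, min(requested, remaining))
        self._gained[(user_id, day)] += granted
        return granted
```

With +10 per upvote, the first 20 upvotes in a day are credited in full and the 21st yields 0. The counter resets naturally because the key includes the date.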

Third hard problem: search-as-you-type and 'did you mean'. The autocomplete is a trie-backed in-memory service that suggests questions matching the prefix. 'Did you mean' uses fuzzy matching (Levenshtein distance ≤ 2) on the search index to find candidate corrections. Both run as separate services in front of the main search.
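
Both pieces fit in a short sketch: a dictionary-based trie for prefix suggestions and the classic dynamic-programming edit distance for 'did you mean'. The function and class names are hypothetical, and scanning a whole vocabulary as done here only works for a toy term list; a real service would use the search index's own fuzzy-matching support.

```python
from itertools import islice
from typing import Dict, Iterable, Iterator, List

_END = ""  # sentinel key that can never collide with a single character


class TitleTrie:
    """Prefix tree over lower-cased question titles."""

    def __init__(self) -> None:
        self._root: Dict[str, object] = {}

    def insert(self, title: str) -> None:
        node = self._root
        for ch in title.lower():
            node = node.setdefault(ch, {})
        node[_END] = title  # keep the original casing for display

    def suggest(self, prefix: str, limit: int = 5) -> List[str]:
        node = self._root
        for ch in prefix.lower():
            node = node.get(ch)
            if node is None:
                return []
        return list(islice(self._walk(node), limit))

    def _walk(self, node: Dict[str, object]) -> Iterator[str]:
        # Depth-first, children visited in sorted order for stable output.
        if _END in node:
            yield node[_END]
        for ch in sorted(k for k in node if k != _END):
            yield from self._walk(node[ch])


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insert, delete, substitute) using two rolling rows."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]


def did_you_mean(term: str, vocabulary: Iterable[str],
                 max_distance: int = 2) -> List[str]:
    """Vocabulary words within `max_distance` edits of `term`, closest first."""
    scored = sorted((levenshtein(term, word), word) for word in vocabulary)
    return [word for dist, word in scored if 0 < dist <= max_distance]
```

Note that plain Levenshtein counts a transposition such as 'pyhton' as two edits, which is why the threshold of 2 is about the minimum useful value for catching common typos.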

Common mistakes

  • Designing for write throughput — this system is dominated by reads; writes are <1% of QPS
  • Skipping the cache architecture — at Stack Overflow's read load, the database alone can't serve traffic
  • Forgetting reputation idempotency — replayed vote events would double-credit users
  • Treating search as a simple keyword match — production search blends text + votes + tag + recency
  • Skipping the read-mostly cache vs personalization split — every logged-in request rebuilding the page wastes CPU

Likely follow-up questions

  • How would your design handle the Stack Overflow data dump (a public quarterly release of all Q&A)?
  • What changes if you add real-time notifications for 'someone answered your question'?
  • How would you implement question deduplication ('this is a duplicate of question X')?
  • How would you support Q&A in multiple languages, with cross-language search?
  • How would you build a recommendation system suggesting questions a user might want to answer?

Frequently asked questions

How long is a Design Stack Overflow system-design round?
45-60 minutes. The expectation at senior+ is full coverage of the read-heavy cache architecture plus reputation engine. Source: Glassdoor Stack Overflow + Meta 2022-2024 reports.

Is Design Stack Overflow easier than Design Reddit?
Slightly different shape. Stack Overflow is heavier on read-caching and reputation; Reddit is heavier on real-time ranking and comment trees. Most interviewers consider them comparable, with Stack Overflow being slightly more straightforward because the social graph is shallower (no subscriptions, no per-user feed).

Do I need to know specific cache eviction policies?
Naming LRU or TinyLFU is enough; explaining why one fits read-heavy workloads better is bonus signal. Drawing the actual eviction algorithm is overkill in 45 minutes.

Should I cover the badge system?
Mention badges as a small async worker that consumes the same event stream the reputation engine does. Drilling into badge-criteria evaluation is bonus signal only if you've covered the read-cache and reputation engine first.