Design Pinterest — System Design Interview Guide

Design Pinterest is a system-design interview question that asks you to build a visual discovery board: users save pins to boards, follow other users and topics, and browse an algorithmic feed of recommended images. The hard parts are visual-similarity recommendation and home-feed ranking on top of image-heavy storage.

By Alex Chen, Founder, InterviewChamp.AI · Last verified

Reported in interviews at

  • Pinterest
  • Meta
  • Snapchat
  • Etsy
  • Amazon

Sourced from Glassdoor, Levels.fyi, and Blind interview reports.

Functional requirements

  • Upload or save a pin (image plus optional caption and source URL) to a board
  • Create and manage boards; pins can be repinned across boards
  • Follow users, topics, or specific boards
  • Browse a personalized home feed mixing followed-content with algorithmic recommendations
  • Search pins by keyword and by visual similarity (find similar pins to this one)
  • Notify users on repins, board-follows, and friend activity

Non-functional requirements

  • Scale: ~500M MAU, average ~10 pins saved/user (about 5B pins in total), average ~10 board follows
  • Feed latency: <500ms p99 to first byte of the home-feed page
  • Image delivery: <100ms median to first paint via edge cache
  • Visual-similarity search: <500ms p99 over an index of billions of pins
  • Availability: 99.95%; eventual consistency on feed ordering is acceptable

Capacity estimation

Public 2024 scale anchors: ~500M MAU, ~5B total pins saved across all boards. Daily uploads and repins ~50M/day = ~600/sec average, ~3K/sec peak. Daily feed loads ~1B/day = ~12K/sec average, ~50K/sec peak (each user opens the app 2-3x/day).

Storage breakdown: images dominate. Each pin image is stored at multiple resolutions (thumbnail, mobile, desktop, retina), averaging ~100 KB compressed per stored copy. 5B pins × 4 resolutions × 100 KB average = ~2 PB image storage. Pin metadata (caption, board, owner, timestamps) ~500 bytes/pin × 5B = ~2.5 TB. Board metadata is negligible by comparison.

Visual embeddings for similarity search: each pin gets a ~512-dimensional float vector = 2 KB. At 5B pins that's 10 TB of vector data, but this lives in a specialized vector index (HNSW or quantized variants) sharded across nodes. Pin engagement data (clicks, repins, dwell time per pin) is voluminous — ~10B events/day during peaks, written to a streaming log.

Read QPS heavily skewed by image fetches: each feed page has 20-30 images, so 1B feed loads/day = ~25B image fetches/day, the vast majority served from edge cache.
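
The estimates above can be reproduced with a few lines of arithmetic. The sketch below is only a back-of-envelope check: the variable names are illustrative, the 25 images per feed page is the midpoint of the 20-30 range, and decimal units (1 KB = 1,000 bytes) and float32 embeddings are assumed.

```python
SECONDS_PER_DAY = 86_400


def per_second(events_per_day: float) -> float:
    """Average rate implied by a daily total."""
    return events_per_day / SECONDS_PER_DAY


total_pins = 5_000_000_000
writes_per_day = 50_000_000
feed_loads_per_day = 1_000_000_000

write_qps = per_second(writes_per_day)
feed_qps = per_second(feed_loads_per_day)

image_bytes = total_pins * 4 * 100_000      # 4 resolutions x ~100 KB each
metadata_bytes = total_pins * 500           # ~500 bytes of metadata per pin
embedding_bytes = total_pins * 512 * 4      # 512 float32 values per pin (assumed)
image_fetches_per_day = feed_loads_per_day * 25

print(f"writes/sec      ~{write_qps:,.0f}")
print(f"feed loads/sec  ~{feed_qps:,.0f}")
print(f"image storage   ~{image_bytes / 1e15:.1f} PB")
print(f"metadata        ~{metadata_bytes / 1e12:.1f} TB")
print(f"embeddings      ~{embedding_bytes / 1e12:.1f} TB")
print(f"image fetches   ~{image_fetches_per_day / 1e9:.0f}B/day")
```

The averages land slightly under the rounded figures in the text (about 580 writes/sec and 11.6K feed loads/sec), which is the level of precision an interviewer expects.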

High-level design

Five services: pin storage, board/follow graph, feed, search, and recommendations. The image-delivery path is largely independent of the others.

Pin storage: image bytes go to object storage and are served via a CDN. The pin service writes metadata (pin_id, owner_id, board_id, image_url, caption, source_url, embedding_ref, created_at) to a sharded relational store keyed by pin_id. An image-processing pipeline runs on upload: resize to multiple resolutions, compute the visual embedding via a vision model, extract dominant colors and detect any text in the image. Outputs are written back into the pin row and an external vector index.
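
As a rough illustration of the metadata row and of sharding by pin_id, here is a minimal sketch. The field names come from the list above; the field types, the `PinRecord` and `shard_for` names, and the hash-modulo routing are assumptions made for the example, not a description of any production schema.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PinRecord:
    """One row in the pin metadata store (types are assumed for illustration)."""
    pin_id: str
    owner_id: str
    board_id: str
    image_url: str
    caption: Optional[str]
    source_url: Optional[str]
    embedding_ref: Optional[str]  # set once the image-processing pipeline finishes
    created_at: int               # Unix timestamp in seconds


def shard_for(pin_id: str, num_shards: int) -> int:
    """Route a pin to a shard with a stable hash of its ID (hypothetical scheme)."""
    digest = hashlib.sha256(pin_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


pin = PinRecord("pin-123", "user-7", "board-42", "https://cdn.example/pin-123.jpg",
                "Oak bookshelf", None, None, 1_700_000_000)
print(shard_for(pin.pin_id, 64))
```

A stable hash (rather than Python's built-in `hash`, which is salted per process) keeps routing consistent across services. A real deployment would likely prefer consistent hashing or a lookup table so that adding shards does not remap most keys.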

Board/follow graph: a separate sharded store holds board membership (board_id → list of pin_ids, ordered by save_at) and user follows (user_id → list of followed user/board/topic IDs). Reads are cached because every feed assembly hits this layer to find followed content.

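Feed service builds the home feed by mixing followed content with algorithmic recommendations. For each user the feed is a precomputed ranked list of pin_ids stored in an in-memory cache, refreshed every few minutes by a feed-builder worker. Fresh activity (a new pin from a followed user) is pushed into the cache via fanout-on-write for ordinary users; accounts with very large follower counts are typically excluded from the push and their recent pins are merged in at read time, so a single upload does not trigger millions of cache writes.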
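
The fanout-on-write path can be sketched with an in-process stand-in for the feed cache. Everything here is illustrative: `FeedCache`, its method names, and the cap on cached entries are hypothetical, and a real system would use a distributed cache rather than Python dictionaries.

```python
from collections import defaultdict, deque

FEED_CACHE_SIZE = 500  # hypothetical cap on cached pin_ids per user


class FeedCache:
    """Toy per-user feed cache that pushes new pins to followers on write."""

    def __init__(self, max_len: int = FEED_CACHE_SIZE):
        self.followers = defaultdict(set)  # author_id -> set of follower user_ids
        self.feeds = defaultdict(lambda: deque(maxlen=max_len))  # user_id -> newest-first pins

    def follow(self, follower_id: str, author_id: str) -> None:
        self.followers[author_id].add(follower_id)

    def publish(self, author_id: str, pin_id: str) -> None:
        """Fanout-on-write: prepend the pin to every follower's cached feed."""
        for follower_id in self.followers[author_id]:
            # With maxlen set, appendleft evicts the oldest entry from the right.
            self.feeds[follower_id].appendleft(pin_id)

    def read(self, user_id: str, limit: int = 25) -> list:
        return list(self.feeds[user_id])[:limit]


cache = FeedCache(max_len=2)
cache.follow("u1", "author")
for pin_id in ("p1", "p2", "p3"):
    cache.publish("author", pin_id)
print(cache.read("u1"))  # ['p3', 'p2']
```

The cost of `publish` grows with the author's follower count, which is why this path is reserved for ordinary users.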

Recommendation service generates the algorithmic portion of the feed. It runs an offline pipeline producing per-user candidate lists by combining (a) pins similar to those the user recently engaged with, fetched from the vector index, and (b) pins popular on boards the user follows. Candidates are scored online by a ranker model using user features and pin features.

Search handles two modes. Keyword search hits an inverted text index over pin captions and OCR-extracted text. Visual similarity ('more like this') queries the vector index with the source pin's embedding, retrieves the nearest 1,000 candidates, and then re-ranks them for diversity.

Deep dive — the hard problem

Three deep dives: visual-similarity search at billion-pin scale, the home-feed mix between followed and algorithmic content, and the cost of image storage.

Visual-similarity search: every pin upload produces a 512-dim float embedding via a vision model (typically a CNN or vision transformer). Storing 5B vectors and doing nearest-neighbor in real time requires an approximate-nearest-neighbor index. The standard production answer is a graph-based index (HNSW) or an inverted-file index with product quantization (IVF-PQ), both of which trade exact recall for query speed.

HNSW builds a multi-layer proximity graph; queries descend from a sparse top layer to dense lower layers, achieving sub-10ms search at hundreds of millions of vectors per node. Above a few hundred million vectors per node, RAM becomes the bottleneck — pin vector indexes are sharded across many nodes, each holding ~100M vectors. Queries fan out to all shards, each returns its local top-K, and a coordinator merges into the global top-K. The vector index is rebuilt from the source-of-truth pin database periodically; incremental updates use a 'staging' index merged at read time until the next full rebuild.
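
The scatter-gather step (each shard returns its local top-K, a coordinator merges into the global top-K) can be sketched as follows. The per-shard search here is a brute-force stand-in for a real ANN index such as HNSW, and the function names and the dict-based shard layout are assumptions for the example.

```python
import heapq
import itertools
import math


def local_top_k(shard: dict, query: tuple, k: int) -> list:
    """Brute-force stand-in for one shard's ANN search.

    Returns up to k (distance, pin_id) pairs sorted by ascending distance.
    """
    scored = ((math.dist(query, vec), pin_id) for pin_id, vec in shard.items())
    return heapq.nsmallest(k, scored)


def global_top_k(shards: list, query: tuple, k: int) -> list:
    """Coordinator: fan out to every shard, then merge the sorted partial results."""
    partials = [local_top_k(shard, query, k) for shard in shards]
    merged = heapq.merge(*partials)  # lazy k-way merge of already-sorted lists
    return [pin_id for _, pin_id in itertools.islice(merged, k)]


shards = [
    {"a": (0.0, 0.0), "b": (5.0, 5.0)},
    {"c": (1.0, 0.0), "d": (9.0, 9.0)},
]
print(global_top_k(shards, (0.0, 0.0), 2))  # ['a', 'c']
```

Each shard must return K results, not K divided by the shard count, because the global top-K could in the worst case all live on one shard. In production the fan-out would run in parallel with a timeout per shard.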

Product quantization reduces vector RAM by ~10-30x by quantizing each vector to a compact code, trading off some recall. Most production systems combine: PQ for the bulk of the index, full-precision vectors for re-ranking the top-100 candidates.
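
To see where a compression factor in that range comes from, the sizing below assumes float32 source vectors and one 8-bit code per subvector. The choice of 128 subvectors is purely an illustrative parameter, and codebook memory is ignored because it is small relative to the codes.

```python
def pq_code_bytes(num_subvectors: int, bits_per_code: int = 8) -> int:
    """Bytes needed to store one PQ-encoded vector (one code per subvector)."""
    return num_subvectors * bits_per_code // 8


def compression_ratio(dims: int, num_subvectors: int,
                      bits_per_code: int = 8, bytes_per_float: int = 4) -> float:
    """Raw vector size divided by PQ code size."""
    return dims * bytes_per_float / pq_code_bytes(num_subvectors, bits_per_code)


def index_bytes(num_vectors: int, num_subvectors: int, bits_per_code: int = 8) -> int:
    """Total size of the PQ codes for the whole corpus, excluding codebooks."""
    return num_vectors * pq_code_bytes(num_subvectors, bits_per_code)


ratio = compression_ratio(dims=512, num_subvectors=128)
total = index_bytes(num_vectors=5_000_000_000, num_subvectors=128)
print(f"compression ~{ratio:.0f}x, PQ codes ~{total / 1e9:.0f} GB")
```

Under these assumptions a 2 KB vector shrinks to 128 bytes (16x), taking the 5B-pin corpus from about 10 TB to about 640 GB of codes. Fewer subvectors compress harder at the cost of recall, which is what the full-precision re-ranking step is meant to recover.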

Feed ranking and mixing: the home feed is roughly 30-50% followed content (pins from boards/users you follow) and 50-70% algorithmic recommendations (pins similar to your engagement history or trending on your topics). The split is dynamic — for a brand-new user with no follows, the feed is 100% algorithmic; for a heavy follower, it skews to followed content.
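
One simple way to realize such a split is to interleave two already-ranked lists against a target share. The sketch below is an assumption about how the mixing could work, not a documented algorithm: `blend` and `followed_share` are hypothetical names, and how the share is chosen per user is left out.

```python
def blend(followed: list, recommended: list, followed_share: float, page_size: int) -> list:
    """Interleave two ranked lists so about `followed_share` of slots hold followed pins.

    Falls back to whichever list still has items when the other runs out,
    so a user with no follows gets a fully algorithmic page.
    """
    page = []
    used_followed = 0
    fi = ri = 0
    while len(page) < page_size and (fi < len(followed) or ri < len(recommended)):
        # Take a followed pin while we are below the target share for the next slot.
        want_followed = used_followed < followed_share * (len(page) + 1)
        if (want_followed and fi < len(followed)) or ri >= len(recommended):
            page.append(followed[fi])
            fi += 1
            used_followed += 1
        else:
            page.append(recommended[ri])
            ri += 1
    return page


print(blend(["f1", "f2", "f3"], ["r1", "r2", "r3", "r4"], 0.4, 5))
# ['f1', 'r1', 'f2', 'r2', 'r3']
print(blend([], ["r1", "r2", "r3"], 0.4, 3))
# ['r1', 'r2', 'r3']
```

A real feed would also deduplicate pins that appear in both lists and would let the ranker, rather than a fixed interleave, decide final positions.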

The ranker combines several features: user-pin affinity (predicted from the distance between the user embedding and the pin embedding), pin freshness, pin engagement velocity (clicks and saves over the last hour), and topic relevance. Final ranking enforces diversity rules, such as no more than 3 pins from the same board and no more than 5 pins from a tight topic cluster, so the feed doesn't degenerate into one theme.
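
The diversity rules can be applied as a single greedy pass over the ranked list. This is a minimal sketch assuming each candidate already carries a board ID and a topic-cluster ID; the function name and tuple layout are illustrative.

```python
from collections import Counter


def enforce_diversity(ranked_pins, max_per_board: int = 3, max_per_topic: int = 5) -> list:
    """Keep pins in rank order, skipping any that would exceed a board or topic cap.

    ranked_pins: iterable of (pin_id, board_id, topic_id) tuples, best first.
    """
    board_counts = Counter()
    topic_counts = Counter()
    kept = []
    for pin_id, board_id, topic_id in ranked_pins:
        if board_counts[board_id] >= max_per_board or topic_counts[topic_id] >= max_per_topic:
            continue  # cap reached; drop this pin and move to the next candidate
        board_counts[board_id] += 1
        topic_counts[topic_id] += 1
        kept.append(pin_id)
    return kept


ranked = [
    ("p1", "b1", "t1"), ("p2", "b1", "t1"), ("p3", "b1", "t1"),
    ("p4", "b1", "t1"),  # 4th pin from board b1: dropped
    ("p5", "b2", "t1"), ("p6", "b3", "t1"),
    ("p7", "b4", "t1"),  # 6th pin on topic t1: dropped
    ("p8", "b5", "t2"),
]
print(enforce_diversity(ranked))  # ['p1', 'p2', 'p3', 'p5', 'p6', 'p8']
```

Because pins are dropped rather than demoted, the ranker should return more candidates than the page size so the page can still be filled.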

Third tradeoff: image storage costs. Image bytes are the dominant infrastructure cost. Production systems use multi-tier storage: a hot tier for recently saved pins (replicated globally, served from CDN), a warm tier for older pins (replicated regionally), and a cold tier for pins not viewed in months (single-region, served via on-demand promotion). The pin metadata always lives in the hot path; only the image bytes move between tiers.
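
A tiering policy like this reduces to a small decision function. The thresholds below (30 days for hot, 180 days without a view for cold) are hypothetical values picked for the example, as are the function and constant names.

```python
from datetime import datetime, timedelta, timezone

HOT_WINDOW = timedelta(days=30)    # hypothetical: pins saved this recently stay hot
COLD_AFTER = timedelta(days=180)   # hypothetical: unviewed this long means cold


def storage_tier(saved_at: datetime, last_viewed_at: datetime, now: datetime) -> str:
    """Pick the storage tier for a pin's image bytes."""
    if now - saved_at <= HOT_WINDOW:
        return "hot"
    if now - last_viewed_at >= COLD_AFTER:
        return "cold"
    return "warm"


now = datetime(2024, 6, 1, tzinfo=timezone.utc)
print(storage_tier(now - timedelta(days=5), now - timedelta(days=1), now))      # hot
print(storage_tier(now - timedelta(days=90), now - timedelta(days=10), now))    # warm
print(storage_tier(now - timedelta(days=400), now - timedelta(days=200), now))  # cold
```

A background job would evaluate this periodically and move objects between storage classes, while a view of a cold pin would trigger promotion back to a warmer tier.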

Common mistakes

  • Storing image bytes in the relational metadata store instead of object storage with CDN front
  • Using exact nearest-neighbor (KNN) search for visual similarity at billion-pin scale: a full scan of billions of vectors per query blows the latency budget by orders of magnitude
  • Designing the feed as pure recency-ordered followed content — at 100+ follows the feed drowns in repins, and recommendations are the actual product
  • Forgetting the image-processing pipeline: resizing, embedding generation, and OCR happen on upload, not at view time
  • Treating CDN as an afterthought — without an aggressive edge-cache strategy, image delivery costs are prohibitive

Likely follow-up questions

  • How would you implement a 'shop the look' feature where users tap a pin and see purchasable products visible in the image?
  • What changes if a pin goes viral and gets 10M repins in 24 hours?
  • How would you support multi-image pins (a single pin with a 5-image carousel)?
  • How would you detect and remove duplicate pins where users upload the same image with different captions?
  • How would you implement personalized notifications for activity on a user's boards without spamming low-value events?

Frequently asked questions

How long is the Design Pinterest interview?
45-60 minutes. Pinterest's own system-design loop runs 60 minutes and expects visual-similarity search to be a centerpiece. Source: Glassdoor Pinterest 2023-2024 interview reports.

Do I need to know the embedding model architecture for visual similarity?
No. Saying 'a pre-trained vision model produces a 512-dim embedding per image, stored in an ANN index' is enough. Drawing the model layers wastes time you need for the system architecture.

Should I cover the recommendation model in detail?
Cover the inputs (user features, pin features, candidate sources) and the offline-batch + online-ranker split. Going into the model's internals or training details is overkill in 45 minutes.

Is Design Pinterest harder or easier than Design Instagram?
About the same difficulty. The recommendation surface is broader at Pinterest because boards are organizational structures with their own following graph, while Instagram's organization is mostly a single profile feed. Visual-similarity search is also a more central feature at Pinterest.