Design Notion — System Design Interview Guide
Design Notion is a system-design interview that asks you to build a flexible document workspace: users create pages composed of blocks (text, images, embeds, databases), collaborate in real time, and link pages into nested trees. The hard part is the block-level data model plus real-time multi-user editing.
By Alex Chen, Founder, InterviewChamp.AI · Last verified
Reported in interviews at
- Notion
- Atlassian
- Meta
Sourced from Glassdoor, Levels.fyi, and Blind interview reports.
Functional requirements
- Create a page composed of blocks (text paragraphs, headings, images, tables, embeds)
- Edit blocks; nest blocks inside other blocks (toggles, sub-pages, indented lists)
- Share a page with other users (workspace members or external guests) with read or edit permissions
- Real-time multi-user editing: two users editing the same page see each other's changes within about 200ms in the typical case
- Search across all accessible pages in a workspace
Non-functional requirements
- Edit-to-peer latency: <300ms p99 for a keystroke to reach all collaborators on a page
- Page load: <500ms p99 from page click to first paint, even for pages with 1000+ blocks
- Availability: 99.99% for read and edit; conflict resolution must never silently lose user input
- Scale: ~100M+ registered users across millions of workspaces, average page ~50 blocks, peak ~30K edit operations/sec
Capacity estimation
Public Notion scale (2023-2024 reports): 100M+ registered users, tens of millions of MAU, and millions of workspaces with a heavy long tail (most workspaces have fewer than 10 users; enterprise workspaces reach 10K+). Page counts per workspace range from dozens to hundreds of thousands. The average page has ~30-50 blocks.
Block storage is the dominant scaling dimension. If the average user creates ~100 blocks total and there are 100M users, that's 100 × 100M = ~10B blocks in the system. Each block is small (~500 bytes including content, parent_id, and metadata), so total block storage is 10B × 500 bytes = ~5 TB before indexes: small in byte terms (far below a petabyte) but enormous in row count. Edit operations stream in at ~10K-30K ops/sec at peak; each operation is a small delta (a few hundred bytes).
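The arithmetic above can be checked in a few lines of Python. The 300-byte operation size is an assumption standing in for "a few hundred bytes"; the other constants come straight from the estimate.

```python
# Back-of-envelope numbers from the capacity estimate.
USERS = 100_000_000          # ~100M registered users
BLOCKS_PER_USER = 100        # assumed lifetime average per user
BYTES_PER_BLOCK = 500        # content + parent_id + metadata
PEAK_OPS_PER_SEC = 30_000    # upper end of the peak edit rate
BYTES_PER_OP = 300           # assumption: "a few hundred bytes" per delta

total_blocks = USERS * BLOCKS_PER_USER
storage_tb = total_blocks * BYTES_PER_BLOCK / 1e12
peak_ingest_mb_per_sec = PEAK_OPS_PER_SEC * BYTES_PER_OP / 1e6

print(f"blocks: {total_blocks:,}")
print(f"block storage before indexes: {storage_tb:.1f} TB")
print(f"peak edit ingest: {peak_ingest_mb_per_sec:.1f} MB/s")
```

Under these assumptions the peak edit stream is only single-digit megabytes per second, which supports the point that row count and operation rate, not raw bytes, drive the design.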
The shape that matters: writes are tiny but frequent, the data model is nested-tree-of-blocks rather than rows-of-documents, and the read pattern is 'load one page' (load N blocks where N can vary 100x). Designing the block store for fast 'load all descendants of page X' is the central challenge.
High-level design
Three core services: the block store, the collaboration engine, and search.
The block store holds every block as a row in a sharded relational store, sharded by workspace_id so that all blocks in a workspace are colocated. Each block has (block_id, page_id, parent_block_id, position, type, content, version), with workspace_id carried alongside as the shard key. The (parent_block_id, position) pair forms an ordered tree: siblings are sorted by position, and descendants are reachable by recursion. A page load is a bounded query, 'all blocks where page_id = X', that typically returns a few hundred rows, which the client reconstructs into a tree. The position field uses fractional indexing (between any two positions, you can always insert a new one without renumbering), so an insertion writes only the new row.
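As a rough illustration of the page-load path, the sketch below rebuilds the ordered tree from the flat rows that a page query would return. The `Block` dataclass and `build_tree` helper are hypothetical names, and positions are assumed to be string keys that sort lexicographically.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Block:
    block_id: str
    page_id: str
    parent_block_id: Optional[str]   # None for top-level blocks on the page
    position: str                    # assumed: fractional key, sorts lexicographically
    type: str
    content: str
    children: list["Block"] = field(default_factory=list)


def build_tree(rows: list[Block]) -> list[Block]:
    """Turn the flat result of 'all blocks where page_id = X' into an ordered tree."""
    by_parent: dict[Optional[str], list[Block]] = defaultdict(list)
    for row in rows:
        by_parent[row.parent_block_id].append(row)
    for siblings in by_parent.values():
        siblings.sort(key=lambda b: b.position)
    for row in rows:
        row.children = by_parent.get(row.block_id, [])
    return by_parent.get(None, [])


# Rows arrive in arbitrary order, as they would from a sharded store.
rows = [
    Block("b3", "p1", "b1", "a", "text", "nested under the toggle"),
    Block("b2", "p1", None, "n", "text", "second top-level block"),
    Block("b1", "p1", None, "g", "toggle", "first top-level block"),
]
roots = build_tree(rows)
```

The reconstruction is a single pass to group rows plus one sort per sibling list, so its cost grows with the number of blocks on the page rather than with tree depth.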
The collaboration engine runs alongside the block store. Each open page has a session: clients hold a persistent connection to a collaboration server, and sessions are identified by page_id. Edits are serialized as operations (insert-block, update-content, delete-block, reorder) and broadcast to all clients on the page. The collaboration server maintains the canonical block list in memory, applies each incoming operation, and forwards it to the other clients. Periodic snapshots flush to the block store; if the collaboration server crashes, a replacement server reconstructs the page state from the last snapshot plus the operation log, which therefore has to be stored durably outside the server's memory.
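The session loop described above might look like the following in-memory sketch. `Op`, `apply_op`, and `PageSession` are hypothetical names; reorder is omitted, clients are modeled as plain callbacks rather than network connections, and the operation log is a Python list standing in for a durable log.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Op:
    kind: str          # "insert-block", "update-content", or "delete-block"
    block_id: str
    content: str = ""


def apply_op(state: dict[str, str], op: Op) -> None:
    """Apply one operation to the in-memory block map (block_id -> content)."""
    if op.kind in ("insert-block", "update-content"):
        state[op.block_id] = op.content
    elif op.kind == "delete-block":
        state.pop(op.block_id, None)
    else:
        raise ValueError(f"unsupported operation: {op.kind}")


class PageSession:
    def __init__(self, page_id: str,
                 snapshot: Optional[dict[str, str]] = None,
                 log: Optional[list[Op]] = None):
        self.page_id = page_id
        self.state = dict(snapshot or {})
        self.log: list[Op] = []      # operations since the last snapshot
        self.clients: dict[str, Callable[[Op], None]] = {}
        for op in log or []:         # crash recovery: replay on top of the snapshot
            apply_op(self.state, op)
            self.log.append(op)

    def connect(self, client_id: str, send: Callable[[Op], None]) -> None:
        self.clients[client_id] = send

    def submit(self, sender_id: str, op: Op) -> None:
        apply_op(self.state, op)
        self.log.append(op)
        for client_id, send in self.clients.items():
            if client_id != sender_id:   # the sender already has its own edit
                send(op)

    def take_snapshot(self) -> dict[str, str]:
        snapshot = dict(self.state)
        self.log.clear()             # everything up to here is in the snapshot
        return snapshot


session = PageSession("page-1")
inbox_a: list[Op] = []
inbox_b: list[Op] = []
session.connect("a", inbox_a.append)
session.connect("b", inbox_b.append)
session.submit("a", Op("insert-block", "blk-1", "Hello"))
snapshot = session.take_snapshot()
session.submit("b", Op("update-content", "blk-1", "Hello, world"))

# Simulated crash: a fresh session rebuilds from the snapshot plus the log.
recovered = PageSession("page-1", snapshot, list(session.log))
```

The recovery path only works if both the snapshot and the log outlive the server process, which is why the log is passed in from outside the session in this sketch.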
Search ingests block content via change-data-capture into an inverted-index cluster sharded by workspace_id (the same isolation pattern as the block store). Searches return matching block_ids; the renderer fetches the parent page and highlights the block.
Permissions are a separate service that stores an ACL for each page and for each workspace member; permission checks happen at read and edit time. Inheritance from the parent page is resolved at read time by walking up the page tree.
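The search ingestion path can be sketched as a per-workspace inverted index fed by change events. `WorkspaceSearchIndex` and its methods are hypothetical names; a dictionary keyed by workspace_id stands in for the sharded cluster, tokenization is deliberately naive, and permission filtering of results is left out.

```python
import re
from collections import defaultdict
from typing import Optional


class WorkspaceSearchIndex:
    """One inverted index per workspace, standing in for a workspace-sharded cluster."""

    def __init__(self) -> None:
        # workspace_id -> term -> set of block_ids
        self._postings: dict = defaultdict(lambda: defaultdict(set))
        # (workspace_id, block_id) -> terms currently indexed for that block
        self._terms: dict = {}

    @staticmethod
    def _tokenize(text: str) -> set[str]:
        return set(re.findall(r"[a-z0-9]+", text.lower()))

    def on_block_change(self, workspace_id: str, block_id: str,
                        content: Optional[str]) -> None:
        """Handle one change event; content=None means the block was deleted."""
        index = self._postings[workspace_id]
        # Remove stale postings first so updates do not leave old terms behind.
        for term in self._terms.pop((workspace_id, block_id), set()):
            index[term].discard(block_id)
        if content is not None:
            terms = self._tokenize(content)
            self._terms[(workspace_id, block_id)] = terms
            for term in terms:
                index[term].add(block_id)

    def search(self, workspace_id: str, query: str) -> set[str]:
        """Return block_ids in one workspace that contain every query term."""
        terms = self._tokenize(query)
        if not terms:
            return set()
        index = self._postings[workspace_id]
        return set.intersection(*(index[t] for t in terms))


idx = WorkspaceSearchIndex()
idx.on_block_change("w1", "b1", "Quarterly roadmap")
idx.on_block_change("w1", "b2", "Roadmap draft")
idx.on_block_change("w2", "b9", "Roadmap for another workspace")
```

Because every lookup starts from a workspace_id, a query never touches another workspace's postings, which is the isolation property the sharding choice is meant to provide.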
Deep dive — the hard problem
The deep dive is real-time collaborative editing. The classical solutions are Operational Transformation (OT) and Conflict-free Replicated Data Types (CRDT). Both let multiple users edit the same document concurrently and converge to the same final state. Notion-style block-level editing is somewhat easier than character-level text editing because most operations target whole blocks (insert, delete, move) rather than character ranges within a paragraph. Within a paragraph, you still need character-level conflict resolution.
CRDTs are the modern preference. Each operation is a (block_id, op_type, payload, lamport_timestamp, replica_id) tuple. Clients can apply operations in any order and still reach the same final state because the data structure is designed so that order doesn't matter. For example, insertions are placed by fractional position keys instead of integer indexes, and concurrent insertions at the same position resolve deterministically by replica_id. Discussing fractional indexing and Lamport timestamps explicitly is a strong signal.
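A minimal sketch of that idea, assuming a last-writer-wins register per block field keyed by (lamport_timestamp, replica_id). `Op` and `PageReplica` are hypothetical names. Deletes (which would need tombstones) are left out, and whole-field last-writer-wins discards one of two concurrent edits to the same paragraph, so real character-level merging needs a sequence CRDT on top of this.

```python
from dataclasses import dataclass


@dataclass
class Op:
    block_id: str
    op_type: str        # "insert" or "update" in this sketch
    payload: dict       # fields to set, e.g. {"position": "m", "content": "..."}
    lamport: int
    replica_id: str


class PageReplica:
    def __init__(self, replica_id: str) -> None:
        self.replica_id = replica_id
        self.clock = 0
        # block_id -> field name -> ((lamport, replica_id), value)
        self.blocks: dict = {}

    def local_op(self, block_id: str, op_type: str, payload: dict) -> Op:
        self.clock += 1   # Lamport rule: tick before stamping a local event
        op = Op(block_id, op_type, payload, self.clock, self.replica_id)
        self.apply(op)
        return op

    def apply(self, op: Op) -> None:
        self.clock = max(self.clock, op.lamport)   # never fall behind what we have seen
        stamp = (op.lamport, op.replica_id)
        fields = self.blocks.setdefault(op.block_id, {})
        for name, value in op.payload.items():
            # Per-field last-writer-wins: the larger stamp survives in any arrival order.
            if name not in fields or fields[name][0] < stamp:
                fields[name] = (stamp, value)

    def render(self) -> list[str]:
        rows = []
        for fields in self.blocks.values():
            if "position" in fields and "content" in fields:
                (_, replica), position = fields["position"]
                # Ties on position are broken by the inserting replica's id.
                rows.append((position, replica, fields["content"][1]))
        return [content for _, _, content in sorted(rows)]


a, b = PageReplica("A"), PageReplica("B")
op1 = a.local_op("blk-a", "insert", {"position": "m", "content": "Hello from A"})
op2 = b.local_op("blk-b", "insert", {"position": "m", "content": "Hello from B"})
op3 = a.local_op("blk-a", "update", {"content": "Hello, edited"})

# Two fresh replicas receive the same operations in opposite orders.
x, y = PageReplica("X"), PageReplica("Y")
for op in (op1, op2, op3):
    x.apply(op)
for op in (op3, op2, op1):
    y.apply(op)
```

Both replicas end with the same rendered page even though one of them saw the update before the insert it refers to, which is the order-independence property the paragraph describes.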
The alternative is OT: a central server transforms each incoming operation against the operations that have been applied since the sender's last sync. OT is what Google Docs uses; it requires a central authority, but that authority imposes one canonical order of operations, so every client converges on the server's copy. Pick one and explain the tradeoff: CRDT for offline tolerance and decentralization, OT for a single server-ordered source of truth.
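A minimal sketch of the transform step, assuming only insert operations on a flat sibling list addressed by integer index. `Insert`, `transform`, and `apply_insert` are hypothetical names; this is not any product's actual implementation, and a real OT system also needs delete, move, and character-level cases.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    index: int        # position in the sibling list at the time the op was created
    content: str
    client_id: str


def transform(op: Insert, against: Insert) -> Insert:
    """Rewrite `op` so it can be applied after `against` has already been applied."""
    shifts = against.index < op.index or (
        against.index == op.index and against.client_id < op.client_id
    )
    if shifts:
        # `against` landed at or before our slot, so our slot moved one to the right.
        return Insert(op.index + 1, op.content, op.client_id)
    return op


def apply_insert(doc: list[str], op: Insert) -> list[str]:
    return doc[:op.index] + [op.content] + doc[op.index:]


doc = ["a", "b"]
op_a = Insert(1, "X", "A")   # both clients insert at index 1 concurrently
op_b = Insert(1, "Y", "B")

# Server applies A first, then B transformed against A.
server_view = apply_insert(apply_insert(doc, op_a), transform(op_b, op_a))
# Client B applied its own op first, then receives A transformed against B.
client_b_view = apply_insert(apply_insert(doc, op_b), transform(op_a, op_b))
```

The client_id tie-break plays the same role as replica_id in the CRDT case: when two inserts target the same index, both sides must agree on which one goes first.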
Second hard surface: the block tree at scale. A single page with 1000+ blocks is common in enterprise workspaces; rendering it requires loading every block. The naïve approach (one row per block, fetched together) gets slow for deep trees because the client must walk the tree before knowing what to render. Better: precompute a denormalized 'page snapshot' that holds the full block tree as a single JSON blob and serve it from an in-memory cache. The snapshot is a derived read path, not the source of truth: block rows remain the system of record, and each edit invalidates the snapshot so it is recomputed. Reads become a single cache hit, while edits stream through the collaboration engine in parallel for live updates.
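One way to picture the snapshot path is a small cache that builds the nested JSON once and rebuilds it only after an invalidation. `SnapshotCache` and the `load_blocks` callback are hypothetical, and this sketch recomputes lazily on the next read rather than eagerly on every edit.

```python
import json
from typing import Callable


class SnapshotCache:
    def __init__(self, load_blocks: Callable[[str], list[dict]]) -> None:
        # load_blocks is a hypothetical callback: page_id -> flat list of block rows
        self._load_blocks = load_blocks
        self._cache: dict[str, str] = {}
        self.rebuilds = 0   # counter kept only to make the behavior observable

    def get(self, page_id: str) -> str:
        """Return the page's block tree as one JSON string, building it on a miss."""
        if page_id not in self._cache:
            self._cache[page_id] = self._build(page_id)
        return self._cache[page_id]

    def invalidate(self, page_id: str) -> None:
        """Called after an edit is persisted; the next read rebuilds the snapshot."""
        self._cache.pop(page_id, None)

    def _build(self, page_id: str) -> str:
        self.rebuilds += 1
        children: dict = {}
        for row in sorted(self._load_blocks(page_id), key=lambda r: r["position"]):
            children.setdefault(row["parent_block_id"], []).append(row)

        def nest(parent_id):
            return [
                {"id": r["block_id"], "content": r["content"],
                 "children": nest(r["block_id"])}
                for r in children.get(parent_id, [])
            ]

        return json.dumps(nest(None))


store = {"p1": [
    {"block_id": "b1", "parent_block_id": None, "position": "a", "content": "Title"},
    {"block_id": "b2", "parent_block_id": "b1", "position": "a", "content": "Child"},
]}
cache = SnapshotCache(lambda page_id: store[page_id])
first = cache.get("p1")
again = cache.get("p1")   # served from the cache, no rebuild
```

On a heavily edited page, rebuilding after every keystroke would be wasteful, so a real system would likely debounce or patch the snapshot incrementally; that choice is outside what this sketch shows.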
Third: permissions. A nested page inherits its parent's permissions by default but can be overridden. Resolving 'can user X read page Y' requires walking up the page tree until an explicit ACL is found. Caching the resolved ACL per (user, page) for a few seconds, with cache-invalidation events fired on any permission change, makes the check cheap.
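The inheritance walk and short-lived cache could be sketched as follows. `PermissionResolver` is a hypothetical name, the page tree and ACLs are plain dictionaries, the nearest explicit ACL is assumed to win even when it omits the user, and invalidation simply clears the whole cache instead of targeting affected entries.

```python
import time
from typing import Callable, Optional


class PermissionResolver:
    def __init__(self, parent_of: dict, acls: dict,
                 ttl_seconds: float = 5.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.parent_of = parent_of   # page_id -> parent page_id, or None at the root
        self.acls = acls             # page_id -> {user_id: "read" | "edit"}, explicit ACLs only
        self.ttl = ttl_seconds
        self.clock = clock
        self._cache: dict = {}       # (user_id, page_id) -> (expires_at, access)

    def access(self, user_id: str, page_id: str) -> Optional[str]:
        """Return "read", "edit", or None, using the cache while the entry is fresh."""
        key = (user_id, page_id)
        now = self.clock()
        hit = self._cache.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]
        level = self._resolve(user_id, page_id)
        self._cache[key] = (now + self.ttl, level)
        return level

    def _resolve(self, user_id: str, page_id: str) -> Optional[str]:
        current = page_id
        while current is not None:
            acl = self.acls.get(current)
            if acl is not None:
                # Assumption: the nearest explicit ACL overrides everything above it.
                return acl.get(user_id)
            current = self.parent_of.get(current)
        return None

    def on_permission_change(self) -> None:
        """Invalidation hook; coarse on purpose to keep the sketch short."""
        self._cache.clear()


parent_of = {"root": None, "team": "root", "notes": "team"}
acls = {
    "root": {"alice": "edit", "bob": "read"},
    "team": {"alice": "edit"},          # override: bob is not listed here
}
resolver = PermissionResolver(parent_of, acls, clock=lambda: 0.0)
```

A production version would invalidate only the entries under the changed page, since clearing everything on each permission change would cause a burst of tree walks in a large workspace.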
Common mistakes
- Storing the whole page as a single document — kills granular edits and conflict resolution
- Skipping real-time collaboration entirely — interviewer expects CRDT or OT mentioned by minute 15
- Using integer block positions — every insert forces renumbering of subsequent siblings
- Sharding by page_id instead of workspace_id — kills workspace-locality and search routing
- Forgetting offline editing — Notion supports it; missing this drops a level at senior+ rounds
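The integer-position mistake in the list above can be contrasted with fractional keys in a short sketch. `position_between` is a hypothetical helper, and exact rationals from Python's `fractions` module are used for clarity; production systems often encode such keys as strings instead.

```python
from fractions import Fraction
from typing import Optional


def position_between(lo: Optional[Fraction], hi: Optional[Fraction]) -> Fraction:
    """Return a key that sorts strictly between lo and hi (None means 'no neighbor')."""
    if lo is None and hi is None:
        return Fraction(0)       # first block in an empty sibling list
    if lo is None:
        return hi - 1            # insert before the first sibling
    if hi is None:
        return lo + 1            # append after the last sibling
    if not lo < hi:
        raise ValueError("lo must sort before hi")
    return (lo + hi) / 2         # midpoint always exists for exact rationals


first = position_between(None, None)
last = position_between(first, None)
middle = position_between(first, last)
squeezed = position_between(first, middle)   # insert again in the same gap

ordered = sorted([last, squeezed, first, middle])
```

Every insert writes only the new block's key; no existing sibling is touched, which is what keeps concurrent inserts from conflicting over renumbered rows. Keys do grow longer when many inserts land in the same gap, so periodic rebalancing is a reasonable follow-up topic.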
Likely follow-up questions
- How would you support offline editing where a user works on a page for hours without connectivity?
- What changes if you add a Notion-style database block (a table view with sortable, filterable rows)?
- How would you implement undo/redo across collaborative edits from multiple users?
- How would you handle a single page reaching 100K blocks (a large knowledge base index page)?
- How would you support page export to PDF or Markdown without loading the entire workspace?
Frequently asked questions
- How long is the Design Notion system-design interview?
- 45-60 minutes. Notion explicitly tests real-time collaboration plus the block data model. Missing either is a no-hire. Source: Glassdoor Notion 2023-2024 reports plus Levels.fyi compensation reports referencing the loop.
- Do I need to know CRDTs in detail?
- Naming the concept and explaining fractional indexing for ordered lists plus Lamport timestamps for ordering concurrent operations is enough. Drawing the actual CRDT proof of convergence is overkill in 45 minutes. The signal is 'I understand why concurrent edits converge,' not 'I can prove the algebra.'
- Is Design Notion easier than Design Google Docs?
- Comparable. Notion adds the nested block tree, multi-page navigation, and the database-block surface, which Docs doesn't have. Docs goes deeper on character-level OT. Both rounds test real-time collaboration and rich document modeling.
- Should I draw the block table schema?
- Yes — a 30-second sketch of (block_id, page_id, parent_block_id, position, type, content) gets the data-model signal across efficiently. Avoid spending more than 2 minutes on schema; the interviewer cares about how it scales, not the column types.