Skip to main content

27. Find Median from Data Stream

mediumAsked at Spotify

Return the running median as numbers arrive one-by-one using two heaps — a pattern Spotify uses to compute real-time median session-length or track-duration statistics across millions of concurrent streams.

By Alex Chen, Founder, InterviewChamp.AI · Last verified

Problem

Implement the MedianFinder class. It should support addNum(num) which adds a number to the data structure, and findMedian() which returns the median of all elements added so far. If there is an even number of elements, the median is the mean of the two middle values.

Constraints

  • -10^5 <= num <= 10^5
  • At most 5 * 10^4 calls will be made to addNum and findMedian
  • findMedian will be called at least once

Examples

Example 1

Input
addNum(1); addNum(2); findMedian(); addNum(3); findMedian()
Output
[1.5, 2.0]

Explanation: After [1,2] median = 1.5; after [1,2,3] median = 2.

Approaches

1. Sort on every findMedian (naive)

Append each number; sort the array when the median is requested. Simple but too slow for frequent reads.

Time
O(n log n) per findMedian, O(1) addNum
Space
O(n)
class MedianFinder {
  constructor() { this.data = []; }
  addNum(num) { this.data.push(num); }
  findMedian() {
    const sorted = [...this.data].sort((a, b) => a - b);
    const n = sorted.length;
    return n % 2 === 1 ? sorted[Math.floor(n / 2)] : (sorted[n/2 - 1] + sorted[n/2]) / 2;
  }
}

Tradeoff:

2. Two heaps (max-heap + min-heap)

Maintain a max-heap for the lower half and a min-heap for the upper half, keeping their sizes balanced. The median is either the two tops' average or the top of the larger heap.

Time
O(log n) addNum, O(1) findMedian
Space
O(n)
// JS has no built-in heap; we simulate with a sorted insertion helper for interview clarity.
class MedianFinder {
  constructor() {
    // lo is a max-heap (store negated values in a min-heap array)
    // hi is a min-heap
    this.lo = []; // lower half, kept as negatives for max-heap simulation
    this.hi = []; // upper half
  }
  _heapPush(heap, val) {
    heap.push(val);
    heap.sort((a, b) => a - b);
  }
  addNum(num) {
    // Push to lo (as negative for max-heap behavior)
    this._heapPush(this.lo, -num);
    // Balance: lo's max must be <= hi's min
    const loMax = -this.lo[0];
    if (this.hi.length > 0 && loMax > this.hi[0]) {
      this.lo.shift();
      this._heapPush(this.hi, loMax);
    }
    // Keep sizes balanced: lo can be at most 1 larger
    if (this.lo.length > this.hi.length + 1) {
      const top = -this.lo.shift();
      this._heapPush(this.hi, top);
    } else if (this.hi.length > this.lo.length) {
      const top = this.hi.shift();
      this._heapPush(this.lo, -top);
    }
  }
  findMedian() {
    if (this.lo.length > this.hi.length) return -this.lo[0];
    return (-this.lo[0] + this.hi[0]) / 2;
  }
}

Tradeoff:

Spotify-specific tips

Spotify's data engineers encounter this problem in disguise when computing real-time P50/P95 latency metrics or median listening duration across a stream of user events. The two-heap invariant — lower half max ≤ upper half min, sizes differ by at most one — is what Spotify tests for. Draw the two heaps and narrate the rebalancing step clearly.

Solve it now

Free. No sign-up. Python and JavaScript run instantly in your browser.

Output

Press Run or Cmd+Enter to execute

Practice these live with InterviewChamp.AI

Drill Find Median from Data Stream and other Spotify interview questions under real-loop conditions with instant feedback on your reasoning, complexity claims, and code.

Practice these live with InterviewChamp.AI →