Skip to main content

22. Find Median from Data Stream

hardAsked at JetBrains

Maintain a running median over a stream of integers — JetBrains uses this to test two-heap balance, the same trick behind their incremental indexer's latency percentile sampler.

By Alex Chen, Founder, InterviewChamp.AI · Last verified

Problem

Design a data structure with addNum(num) and findMedian() that supports inserting integers and returning the median of all inserted values so far in better-than-linear time per query.

Constraints

  • -10^5 <= num <= 10^5
  • Up to 5 * 10^4 calls
  • findMedian only after at least one addNum

Examples

Example 1

Input
addNum(1); addNum(2); findMedian();
Output
1.5

Example 2

Input
addNum(3); findMedian();
Output
2

Approaches

1. Sort on every query

Keep all numbers in a list; sort each time findMedian is called.

Time
O(n log n) per query
Space
O(n)
// every findMedian sorts the full buffer — addNum is O(1) but queries blow up

Tradeoff:

2. Two heaps (max-heap left, min-heap right)

Keep the lower half in a max-heap and the upper half in a min-heap, balanced to within 1. Median is the top of the larger heap, or the average of the two tops. JetBrains values this for streaming percentile telemetry.

Time
O(log n) add, O(1) query
Space
O(n)
class MedianFinder {
  constructor() { this.lo = []; this.hi = []; } // lo = max-heap (negated), hi = min-heap
  _pushHeap(h, v, cmp) {
    h.push(v); let i = h.length - 1;
    while (i > 0) { const p = (i - 1) >> 1; if (cmp(h[i], h[p]) < 0) { [h[i], h[p]] = [h[p], h[i]]; i = p; } else break; }
  }
  _popHeap(h, cmp) {
    const top = h[0], last = h.pop();
    if (h.length) { h[0] = last; let i = 0; while (true) { const l = 2*i+1, r = 2*i+2; let s = i; if (l < h.length && cmp(h[l], h[s]) < 0) s = l; if (r < h.length && cmp(h[r], h[s]) < 0) s = r; if (s === i) break; [h[i], h[s]] = [h[s], h[i]]; i = s; } }
    return top;
  }
  addNum(n) {
    const maxCmp = (a,b) => b-a, minCmp = (a,b) => a-b;
    if (!this.lo.length || n <= this.lo[0]) this._pushHeap(this.lo, n, maxCmp); else this._pushHeap(this.hi, n, minCmp);
    if (this.lo.length > this.hi.length + 1) this._pushHeap(this.hi, this._popHeap(this.lo, maxCmp), minCmp);
    if (this.hi.length > this.lo.length) this._pushHeap(this.lo, this._popHeap(this.hi, minCmp), maxCmp);
  }
  findMedian() {
    if (this.lo.length > this.hi.length) return this.lo[0];
    return (this.lo[0] + this.hi[0]) / 2;
  }
}

Tradeoff:

JetBrains-specific tips

JetBrains expects you to explain the balancing invariant precisely — they care about streaming-percentile primitives the same way their indexer's telemetry pipeline maintains rolling p50/p99 latency.

Solve it now

Free. No sign-up. Python and JavaScript run instantly in your browser.

Output

Press Run or Cmd+Enter to execute

Practice these live with InterviewChamp.AI

Drill Find Median from Data Stream and other JetBrains interview questions under real-loop conditions with instant feedback on your reasoning, complexity claims, and code.

Practice these live with InterviewChamp.AI →