21. Find Median from Data Stream
hardAsked at ConfluentDesign a structure that absorbs numbers and reports the running median — Confluent uses it because two-heap streaming aggregates are the textbook fit for an online consumer over a Kafka topic.
By Alex Chen, Founder, InterviewChamp.AI · Last verified
Problem
Design a class MedianFinder with addNum(num) and findMedian(). findMedian returns the median of all numbers seen so far. Both operations should be efficient as the input grows.
Constraints
-10^5 <= num <= 10^5At most 5*10^4 calls to addNum and findMedianfindMedian is called after at least one addNum
Examples
Example 1
addNum(1); addNum(2); findMedian()1.5Example 2
addNum(3); findMedian()2.0Approaches
1. Sorted array
Maintain a sorted array; insert by binary-search; report middle.
- Time
- O(n) per add
- Space
- O(n)
// arr stays sorted; addNum binary-searches insertion index, splices inTradeoff:
2. Two heaps
Keep a max-heap for the lower half and a min-heap for the upper half; balance sizes after each insert. Median is the top of the larger heap or the average of both tops.
- Time
- O(log n) per add
- Space
- O(n)
class MedianFinder {
constructor() { this.lo = []; this.hi = []; } // lo = max-heap (negate); hi = min-heap
_push(h, x) { h.push(x); for (let i = h.length - 1; i > 0;) { const p = (i-1)>>1; if (h[p] <= h[i]) break; [h[p], h[i]] = [h[i], h[p]]; i = p; } }
_pop(h) { const t = h[0]; const last = h.pop(); if (h.length) { h[0] = last; let i=0, n=h.length; for(;;){let l=2*i+1,r=l+1,s=i; if(l<n&&h[l]<h[s])s=l; if(r<n&&h[r]<h[s])s=r; if(s===i)break; [h[s],h[i]]=[h[i],h[s]]; i=s;} } return t; }
addNum(num) {
this._push(this.lo, -num);
this._push(this.hi, -this._pop(this.lo));
if (this.hi.length > this.lo.length) this._push(this.lo, -this._pop(this.hi));
}
findMedian() {
if (this.lo.length > this.hi.length) return -this.lo[0];
return (-this.lo[0] + this.hi[0]) / 2;
}
}Tradeoff:
Confluent-specific tips
Confluent grades this on streaming integration — describe how partition assignment forces a per-partition pair of heaps and a downstream merger so exactly-once semantics keep the global median stable across consumer-group rebalance.
Solve it now
Free. No sign-up. Python and JavaScript run instantly in your browser.
Practice these live with InterviewChamp.AI
Drill Find Median from Data Stream and other Confluent interview questions under real-loop conditions with instant feedback on your reasoning, complexity claims, and code.
Practice these live with InterviewChamp.AI →