295. Find Median from Data Stream
mediumAsked at CitadelComputing a running median over a live data stream is a real operational metric at Citadel — median latency of order acknowledgements, median fill rate over a rolling window, median bid-ask spread. The two-heap technique is the standard interview answer, but Citadel pushes hard on the rebalancing logic and on what happens at O(log n) vs O(n).
By Alex Chen, Founder, InterviewChamp.AI · Last verified
Source citations
Public interview reports confirming this problem appears in Citadel loops.
- Glassdoor (2026-Q1)— Citadel SWE candidates cite Find Median from Data Stream as one of the highest-signal problems in on-site rounds, frequently paired with follow-ups on sliding-window median.
- Blind (2025-11)— Multiple Citadel SWE threads list this problem as a must-know, with interviewers specifically testing the two-heap rebalancing invariant.
Problem
The median is the middle value in an ordered integer list. If the size of the list is even, there is no middle value, and the median is the mean of the two middle values. Implement the MedianFinder class: MedianFinder() initializes the object. void addNum(int num) adds the integer num to the data structure. double findMedian() returns the median of all elements so far.
Constraints
−10^5 <= num <= 10^5There will be at least one element in the data structure before calling findMedian.At most 5 * 10^4 calls will be made to addNum and findMedian.
Examples
Example 1
MedianFinder(); addNum(1); addNum(2); findMedian(); addNum(3); findMedian()[null, null, null, 1.5, null, 2.0]Explanation: After [1,2]: median = (1+2)/2 = 1.5. After [1,2,3]: median = 2.
Approaches
1. Two heaps (max-heap + min-heap)
Maintain a max-heap for the lower half and a min-heap for the upper half. Keep them balanced (sizes differ by at most 1). Median is the top of the larger heap, or the average of both tops.
- Time
- O(log n) addNum, O(1) findMedian
- Space
- O(n)
// JavaScript has no built-in heap. Simulate with sorted insertion for interview.
// For production: implement a binary heap or use a library.
class MedianFinder {
constructor() {
// lo = max-heap (negate values to simulate with min-heap logic)
// hi = min-heap
this.lo = []; // lower half, stored negated for max-heap simulation
this.hi = []; // upper half
}
_pushMaxHeap(val) {
// Insert into sorted lo array (descending order)
let lo = 0, hi = this.lo.length;
while (lo < hi) {
const mid = (lo + hi) >> 1;
if (this.lo[mid] < val) hi = mid; else lo = mid + 1;
}
this.lo.splice(lo, 0, val);
}
_pushMinHeap(val) {
let lo = 0, hi = this.hi.length;
while (lo < hi) {
const mid = (lo + hi) >> 1;
if (this.hi[mid] > val) hi = mid; else lo = mid + 1;
}
this.hi.splice(lo, 0, val);
}
addNum(num) {
// Always push to lo first, then rebalance
this._pushMaxHeap(num);
// Balance: lo's max must be <= hi's min
if (this.hi.length && this.lo[0] > this.hi[0]) {
this._pushMinHeap(this.lo.shift());
}
// Size invariant: lo can have at most 1 more than hi
if (this.lo.length > this.hi.length + 1) {
this._pushMinHeap(this.lo.shift());
} else if (this.hi.length > this.lo.length) {
this._pushMaxHeap(this.hi.shift());
}
}
findMedian() {
if (this.lo.length > this.hi.length) return this.lo[0];
return (this.lo[0] + this.hi[0]) / 2;
}
}Tradeoff: The interview explanation uses real heaps. Since JS lacks a built-in binary heap, explain the heap logic verbally and note that in production you'd implement a proper binary heap (O(log n) push/pop). The code above uses sorted arrays for clarity — real O(log n) requires a heap implementation. Citadel expects you to know the heap logic even if the language makes it harder to implement directly.
Citadel-specific tips
Explain the invariant before writing code: 'I maintain two heaps: lo (max-heap, lower half) and hi (min-heap, upper half). Invariants: (1) lo.size >= hi.size, and lo.size - hi.size <= 1. (2) lo.max <= hi.min. If either invariant breaks on insertion, I rebalance by moving the extremum between heaps.' Citadel interviewers will then ask: 'What if the stream is bounded and values are integers in range [0, 10^5]?' — the answer is a Fenwick tree (BIT) storing counts, enabling O(log maxVal) insertions and O(log maxVal) median queries by binary searching for the median rank.
Common mistakes
- Not maintaining the ordering invariant (lo.max <= hi.min) — allows a small value in hi and a large value in lo, breaking the median calculation.
- Using equal-size rebalancing (lo and hi always equal) — can't handle odd total counts; you need lo.size to be allowed to be 1 larger.
- Computing median as (lo.max + hi.min) / 2 when the total count is odd — should return lo.max only.
- In JavaScript, simulating a heap with sort() on every insertion — O(n log n) per addNum instead of O(log n).
Follow-up questions
An interviewer at Citadel may pivot to one of these next:
- Sliding Window Median (LC 480) — maintain a window of size k; much harder because you must remove elements too.
- How would you compute the running median using a Fenwick tree (for integer-bounded input)?
- What if the stream is heavily skewed? Does the two-heap approach still perform well?
Solve it now
Free. No sign-up. Python and JavaScript run instantly in your browser.
FAQ
Why a max-heap for the lower half?
You need instant access to the largest element in the lower half — that's the max-heap's root. Similarly, you need instant access to the smallest element in the upper half — the min-heap's root. Together they give you the two middle elements in O(1).
Does the order of rebalancing matter?
Yes — always push to lo first, then check the ordering invariant (lo.max <= hi.min), then check the size invariant. Doing size rebalancing before ordering rebalancing can move a value to the wrong heap.
Why is JavaScript's lack of a built-in heap a real issue here?
The two-heap solution requires O(log n) push and pop. Simulating this with sorted arrays gives O(n) insertion. For n = 5*10^4 operations, O(n) per operation is O(n²) total — potentially too slow. Always flag this and describe what a proper heap implementation would look like.
Practice these live with InterviewChamp.AI
Drill Find Median from Data Stream and other Citadel interview questions under real-loop conditions with instant feedback on your reasoning, complexity claims, and code.
Practice these live with InterviewChamp.AI →