88. Find Median from Data Stream
hardAsked at DatadogDesign addNum and findMedian over a streaming sequence. Datadog asks this as the canonical streaming-percentile problem — two-heap solution is the exact pattern they use for p50 over a sliding metric window.
By Alex Chen, Founder, InterviewChamp.AI · Last verified
Source citations
Public interview reports confirming this problem appears in Datadog loops.
- Glassdoor (2026-Q1)— Datadog onsite — direct streaming-percentile analog.
- Blind (2025-12)— Recurring at Datadog.
Problem
The median is the middle value in an ordered integer list. If the size of the list is even, there is no middle value, and the median is the mean of the two middle values. Implement the MedianFinder class with addNum(num) and findMedian() that adds a number from the data stream and returns the median of all elements so far.
Constraints
-10^5 <= num <= 10^5There will be at least one element in the data structure before calling findMedian.At most 5 * 10^4 calls.
Examples
Example 1
addNum(1), addNum(2), findMedian(), addNum(3), findMedian()[null,null,1.5,null,2.0]Approaches
1. Sorted list, binary insert
Maintain a sorted array; binary-insert each new num.
- Time
- addNum O(n), findMedian O(1)
- Space
- O(n)
// Binary search insertion point; array.splice for insert.Tradeoff: O(n) per insert due to shift. Datadog will fail at scale.
2. Two heaps (max-heap left, min-heap right) (optimal)
Left half in a max-heap, right half in a min-heap. Balance sizes (diff <= 1). Median = top of larger heap (or average of both tops if balanced).
- Time
- addNum O(log n), findMedian O(1)
- Space
- O(n)
// Pseudocode (assume MaxHeap and MinHeap implementations).
class MedianFinder {
constructor() { this.low = new MaxHeap(); this.high = new MinHeap(); }
addNum(num) {
if (this.low.size === 0 || num <= this.low.peek()) this.low.push(num);
else this.high.push(num);
if (this.low.size > this.high.size + 1) this.high.push(this.low.pop());
else if (this.high.size > this.low.size) this.low.push(this.high.pop());
}
findMedian() {
if (this.low.size > this.high.size) return this.low.peek();
return (this.low.peek() + this.high.peek()) / 2;
}
}Tradeoff: addNum O(log n). Datadog-canonical: two-heap is the gold standard for streaming median. Generalizes to any percentile via two-heap with skewed sizes.
Datadog-specific tips
Datadog will follow up with: 'Now do this over a sliding window' (LC 480). The two-heap doesn't directly support removal — need lazy deletion or two indexed multisets. Articulate the tradeoff.
Common mistakes
- Failing to balance heaps after each push — one heap grows arbitrarily, median wrong.
- Wrong comparison in addNum — must check 'num <= low.peek()' (not <) for edge cases.
- Computing average without considering integer overflow — fine in JS, but in C++/Java use (a + b) / 2.0 or (a / 2) + (b / 2).
Follow-up questions
An interviewer at Datadog may pivot to one of these next:
- Sliding Window Median (LC 480).
- Streaming percentile beyond median — t-digest, HDR histogram.
- Datadog-style: p50 + p95 + p99 over a metric stream.
Solve it now
Free. No sign-up. Python and JavaScript run instantly in your browser.
FAQ
Why two heaps?
The median is the boundary between lower and upper halves. Max-heap on lower lets us read the boundary; min-heap on upper symmetric.
How to extend to p99?
Use a t-digest or HDR histogram for approximate. Two-heap doesn't scale to arbitrary percentiles with bounded memory.
Practice these live with InterviewChamp.AI
Drill Find Median from Data Stream and other Datadog interview questions under real-loop conditions with instant feedback on your reasoning, complexity claims, and code.
Practice these live with InterviewChamp.AI →