Skip to main content

295. Find Median from Data Stream

hardAsked at Atlassian

Find Median from Data Stream is Atlassian's two-heap design problem. Numbers arrive one at a time; return the median after every insert. Atlassian asks it because the same pattern appears in their Confluence analytics dashboards that surface live percentiles.

By Alex Chen, Founder, InterviewChamp.AI · Last verified

Source citations

Public interview reports confirming this problem appears in Atlassian loops.

  • Glassdoor (2026-Q1)Atlassian Senior onsite reports cite Find Median from Data Stream as a recurring two-heap design problem.
  • Blind (2025-11)Atlassian interview threads mention Median From Data Stream as the bridge between top-k heap problems and harder streaming problems.

Problem

The median is the middle value in an ordered integer list. If the size of the list is even, there is no middle value and the median is the mean of the two middle values. Implement the MedianFinder class: MedianFinder() initializes the MedianFinder object. void addNum(int num) adds the integer num from the data stream to the data structure. double findMedian() returns the median of all elements so far. Answers within 10^-5 of the actual answer will be accepted.

Constraints

  • -10^5 <= num <= 10^5
  • There will be at least one element in the data structure before calling findMedian.
  • At most 5 * 10^4 calls will be made to addNum and findMedian.

Examples

Example 1

Input
MedianFinder(); addNum(1); addNum(2); findMedian(); addNum(3); findMedian()
Output
[null,null,null,1.5,null,2.0]

Approaches

1. Sorted-array insertion

Keep a sorted array. addNum does a binary search + array splice; findMedian returns the middle.

Time
O(n) per addNum, O(1) per findMedian
Space
O(n)
class MedianFinderArr {
  constructor() {
    this.arr = [];
  }
  addNum(num) {
    let lo = 0, hi = this.arr.length;
    while (lo < hi) {
      const mid = (lo + hi) >> 1;
      if (this.arr[mid] < num) lo = mid + 1;
      else hi = mid;
    }
    this.arr.splice(lo, 0, num);
  }
  findMedian() {
    const n = this.arr.length;
    if (n % 2 === 1) return this.arr[(n - 1) / 2];
    return (this.arr[n / 2 - 1] + this.arr[n / 2]) / 2;
  }
}

Tradeoff: Simplest but the splice is O(n) per insert. Times out around n=2*10^4. Use it only to motivate the two-heap structure.

2. Two heaps (max-heap low + min-heap high)

Maintain a max-heap of the lower half and a min-heap of the upper half. Keep sizes within 1; median = top of larger half, or average of tops if equal.

Time
O(log n) per addNum, O(1) per findMedian
Space
O(n)
class MaxHeap {
  constructor() { this.h = []; }
  size() { return this.h.length; }
  peek() { return this.h[0]; }
  push(v) {
    this.h.push(v);
    let i = this.h.length - 1;
    while (i > 0) {
      const p = (i - 1) >> 1;
      if (this.h[p] >= this.h[i]) break;
      [this.h[p], this.h[i]] = [this.h[i], this.h[p]];
      i = p;
    }
  }
  pop() {
    const top = this.h[0];
    const last = this.h.pop();
    if (this.h.length) {
      this.h[0] = last;
      let i = 0;
      for (;;) {
        const l = 2 * i + 1;
        const r = 2 * i + 2;
        let m = i;
        if (l < this.h.length && this.h[l] > this.h[m]) m = l;
        if (r < this.h.length && this.h[r] > this.h[m]) m = r;
        if (m === i) break;
        [this.h[i], this.h[m]] = [this.h[m], this.h[i]];
        i = m;
      }
    }
    return top;
  }
}
class MinHeap {
  constructor() { this.h = []; }
  size() { return this.h.length; }
  peek() { return this.h[0]; }
  push(v) {
    this.h.push(v);
    let i = this.h.length - 1;
    while (i > 0) {
      const p = (i - 1) >> 1;
      if (this.h[p] <= this.h[i]) break;
      [this.h[p], this.h[i]] = [this.h[i], this.h[p]];
      i = p;
    }
  }
  pop() {
    const top = this.h[0];
    const last = this.h.pop();
    if (this.h.length) {
      this.h[0] = last;
      let i = 0;
      for (;;) {
        const l = 2 * i + 1;
        const r = 2 * i + 2;
        let m = i;
        if (l < this.h.length && this.h[l] < this.h[m]) m = l;
        if (r < this.h.length && this.h[r] < this.h[m]) m = r;
        if (m === i) break;
        [this.h[i], this.h[m]] = [this.h[m], this.h[i]];
        i = m;
      }
    }
    return top;
  }
}
class MedianFinder {
  constructor() {
    this.low = new MaxHeap();
    this.high = new MinHeap();
  }
  addNum(num) {
    if (this.low.size() === 0 || num <= this.low.peek()) this.low.push(num);
    else this.high.push(num);
    if (this.low.size() > this.high.size() + 1) this.high.push(this.low.pop());
    else if (this.high.size() > this.low.size()) this.low.push(this.high.pop());
  }
  findMedian() {
    if (this.low.size() > this.high.size()) return this.low.peek();
    return (this.low.peek() + this.high.peek()) / 2;
  }
}

Tradeoff: The Atlassian-expected answer. O(log n) per insert and O(1) median. The size-rebalancing is the subtle part — get the invariant right (low_size >= high_size, differing by at most 1) and the rest falls out.

Atlassian-specific tips

Atlassian explicitly grades you on stating the heap invariant out loud BEFORE writing addNum: 'low is a max-heap of the smaller half; high is a min-heap of the larger half; sizes differ by at most 1, with low being the larger half on ties'. If you write the code without articulating this, expect the interviewer to ask 'and what's your invariant?'. State it first to avoid losing the question mid-coding when an edge case forces you to re-derive it.

Common mistakes

  • Letting the heap sizes drift more than 1 apart — the median becomes wrong silently.
  • On the size-1 case, returning low.peek() but low is empty (you pushed the first num into high). The 'push to low if empty or num <= low.peek()' rule prevents this — state it.
  • Returning low.peek() always when sizes are equal — that's a bug; you must average the two tops.

Follow-up questions

An interviewer at Atlassian may pivot to one of these next:

  • Sliding Window Median (LeetCode 480) — same data structure but with deletions; multi-set or lazy-deletion two-heap.
  • What if the stream is multi-billion and won't fit in memory? Reservoir sampling or quantile sketches (t-digest).
  • What if you needed an arbitrary percentile, not just median? Same two-heap with adjusted size invariant.

Solve it now

Free. No sign-up. Python and JavaScript run instantly in your browser.

Output

Press Run or Cmd+Enter to execute

FAQ

Why max-heap on the low half and min-heap on the high half?

Because you only ever need the TOP of each half to compute the median. The max of the lower half + the min of the upper half are the two values that straddle the middle. Each heap's natural peek gives you exactly what you need in O(1).

Can I use a balanced BST instead?

Yes, and it's a clean answer if your language has one (Java TreeMap, C++ multiset). In JS you would hand-write a treap or skip-list, which is a 200-line answer. Stick with two heaps unless the interviewer explicitly asks for deletions, where the BST starts to win.

Free learning resources

Curated free links for this problem.

Practice these live with InterviewChamp.AI

Drill Find Median from Data Stream and other Atlassian interview questions under real-loop conditions with instant feedback on your reasoning, complexity claims, and code.

Practice these live with InterviewChamp.AI →