
19. Top K Frequent Elements

Medium · Asked at Databricks

Return the k most frequent integers — the canonical heap-vs-bucket-sort duel that Databricks maps directly to top-N analytics queries and the cardinality-estimation problems inside Delta Live Tables.

By Alex Chen, Founder, InterviewChamp.AI · Last verified

Problem

Given an integer array nums and an integer k, return the k most frequent elements. You may return the answer in any order. The algorithm must run in better than O(n log n) time.

Constraints

  • 1 <= nums.length <= 10^5
  • k is in the range [1, the number of unique elements in nums]
  • The answer is guaranteed to be unique

Examples

Example 1

Input
nums = [1,1,1,2,2,3], k = 2
Output
[1,2]

Explanation: 1 appears 3 times and 2 appears 2 times; both beat 3, which appears once.

Example 2

Input
nums = [1], k = 1
Output
[1]

Approaches

1. Sort by frequency

Build a frequency map, convert it to an array of [element, count] pairs, sort descending by count, and take the first k. The sort makes this O(n log n), which violates the problem's time constraint.

Time: O(n log n)
Space: O(n)

function topKFrequent(nums, k) {
  const freq = new Map();
  for (const n of nums) freq.set(n, (freq.get(n) || 0) + 1);
  return [...freq.entries()]
    .sort((a, b) => b[1] - a[1])
    .slice(0, k)
    .map(([num]) => num);
}

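Tradeoff: Shortest to write and easy to verify, but the sort dominates at O(n log n), so it misses the problem's better-than-O(n log n) requirement.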

2. Bucket sort — O(n)

Since frequencies range from 1 to n, bucket elements by frequency into an array of length n+1, then scan buckets from high to low to collect k elements.

Time: O(n)
Space: O(n)

function topKFrequent(nums, k) {
  const freq = new Map();
  for (const n of nums) freq.set(n, (freq.get(n) || 0) + 1);

  const buckets = Array.from({ length: nums.length + 1 }, () => []);
  for (const [num, cnt] of freq) buckets[cnt].push(num);

  const result = [];
  for (let i = buckets.length - 1; i >= 1 && result.length < k; i--) {
    for (const num of buckets[i]) {
      result.push(num);
      if (result.length === k) break;
    }
  }
  return result;
}
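
To make the bucket layout concrete, here is a small sketch that builds only the buckets for a given input. The helper name buildBuckets is hypothetical and introduced purely for illustration; it repeats the first two steps of the function above.

```javascript
// Hypothetical helper (not part of the original solution): returns the
// frequency buckets so the layout can be inspected directly.
function buildBuckets(nums) {
  const freq = new Map();
  for (const n of nums) freq.set(n, (freq.get(n) || 0) + 1);

  // Index = frequency. Index 0 stays empty because every value that
  // appears in the map occurs at least once.
  const buckets = Array.from({ length: nums.length + 1 }, () => []);
  for (const [num, cnt] of freq) buckets[cnt].push(num);
  return buckets;
}

// Example 1: 1 occurs 3 times, 2 occurs twice, 3 occurs once.
console.log(JSON.stringify(buildBuckets([1, 1, 1, 2, 2, 3])));
// [[],[3],[2],[1],[],[],[]]
```

Scanning this array from the highest index down reaches bucket 3 first (value 1), then bucket 2 (value 2), which yields [1,2] for k = 2.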

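Tradeoff: Linear time, but it allocates nums.length + 1 buckets even when the input has only a few distinct values.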

Databricks-specific tips

Databricks probes whether you know both solutions and when to use each: a min-heap gives you O(n log k), which wins when k << n and you're streaming data (you can't hold the full array); bucket sort wins on batch. Articulate that tradeoff, because they hire engineers who think in terms of data-pipeline topology, not just algorithmic elegance.
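
The min-heap variant mentioned above is not shown on this page, so the following is a minimal sketch of one way to write it, not a reference implementation. JavaScript has no built-in priority queue, so the MinHeap class and the function name topKFrequentHeap are assumptions made for illustration. The heap never holds more than k + 1 entries, which is where the log k factor comes from; this version still builds a full frequency map first.

```javascript
// Hypothetical helper: a minimal binary min-heap of [num, count] pairs,
// ordered by count.
class MinHeap {
  constructor() { this.items = []; }
  get size() { return this.items.length; }

  push(entry) {
    const a = this.items;
    a.push(entry);
    let i = a.length - 1;
    // Sift up while the parent has a larger count.
    while (i > 0) {
      const parent = (i - 1) >> 1;
      if (a[parent][1] <= a[i][1]) break;
      [a[parent], a[i]] = [a[i], a[parent]];
      i = parent;
    }
  }

  pop() {
    const a = this.items;
    const top = a[0];
    const last = a.pop();
    if (a.length > 0) {
      a[0] = last;
      let i = 0;
      // Sift down, swapping with the smaller child.
      for (;;) {
        const left = 2 * i + 1;
        const right = left + 1;
        let smallest = i;
        if (left < a.length && a[left][1] < a[smallest][1]) smallest = left;
        if (right < a.length && a[right][1] < a[smallest][1]) smallest = right;
        if (smallest === i) break;
        [a[smallest], a[i]] = [a[i], a[smallest]];
        i = smallest;
      }
    }
    return top;
  }
}

function topKFrequentHeap(nums, k) {
  const freq = new Map();
  for (const n of nums) freq.set(n, (freq.get(n) || 0) + 1);

  const heap = new MinHeap();
  for (const entry of freq) {
    heap.push(entry);
    if (heap.size > k) heap.pop(); // evict the least frequent candidate
  }
  // Heap order is arbitrary, which the problem statement permits.
  return heap.items.map(([num]) => num);
}
```

For nums = [1,1,1,2,2,3] and k = 2 this returns the values 1 and 2 in heap order rather than sorted by frequency.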

