Skip to main content

347. Top K Frequent Elements

mediumAsked at Citadel

Top-K selection shows up constantly in quantitative finance: top-K movers by volume, top-K securities by realized volatility, top-K events by arrival rate. Citadel expects you to know all three approaches — sort, min-heap, and bucket sort — and choose the right one based on the relationship between n and k.

By Alex Chen, Founder, InterviewChamp.AI · Last verified

Source citations

Public interview reports confirming this problem appears in Citadel loops.

  • Glassdoor (2026-Q1)Citadel SWE candidates report Top K Frequent Elements and heap-based selection problems as recurring mid-round questions.
  • Blind (2025-11)Citadel SWE threads recommend knowing both heap and bucket approaches for top-K problems before interviewing.

Problem

Given an integer array nums and an integer k, return the k most frequent elements. You may return the answer in any order.

Constraints

  • 1 <= nums.length <= 10^5
  • −10^4 <= nums[i] <= 10^4
  • k is in the range [1, the number of unique elements in the array].
  • It is guaranteed that the answer is unique.

Examples

Example 1

Input
nums = [1,1,1,2,2,3], k = 2
Output
[1,2]

Explanation: 1 appears 3 times, 2 appears 2 times — they are the top 2.

Example 2

Input
nums = [1], k = 1
Output
[1]

Explanation: Single element, k = 1.

Approaches

1. Min-heap of size k

Count frequencies with a Map. Maintain a min-heap of size k (keyed by frequency). When the heap exceeds size k, pop the minimum. The heap holds the k most frequent elements.

Time
O(n log k)
Space
O(n)
// JavaScript lacks a built-in heap; simulate with sorted array for interview.
// Production: use a proper min-heap (binary heap).
function topKFrequent(nums, k) {
  const freq = new Map();
  for (const n of nums) freq.set(n, (freq.get(n) || 0) + 1);

  // Convert to [count, num] pairs and sort by count descending
  // (in an interview without a heap library, this is acceptable)
  const entries = [...freq.entries()].map(([num, cnt]) => [cnt, num]);
  entries.sort((a, b) => b[0] - a[0]);
  return entries.slice(0, k).map(([, num]) => num);
}

Tradeoff: The sort version is O(n log n) — fine when n is small. In an interview where you can't use a heap library, this is the practical substitute. Explain the heap version conceptually: maintain a min-heap of size k; if heap.size > k, pop the minimum-frequency element. That gives O(n log k) which beats O(n log n) when k << n.

2. Bucket sort (O(n))

Frequencies range from 1 to n. Create an array of n+1 buckets where bucket[i] holds all numbers with frequency i. Then read from the highest-frequency bucket downward until k elements are collected.

Time
O(n)
Space
O(n)
function topKFrequent(nums, k) {
  const freq = new Map();
  for (const n of nums) freq.set(n, (freq.get(n) || 0) + 1);

  // Bucket: index = frequency, value = list of numbers with that frequency
  const buckets = Array.from({ length: nums.length + 1 }, () => []);
  for (const [num, cnt] of freq) {
    buckets[cnt].push(num);
  }

  // Collect from highest frequency downward
  const result = [];
  for (let i = buckets.length - 1; i >= 0 && result.length < k; i--) {
    result.push(...buckets[i]);
  }
  return result.slice(0, k);
}

Tradeoff: O(n) time and space — optimal. The key insight is that frequency is bounded by n, so bucket sort applies. This is the showstopper answer that Citadel rewards. State it after presenting the heap approach.

Citadel-specific tips

Present approaches in order of increasing sophistication: sort (O(n log n)) → heap (O(n log k)) → bucket sort (O(n)). Citadel interviewers expect you to drive toward the O(n) solution unprompted. Then connect to finance: 'In a real-time market scanner tracking the top-k most active symbols, I'd use a min-heap maintained incrementally as events arrive — O(log k) per update, never sorting the whole dataset.' This shows you're thinking about streaming data, not just batch processing.

Common mistakes

  • Stopping at the O(n log n) sort without mentioning the heap or bucket approaches.
  • Implementing bucket sort but off-by-one on the bucket array size — frequencies go up to nums.length, so the array must be size n+1 (indices 0 to n).
  • Using Object instead of Map for frequency counting — Object keys are strings, causing issues with integer keys like '0' vs 0.
  • In the heap approach, forgetting to heapify by frequency not by value — you're selecting by count, not by numerical order.

Follow-up questions

An interviewer at Citadel may pivot to one of these next:

  • Kth Largest Element in an Array (LC 215) — quickselect for O(n) average.
  • Sort Characters by Frequency (LC 451) — sort a string's characters by frequency.
  • How would you maintain the top-k most frequent elements in a streaming context?

Solve it now

Free. No sign-up. Python and JavaScript run instantly in your browser.

Output

Press Run or Cmd+Enter to execute

FAQ

When does the heap approach beat the bucket sort approach?

Never asymptotically — bucket sort is O(n). But when n is very large and k is tiny, a heap might use less memory (O(k) vs O(n)) in a production implementation. In this problem, the frequency Map already costs O(n), so bucket sort's space is not worse.

Why does bucket sort work here but not on arbitrary integers?

Because the frequency values are bounded: they range from 1 to n. Standard bucket/counting sort requires a bounded key range.

Is the problem asking for frequency, not value?

Yes — return the k elements with the highest occurrence counts, not the k numerically largest elements. A common trap is confusing 'top k frequent' with 'top k largest'.

Practice these live with InterviewChamp.AI

Drill Top K Frequent Elements and other Citadel interview questions under real-loop conditions with instant feedback on your reasoning, complexity claims, and code.

Practice these live with InterviewChamp.AI →