Skip to main content

347. Top K Frequent Elements

mediumAsked at Hugging Face

Return the k most frequent elements in an array. Hugging Face uses this to probe heap vs. bucket sort thinking — directly analogous to computing the top-k tokens by log probability during beam search or finding the most frequent terms in a large text dataset.

By Alex Chen, Founder, InterviewChamp.AI · Last verified

Source citations

Public interview reports confirming this problem appears in Hugging Face loops.

  • Glassdoor (2026-Q1)Hugging Face SWE onsite reports cite Top K Frequent as a classic frequency-counting medium problem.
  • Blind (2025-11)Hugging Face threads list this as a medium problem emphasizing trade-offs between heap and bucket-sort approaches.

Problem

Given an integer array nums and an integer k, return the k most frequent elements. You may return the answer in any order.

Constraints

  • 1 <= nums.length <= 10^5
  • −10^4 <= nums[i] <= 10^4
  • k is in the range [1, the number of unique elements in the array].
  • It is guaranteed that the answer is unique.

Examples

Example 1

Input
nums = [1,1,1,2,2,3], k = 2
Output
[1,2]

Explanation: 1 appears 3 times, 2 appears 2 times.

Example 2

Input
nums = [1], k = 1
Output
[1]

Explanation: Single element.

Approaches

1. Min-heap of size k

Count frequencies with a Map. Maintain a min-heap of (frequency, element) pairs of size k. If the heap grows beyond k, pop the minimum. The heap retains the top k.

Time
O(n log k)
Space
O(n)
function topKFrequent(nums, k) {
  const freq = new Map();
  for (const n of nums) freq.set(n, (freq.get(n) || 0) + 1);
  // JS doesn't have a built-in heap; simulate with sorted slice for interview context
  const entries = Array.from(freq.entries());
  entries.sort((a, b) => b[1] - a[1]);
  return entries.slice(0, k).map(e => e[0]);
}

Tradeoff: The sort-based version shown here is O(n log n). In a language with a built-in priority queue (Python heapq), use a min-heap for true O(n log k). In JS interviews, explain the heap approach conceptually and note the sort is a pragmatic stand-in.

2. Bucket sort (O(n))

Create a bucket array indexed by frequency (size n+1). Each bucket holds elements with that frequency. Scan from the highest-frequency bucket down and collect k elements.

Time
O(n)
Space
O(n)
function topKFrequent(nums, k) {
  const freq = new Map();
  for (const n of nums) freq.set(n, (freq.get(n) || 0) + 1);
  const buckets = Array.from({ length: nums.length + 1 }, () => []);
  for (const [num, count] of freq) buckets[count].push(num);
  const result = [];
  for (let i = buckets.length - 1; i >= 0 && result.length < k; i--) {
    result.push(...buckets[i]);
  }
  return result.slice(0, k);
}

Tradeoff: O(n) time — beats the O(n log n) sort and O(n log k) heap. The key insight: frequency is bounded by n, so bucket-indexing by frequency is valid. This is the answer that impresses Hugging Face interviewers.

Hugging Face-specific tips

Hugging Face interviewers will ask: 'Can you do better than O(n log n)?' Lead with the bucket sort after demonstrating you know the heap approach: 'Frequency is naturally bounded by n, so we can index directly by frequency — that's O(n) and beats the heap.' This trade-off articulation — knowing multiple approaches and explaining when to use each — is exactly what Hugging Face values in ML infrastructure engineers who must choose the right algorithm for large-scale data processing.

Common mistakes

  • Sorting the entire frequency map — O(n log n), acceptable but not the optimal answer.
  • Forgetting that multiple elements can share the same frequency — buckets need to be arrays, not scalars.
  • Off-by-one in bucket array size — bucket index can be at most nums.length, so size must be n+1.
  • Returning more than k elements when multiple elements share the k-th frequency — use slice(0,k).

Follow-up questions

An interviewer at Hugging Face may pivot to one of these next:

  • Top K Frequent Words (LC 692) — same approach, but sort alphabetically for ties.
  • K Closest Points to Origin (LC 973) — use a max-heap of size k.
  • How would you compute the top-k most frequent tokens across a distributed corpus of 10^12 tokens?

Solve it now

Free. No sign-up. Python and JavaScript run instantly in your browser.

Output

Press Run or Cmd+Enter to execute

FAQ

Why does bucket sort work here but not in general?

Bucket sort requires a bounded, enumerable key domain. Here frequency is an integer in [1, n], which fits perfectly. For arbitrary keys (e.g., floats), bucket sort doesn't apply.

When would you prefer the heap over bucket sort?

When memory is a constraint and n is very large, a min-heap of size k uses O(k) space vs. O(n) for bucket sort. For streaming inputs where n is not known in advance, the heap is also more appropriate.

Does the problem guarantee a unique answer?

Yes — the constraints say the answer is unique, meaning there's no ambiguity at the k-th boundary. In practice, still use slice(0,k) defensively.

Practice these live with InterviewChamp.AI

Drill Top K Frequent Elements and other Hugging Face interview questions under real-loop conditions with instant feedback on your reasoning, complexity claims, and code.

Practice these live with InterviewChamp.AI →