347. Top K Frequent Elements
mediumAsked at Hugging FaceReturn the k most frequent elements in an array. Hugging Face uses this to probe heap vs. bucket sort thinking — directly analogous to computing the top-k tokens by log probability during beam search or finding the most frequent terms in a large text dataset.
By Alex Chen, Founder, InterviewChamp.AI · Last verified
Source citations
Public interview reports confirming this problem appears in Hugging Face loops.
- Glassdoor (2026-Q1)— Hugging Face SWE onsite reports cite Top K Frequent as a classic frequency-counting medium problem.
- Blind (2025-11)— Hugging Face threads list this as a medium problem emphasizing trade-offs between heap and bucket-sort approaches.
Problem
Given an integer array nums and an integer k, return the k most frequent elements. You may return the answer in any order.
Constraints
1 <= nums.length <= 10^5−10^4 <= nums[i] <= 10^4k is in the range [1, the number of unique elements in the array].It is guaranteed that the answer is unique.
Examples
Example 1
nums = [1,1,1,2,2,3], k = 2[1,2]Explanation: 1 appears 3 times, 2 appears 2 times.
Example 2
nums = [1], k = 1[1]Explanation: Single element.
Approaches
1. Min-heap of size k
Count frequencies with a Map. Maintain a min-heap of (frequency, element) pairs of size k. If the heap grows beyond k, pop the minimum. The heap retains the top k.
- Time
- O(n log k)
- Space
- O(n)
function topKFrequent(nums, k) {
const freq = new Map();
for (const n of nums) freq.set(n, (freq.get(n) || 0) + 1);
// JS doesn't have a built-in heap; simulate with sorted slice for interview context
const entries = Array.from(freq.entries());
entries.sort((a, b) => b[1] - a[1]);
return entries.slice(0, k).map(e => e[0]);
}Tradeoff: The sort-based version shown here is O(n log n). In a language with a built-in priority queue (Python heapq), use a min-heap for true O(n log k). In JS interviews, explain the heap approach conceptually and note the sort is a pragmatic stand-in.
2. Bucket sort (O(n))
Create a bucket array indexed by frequency (size n+1). Each bucket holds elements with that frequency. Scan from the highest-frequency bucket down and collect k elements.
- Time
- O(n)
- Space
- O(n)
function topKFrequent(nums, k) {
const freq = new Map();
for (const n of nums) freq.set(n, (freq.get(n) || 0) + 1);
const buckets = Array.from({ length: nums.length + 1 }, () => []);
for (const [num, count] of freq) buckets[count].push(num);
const result = [];
for (let i = buckets.length - 1; i >= 0 && result.length < k; i--) {
result.push(...buckets[i]);
}
return result.slice(0, k);
}Tradeoff: O(n) time — beats the O(n log n) sort and O(n log k) heap. The key insight: frequency is bounded by n, so bucket-indexing by frequency is valid. This is the answer that impresses Hugging Face interviewers.
Hugging Face-specific tips
Hugging Face interviewers will ask: 'Can you do better than O(n log n)?' Lead with the bucket sort after demonstrating you know the heap approach: 'Frequency is naturally bounded by n, so we can index directly by frequency — that's O(n) and beats the heap.' This trade-off articulation — knowing multiple approaches and explaining when to use each — is exactly what Hugging Face values in ML infrastructure engineers who must choose the right algorithm for large-scale data processing.
Common mistakes
- Sorting the entire frequency map — O(n log n), acceptable but not the optimal answer.
- Forgetting that multiple elements can share the same frequency — buckets need to be arrays, not scalars.
- Off-by-one in bucket array size — bucket index can be at most nums.length, so size must be n+1.
- Returning more than k elements when multiple elements share the k-th frequency — use slice(0,k).
Follow-up questions
An interviewer at Hugging Face may pivot to one of these next:
- Top K Frequent Words (LC 692) — same approach, but sort alphabetically for ties.
- K Closest Points to Origin (LC 973) — use a max-heap of size k.
- How would you compute the top-k most frequent tokens across a distributed corpus of 10^12 tokens?
Solve it now
Free. No sign-up. Python and JavaScript run instantly in your browser.
FAQ
Why does bucket sort work here but not in general?
Bucket sort requires a bounded, enumerable key domain. Here frequency is an integer in [1, n], which fits perfectly. For arbitrary keys (e.g., floats), bucket sort doesn't apply.
When would you prefer the heap over bucket sort?
When memory is a constraint and n is very large, a min-heap of size k uses O(k) space vs. O(n) for bucket sort. For streaming inputs where n is not known in advance, the heap is also more appropriate.
Does the problem guarantee a unique answer?
Yes — the constraints say the answer is unique, meaning there's no ambiguity at the k-th boundary. In practice, still use slice(0,k) defensively.
Practice these live with InterviewChamp.AI
Drill Top K Frequent Elements and other Hugging Face interview questions under real-loop conditions with instant feedback on your reasoning, complexity claims, and code.
Practice these live with InterviewChamp.AI →