49. Group Anagrams
mediumAsked at Hugging FaceGroup strings that are anagrams of each other. Hugging Face uses this to test canonical-key hashing — the same fingerprinting technique used to cluster semantically equivalent dataset examples or deduplicate near-duplicate model cards in the Hub.
By Alex Chen, Founder, InterviewChamp.AI · Last verified
Source citations
Public interview reports confirming this problem appears in Hugging Face loops.
- Glassdoor (2025-12)— Cited in Hugging Face SWE medium-round reports as a canonical-key grouping exercise.
- Blind (2025-09)— Hugging Face threads list Group Anagrams as a common medium problem testing hash map design.
Problem
Given an array of strings strs, group the anagrams together. You can return the answer in any order. An Anagram is a word or phrase formed by rearranging the letters of a different word or phrase, using all the original letters exactly once.
Constraints
1 <= strs.length <= 10^40 <= strs[i].length <= 100strs[i] consists of lowercase English letters.
Examples
Example 1
strs = ["eat","tea","tan","ate","nat","bat"][["bat"],["nat","tan"],["ate","eat","tea"]]Explanation: "eat", "tea", "ate" are anagrams; "tan", "nat" are anagrams; "bat" stands alone.
Example 2
strs = [""][[""]]Explanation: Single empty string.
Example 3
strs = ["a"][["a"]]Explanation: Single character.
Approaches
1. Sorted-string key
Sort each string to produce a canonical key. Strings that are anagrams will have the same sorted key. Group them in a Map.
- Time
- O(n * k log k) where k is max string length
- Space
- O(n * k)
function groupAnagrams(strs) {
const map = new Map();
for (const str of strs) {
const key = str.split('').sort().join('');
if (!map.has(key)) map.set(key, []);
map.get(key).push(str);
}
return Array.from(map.values());
}Tradeoff: Simple and correct. The sort is O(k log k) per string. For short strings (k ≤ 100) this is essentially O(1) per string, making the total O(n). Preferred in most interview settings.
2. Character-count key
Build a 26-element frequency count for each string and use it as a key. No sorting required — O(k) per string.
- Time
- O(n * k)
- Space
- O(n * k)
function groupAnagrams(strs) {
const map = new Map();
for (const str of strs) {
const count = new Array(26).fill(0);
for (const ch of str) count[ch.charCodeAt(0) - 97]++;
const key = count.join(',');
if (!map.has(key)) map.set(key, []);
map.get(key).push(str);
}
return Array.from(map.values());
}Tradeoff: Strictly O(n * k) — better than sort for very long strings. Mention this optimization if the interviewer asks about the bottleneck or if k can be large.
Hugging Face-specific tips
Hugging Face will often follow up: 'What if the strings were Unicode and not just lowercase letters?' Discuss generalizing the character-count key to a Map<char, count> or using a sorted-character key (which already handles Unicode). Connect to their domain: 'The same canonical-fingerprint approach is how dataset deduplication works — you hash a normalized version of each example and group by hash.' This shows ML systems awareness.
Common mistakes
- Using the original string as a key — anagrams have different strings but the same sorted key.
- Sorting by character code but forgetting that sort() is lexicographic by default — always pass a comparator or use split/sort/join for strings.
- Returning the map directly instead of converting to Array.from(map.values()).
- Not handling empty strings — an empty string sorts to an empty string, which is a valid key.
Follow-up questions
An interviewer at Hugging Face may pivot to one of these next:
- Group Shifted Strings (LC 249) — group strings whose character differences are the same.
- How would you scale this to group anagrams in a corpus of 10^9 strings using MapReduce?
- What canonical key would you use if the alphabet was Unicode instead of just lowercase English letters?
Solve it now
Free. No sign-up. Python and JavaScript run instantly in your browser.
FAQ
Why is sorted-string key acceptable despite O(k log k)?
In practice k ≤ 100, so the sort is effectively O(1). For very large k (e.g., genomic sequences), the character-count key at O(k) is strictly better.
Can two different strings have the same sorted key but not be anagrams?
No — sorting is a bijection on multisets of characters. Same sorted string ↔ same character multiset ↔ anagram.
Is the output order important?
No — 'return the answer in any order' means groups and their internal order can vary.
Practice these live with InterviewChamp.AI
Drill Group Anagrams and other Hugging Face interview questions under real-loop conditions with instant feedback on your reasoning, complexity claims, and code.
Practice these live with InterviewChamp.AI →