49. Group Anagrams
mediumAsked at HPHP document and content management systems aggregate files with equivalent names or metadata under a unified key. Group Anagrams is a proxy for that canonical key-design question: how do you derive a consistent, collision-free key from unordered data? HP uses it to test hashing intuition.
By Alex Chen, Founder, InterviewChamp.AI · Last verified
Source citations
Public interview reports confirming this problem appears in HP loops.
- Glassdoor (2025-Q4)— HP backend SWE interviewees report Group Anagrams as a common second-round question focused on hash-map key design.
- Blind (2025-10)— HP interview prep threads list Group Anagrams as a standard medium problem in technical screens for SWE roles.
Problem
Given an array of strings strs, group the anagrams together. You can return the answer in any order. An Anagram is a word or phrase formed by rearranging the letters of a different word or phrase, typically using all the original letters exactly once.
Constraints
1 <= strs.length <= 10^40 <= strs[i].length <= 100strs[i] consists of lowercase English letters.
Examples
Example 1
strs = ["eat","tea","tan","ate","nat","bat"][["bat"],["nat","tan"],["ate","eat","tea"]]Explanation: eat/tea/ate are anagrams; tan/nat are anagrams; bat stands alone.
Example 2
strs = [""][[""]]Explanation: Single empty string forms one group.
Approaches
1. Sorted key
Sort each string's characters to produce a canonical key. Group strings by key in a map.
- Time
- O(n * k log k)
- Space
- O(n * k)
function groupAnagrams(strs) {
const map = new Map();
for (const s of strs) {
const key = s.split('').sort().join('');
if (!map.has(key)) map.set(key, []);
map.get(key).push(s);
}
return [...map.values()];
}Tradeoff: Simple and correct. O(k log k) per string for sorting. Suitable for most interview contexts.
2. Character frequency key
Represent each string as a 26-element frequency count array. Serialize it to a string key. Avoids sorting.
- Time
- O(n * k)
- Space
- O(n * k)
function groupAnagrams(strs) {
const map = new Map();
for (const s of strs) {
const count = new Array(26).fill(0);
for (const ch of s) count[ch.charCodeAt(0) - 97]++;
const key = count.join('#'); // e.g. '1#0#0#...#1#...'
if (!map.has(key)) map.set(key, []);
map.get(key).push(s);
}
return [...map.values()];
}Tradeoff: O(n * k) — no sorting overhead. The separator '#' prevents key collisions between different frequency tuples. Better for large strings or when the alphabet is bounded.
HP-specific tips
HP interviewers appreciate discussing the key-design trade-off explicitly: sorted key is simpler; frequency-count key is faster. Point out that the '#' separator is critical in the frequency approach — without it, [1,10] and [11,0] would produce the same string. For HP's document processing context, mention that this scales to millions of files with O(1) lookup per file after the initial pass.
Common mistakes
- Forgetting the separator in the frequency key — leads to hash collisions.
- Using an object instead of a Map — object keys are strings which coerce types and don't preserve insertion order in older JS environments.
- Sorting the output groups — the problem says order doesn't matter, so sorting wastes time.
- Not handling empty strings — an empty string has all-zero frequency counts and maps to its own group.
Follow-up questions
An interviewer at HP may pivot to one of these next:
- Group words that are scrambles of each other where repetition is allowed (multi-sets).
- What if strings contain Unicode characters beyond the 26 English letters?
- How would you shard this grouping across multiple machines for a billion-string input?
Solve it now
Free. No sign-up. Python and JavaScript run instantly in your browser.
FAQ
Why use '#' as separator in the frequency key?
Consider counts [1, 10, ...] and [11, 0, ...] — joined without a separator both produce '110...'. The separator makes each position unambiguous.
Which approach is better for large strings?
The frequency-count approach is O(k) per string versus O(k log k) for sorting. For strings with thousands of characters the difference is measurable.
Is the output order deterministic?
Not guaranteed by the problem spec — any ordering of groups, and any ordering within groups, is valid.
Practice these live with InterviewChamp.AI
Drill Group Anagrams and other HP interview questions under real-loop conditions with instant feedback on your reasoning, complexity claims, and code.
Practice these live with InterviewChamp.AI →