49. Group Anagrams
mediumAsked at GleanGlean asks this because normalizing and bucketing strings by a canonical key is the foundation of term-normalization in search indexing — the same logic that maps 'ran', 'nar', and 'arn' to a single canonical form for retrieval.
By Alex Chen, Founder, InterviewChamp.AI · Last verified
Source citations
Public interview reports confirming this problem appears in Glean loops.
- Glassdoor (2025-12)— Cited in Glean onsite notes as a favorite medium that tests hashing and canonical-key design.
- Blind (2025-10)— Glean engineering threads list Group Anagrams as a high-signal problem due to its direct relevance to search term normalization.
Problem
Given an array of strings strs, group the anagrams together. You can return the answer in any order. An anagram is a word formed by rearranging the letters of another, using all the original letters exactly once.
Constraints
1 <= strs.length <= 10^40 <= strs[i].length <= 100strs[i] consists of lowercase English letters.
Examples
Example 1
strs = ["eat","tea","tan","ate","nat","bat"][["bat"],["nat","tan"],["ate","eat","tea"]]Explanation: Anagrams share the same sorted-character key.
Example 2
strs = [""][[""]]Explanation: Empty string maps to an empty sorted key.
Example 3
strs = ["a"][["a"]]Explanation: Single character — trivially in its own group.
Approaches
1. Sort each string as a key
For each string, sort its characters to produce a canonical key. Group strings by their key using a Map.
- Time
- O(n * k log k) where k is max string length
- Space
- O(n * k)
function groupAnagrams(strs) {
const map = new Map();
for (const s of strs) {
const key = s.split('').sort().join('');
if (!map.has(key)) map.set(key, []);
map.get(key).push(s);
}
return [...map.values()];
}Tradeoff: Simple and correct. The sorting step dominates at O(k log k) per string. This is the expected interview answer.
2. Character frequency array as key
Instead of sorting, build a 26-element frequency count array for each string and use its string representation as the key. Avoids the log-k factor.
- Time
- O(n * k)
- Space
- O(n * k)
function groupAnagrams(strs) {
const map = new Map();
for (const s of strs) {
const count = new Array(26).fill(0);
for (const ch of s) count[ch.charCodeAt(0) - 97]++;
const key = count.join(',');
if (!map.has(key)) map.set(key, []);
map.get(key).push(s);
}
return [...map.values()];
}Tradeoff: O(n * k) — better asymptotic complexity by eliminating the sort. The key is a string like '1,0,0,...,1,0' — slightly more verbose but O(k) to construct. Present this as an optimization when asked.
Glean-specific tips
Frame the canonical key insight early: 'Anagrams are equivalent under permutation, so I need a permutation-invariant representation — sorted characters is the simplest.' At Glean, also mention the search-indexing parallel: 'This is essentially how stemming and lemmatization work — different surface forms map to the same normalized token.' That context lands well.
Common mistakes
- Using an object {} as the map and not a Map — object keys are strings by default which works here, but Map is more explicit and handles edge cases better.
- Forgetting to initialize the array entry before pushing — check if the key exists and create an empty array if not.
- Not handling empty strings — an empty string sorts to '' and groups correctly; no special case needed.
- Joining the frequency array without a separator — '1101' is ambiguous (is it 11+0+1 or 1+1+0+1?); use a comma separator.
Follow-up questions
An interviewer at Glean may pivot to one of these next:
- Valid Anagram (LC 242) — check if two strings are anagrams of each other.
- Find All Anagrams in a String (LC 438) — sliding window to find all start indices of anagram substrings.
- What if the strings can contain non-ASCII characters? How does the key change?
Solve it now
Free. No sign-up. Python and JavaScript run instantly in your browser.
FAQ
Why does sorting work as a canonical key?
Two strings are anagrams if and only if they have the same multiset of characters. Sorting both strings produces the same output for anagrams and different output for non-anagrams.
Is the frequency-array approach always faster?
Asymptotically yes (O(k) vs O(k log k) per string), but for short strings (k ≤ 10) the constant overhead of array initialization may make sorting faster in practice. Mention this trade-off.
Practice these live with InterviewChamp.AI
Drill Group Anagrams and other Glean interview questions under real-loop conditions with instant feedback on your reasoning, complexity claims, and code.
Practice these live with InterviewChamp.AI →