49. Group Anagrams
mediumAsked at CohereGroup strings that are anagrams of one another. Cohere asks this because canonical-key hashing mirrors how embedding models deduplicate semantically-equivalent queries before routing them to a retrieval index.
By Alex Chen, Founder, InterviewChamp.AI · Last verified
Source citations
Public interview reports confirming this problem appears in Cohere loops.
- Glassdoor (2025-Q4)— Cohere MLE and SWE candidates cite Group Anagrams as a frequent medium question in coding rounds.
- Blind (2025-09)— Mentioned in Cohere prep guides as a standard hashing-and-grouping problem.
Problem
Given an array of strings strs, group the anagrams together. You can return the answer in any order. An anagram is a word or phrase formed by rearranging the letters of a different word or phrase, using all the original letters exactly once.
Constraints
1 <= strs.length <= 10^40 <= strs[i].length <= 100strs[i] consists of lowercase English letters.
Examples
Example 1
strs = ["eat","tea","tan","ate","nat","bat"][["bat"],["nat","tan"],["ate","eat","tea"]]Explanation: Three anagram groups.
Example 2
strs = [""][[""]]Explanation: Single empty string group.
Example 3
strs = ["a"][["a"]]Explanation: Single character group.
Approaches
1. Sort each string as key
For each word, sort its characters to produce a canonical key. Group words by their canonical key in a Map.
- Time
- O(n · k log k) where k is max string length
- Space
- O(n · k)
function groupAnagrams(strs) {
const map = new Map();
for (const s of strs) {
const key = s.split('').sort().join('');
if (!map.has(key)) map.set(key, []);
map.get(key).push(s);
}
return [...map.values()];
}Tradeoff: Simple and correct. Sorting costs O(k log k) per string. For small alphabets a character-count key is asymptotically faster.
2. Character-count key
Build a 26-element frequency array for each string and use its string representation as the Map key. Avoids sorting.
- Time
- O(n · k)
- Space
- O(n · k)
function groupAnagrams(strs) {
const map = new Map();
for (const s of strs) {
const count = new Array(26).fill(0);
for (const ch of s) count[ch.charCodeAt(0) - 97]++;
const key = count.join(',');
if (!map.has(key)) map.set(key, []);
map.get(key).push(s);
}
return [...map.values()];
}Tradeoff: O(n·k) time — eliminates the sort. The key is a comma-joined frequency string. Preferred when k is large and the alphabet is fixed.
Cohere-specific tips
Cohere engineers think about canonical representations constantly — embeddings, normalised queries, deduplicated document hashes. When presenting this problem, frame canonical-key grouping as: 'I need a representation that is the same for all members of the equivalence class and different across classes.' Then discuss which canonical form is cheaper: sorted characters (O(k log k)) vs frequency vector (O(k)). Cohere teams working on retrieval will find the frequency-vector approach conceptually closer to TF-IDF and bag-of-words indexing.
Common mistakes
- Using array reference as a Map key — two distinct arrays with identical content are not equal in JS. Always join to a string key.
- Not handling empty strings — sorted empty string is an empty string, which is a valid key.
- Sorting in-place and modifying the original string — split first to preserve the original.
- Using an object instead of a Map — object keys are stringified differently for non-string types, though here both work.
Follow-up questions
An interviewer at Cohere may pivot to one of these next:
- Find All Anagram Groups in a stream — how would you group anagrams arriving one at a time?
- Valid Anagram (LC 242) — check if two strings are anagrams.
- How would you group near-duplicate sentences where word order differs but vocabulary is identical?
Solve it now
Free. No sign-up. Python and JavaScript run instantly in your browser.
FAQ
Why join the count array with commas?
Without a separator, [1,10] and [11,0] would produce the same string '110'. Commas prevent key collisions.
What if strings contain uppercase or non-alpha characters?
The frequency-array approach needs a larger array or a general Map<char, count>. Mention this when discussing real-world input sanitization.
Is there a solution faster than O(n·k)?
Not in the worst case — you must read every character at least once to determine the canonical key.
Practice these live with InterviewChamp.AI
Drill Group Anagrams and other Cohere interview questions under real-loop conditions with instant feedback on your reasoning, complexity claims, and code.
Practice these live with InterviewChamp.AI →