Skip to main content

49. Group Anagrams

mediumAsked at HP

HP document and content management systems aggregate files with equivalent names or metadata under a unified key. Group Anagrams is a proxy for that canonical key-design question: how do you derive a consistent, collision-free key from unordered data? HP uses it to test hashing intuition.

By Alex Chen, Founder, InterviewChamp.AI · Last verified

Source citations

Public interview reports confirming this problem appears in HP loops.

  • Glassdoor (2025-Q4)HP backend SWE interviewees report Group Anagrams as a common second-round question focused on hash-map key design.
  • Blind (2025-10)HP interview prep threads list Group Anagrams as a standard medium problem in technical screens for SWE roles.

Problem

Given an array of strings strs, group the anagrams together. You can return the answer in any order. An Anagram is a word or phrase formed by rearranging the letters of a different word or phrase, typically using all the original letters exactly once.

Constraints

  • 1 <= strs.length <= 10^4
  • 0 <= strs[i].length <= 100
  • strs[i] consists of lowercase English letters.

Examples

Example 1

Input
strs = ["eat","tea","tan","ate","nat","bat"]
Output
[["bat"],["nat","tan"],["ate","eat","tea"]]

Explanation: eat/tea/ate are anagrams; tan/nat are anagrams; bat stands alone.

Example 2

Input
strs = [""]
Output
[[""]]

Explanation: Single empty string forms one group.

Approaches

1. Sorted key

Sort each string's characters to produce a canonical key. Group strings by key in a map.

Time
O(n * k log k)
Space
O(n * k)
function groupAnagrams(strs) {
  const map = new Map();
  for (const s of strs) {
    const key = s.split('').sort().join('');
    if (!map.has(key)) map.set(key, []);
    map.get(key).push(s);
  }
  return [...map.values()];
}

Tradeoff: Simple and correct. O(k log k) per string for sorting. Suitable for most interview contexts.

2. Character frequency key

Represent each string as a 26-element frequency count array. Serialize it to a string key. Avoids sorting.

Time
O(n * k)
Space
O(n * k)
function groupAnagrams(strs) {
  const map = new Map();
  for (const s of strs) {
    const count = new Array(26).fill(0);
    for (const ch of s) count[ch.charCodeAt(0) - 97]++;
    const key = count.join('#'); // e.g. '1#0#0#...#1#...'
    if (!map.has(key)) map.set(key, []);
    map.get(key).push(s);
  }
  return [...map.values()];
}

Tradeoff: O(n * k) — no sorting overhead. The separator '#' prevents key collisions between different frequency tuples. Better for large strings or when the alphabet is bounded.

HP-specific tips

HP interviewers appreciate discussing the key-design trade-off explicitly: sorted key is simpler; frequency-count key is faster. Point out that the '#' separator is critical in the frequency approach — without it, [1,10] and [11,0] would produce the same string. For HP's document processing context, mention that this scales to millions of files with O(1) lookup per file after the initial pass.

Common mistakes

  • Forgetting the separator in the frequency key — leads to hash collisions.
  • Using an object instead of a Map — object keys are strings which coerce types and don't preserve insertion order in older JS environments.
  • Sorting the output groups — the problem says order doesn't matter, so sorting wastes time.
  • Not handling empty strings — an empty string has all-zero frequency counts and maps to its own group.

Follow-up questions

An interviewer at HP may pivot to one of these next:

  • Group words that are scrambles of each other where repetition is allowed (multi-sets).
  • What if strings contain Unicode characters beyond the 26 English letters?
  • How would you shard this grouping across multiple machines for a billion-string input?

Solve it now

Free. No sign-up. Python and JavaScript run instantly in your browser.

Output

Press Run or Cmd+Enter to execute

FAQ

Why use '#' as separator in the frequency key?

Consider counts [1, 10, ...] and [11, 0, ...] — joined without a separator both produce '110...'. The separator makes each position unambiguous.

Which approach is better for large strings?

The frequency-count approach is O(k) per string versus O(k log k) for sorting. For strings with thousands of characters the difference is measurable.

Is the output order deterministic?

Not guaranteed by the problem spec — any ordering of groups, and any ordering within groups, is valid.

Practice these live with InterviewChamp.AI

Drill Group Anagrams and other HP interview questions under real-loop conditions with instant feedback on your reasoning, complexity claims, and code.

Practice these live with InterviewChamp.AI →