Skip to main content

127. Word Ladder

hardAsked at Hugging Face

Find the shortest word transformation sequence from begin to end using a dictionary. Hugging Face uses this BFS shortest-path problem to probe graph construction from implicit edges — the same skill needed to build token neighborhood graphs for nearest-neighbor search in embedding spaces.

By Alex Chen, Founder, InterviewChamp.AI · Last verified

Source citations

Public interview reports confirming this problem appears in Hugging Face loops.

  • Glassdoor (2025-12)Cited in Hugging Face SWE onsite reports as a BFS graph problem with NLP framing.
  • Blind (2025-09)Hugging Face interview threads note Word Ladder as an occasionally-asked hard problem for roles touching text search or embedding systems.

Problem

A transformation sequence from word beginWord to word endWord using a dictionary wordList is a sequence of words beginWord -> s1 -> s2 -> ... -> sk such that: every adjacent pair of words differs by a single letter, every si for 1 <= i <= k is in wordList, and sk == endWord. Given two words, beginWord and endWord, and a dictionary wordList, return the number of words in the shortest transformation sequence from beginWord to endWord, or 0 if no such sequence exists.

Constraints

  • 1 <= beginWord.length <= 10
  • endWord.length == beginWord.length
  • 1 <= wordList.length <= 5000
  • wordList[i].length == beginWord.length
  • beginWord, endWord, and wordList[i] consist of lowercase English letters.
  • beginWord != endWord
  • All the words in wordList are unique.

Examples

Example 1

Input
beginWord = "hit", endWord = "cog", wordList = ["hot","dot","dog","lot","log","cog"]
Output
5

Explanation: hit -> hot -> dot -> dog -> cog (5 words, 4 transformations).

Example 2

Input
beginWord = "hit", endWord = "cog", wordList = ["hot","dot","dog","lot","log"]
Output
0

Explanation: endWord "cog" is not in wordList — no valid sequence.

Approaches

1. BFS with wildcard pattern preprocessing

Pre-build a map from wildcard patterns (e.g., '*ot') to words matching that pattern. BFS from beginWord; for each word, generate all wildcard patterns, look up neighbors, and enqueue unvisited ones.

Time
O(M² * N) where M=word length, N=wordList size
Space
O(M² * N)
function ladderLength(beginWord, endWord, wordList) {
  const wordSet = new Set(wordList);
  if (!wordSet.has(endWord)) return 0;
  // Build pattern map
  const patternMap = new Map();
  for (const word of [beginWord, ...wordList]) {
    for (let i = 0; i < word.length; i++) {
      const pattern = word.slice(0, i) + '*' + word.slice(i + 1);
      if (!patternMap.has(pattern)) patternMap.set(pattern, []);
      patternMap.get(pattern).push(word);
    }
  }
  const visited = new Set([beginWord]);
  const queue = [[beginWord, 1]];
  while (queue.length) {
    const [word, level] = queue.shift();
    for (let i = 0; i < word.length; i++) {
      const pattern = word.slice(0, i) + '*' + word.slice(i + 1);
      for (const neighbor of (patternMap.get(pattern) || [])) {
        if (neighbor === endWord) return level + 1;
        if (!visited.has(neighbor)) {
          visited.add(neighbor);
          queue.push([neighbor, level + 1]);
        }
      }
    }
  }
  return 0;
}

Tradeoff: O(M² * N) time for preprocessing + BFS. The pattern map avoids O(N) per-neighbor comparison. This is the standard optimized solution.

2. Bidirectional BFS

Run BFS simultaneously from beginWord and endWord, expanding the smaller frontier at each step. Stop when the frontiers meet.

Time
O(M² * N) but typically faster in practice
Space
O(M² * N)
function ladderLength(beginWord, endWord, wordList) {
  const wordSet = new Set(wordList);
  if (!wordSet.has(endWord)) return 0;
  let frontBegin = new Set([beginWord]);
  let frontEnd = new Set([endWord]);
  const visited = new Set([beginWord, endWord]);
  let steps = 1;
  while (frontBegin.size && frontEnd.size) {
    if (frontBegin.size > frontEnd.size) [frontBegin, frontEnd] = [frontEnd, frontBegin];
    const nextFront = new Set();
    for (const word of frontBegin) {
      for (let i = 0; i < word.length; i++) {
        for (let c = 97; c <= 122; c++) {
          const newWord = word.slice(0, i) + String.fromCharCode(c) + word.slice(i + 1);
          if (frontEnd.has(newWord)) return steps + 1;
          if (wordSet.has(newWord) && !visited.has(newWord)) {
            visited.add(newWord);
            nextFront.add(newWord);
          }
        }
      }
    }
    frontBegin = nextFront;
    steps++;
  }
  return 0;
}

Tradeoff: Bidirectional BFS reduces the search space from O(b^d) to O(b^(d/2)) where b is branching factor and d is depth. Significantly faster in practice for long transformation chains.

Hugging Face-specific tips

Hugging Face will ask: 'How would you build the graph efficiently without O(N²) edge computation?' The wildcard pattern map is the key answer. Then connect to NLP: 'Building a token neighborhood graph for nearest-neighbor search in embedding space uses a similar approach — precompute locality-sensitive hash buckets to avoid computing all pairwise distances.' Also explain bidirectional BFS clearly — it's a major optimization that demonstrates algorithmic depth.

Common mistakes

  • Checking each word against all other words to find neighbors — O(N² * M) and too slow.
  • Not checking if endWord is in wordList before starting BFS — return 0 early.
  • Marking a word as visited when dequeuing instead of when enqueuing — causes duplicate processing.
  • Forgetting to include beginWord in the pattern map — it can be a neighbor of wordList words.

Follow-up questions

An interviewer at Hugging Face may pivot to one of these next:

  • Word Ladder II (LC 126) — return all shortest transformation sequences; use BFS + backtracking.
  • How would you find the edit distance between two words? (Levenshtein distance — dynamic programming)
  • How does this relate to nearest-neighbor search in token embedding spaces?

Solve it now

Free. No sign-up. Python and JavaScript run instantly in your browser.

Output

Press Run or Cmd+Enter to execute

FAQ

Why BFS and not DFS for this problem?

BFS guarantees the shortest path in an unweighted graph. DFS finds A path but not necessarily the shortest one.

What does the wildcard pattern map accomplish?

It groups words by their one-substitution patterns, allowing O(M) neighbor lookup instead of O(N * M) per-word comparison.

Why does bidirectional BFS improve performance?

Unidirectional BFS explores O(b^d) nodes. Bidirectional explores O(b^(d/2)) from each end. For large d this is a dramatic reduction.

Practice these live with InterviewChamp.AI

Drill Word Ladder and other Hugging Face interview questions under real-loop conditions with instant feedback on your reasoning, complexity claims, and code.

Practice these live with InterviewChamp.AI →