127. Word Ladder
hardAsked at Hugging FaceFind the shortest word transformation sequence from begin to end using a dictionary. Hugging Face uses this BFS shortest-path problem to probe graph construction from implicit edges — the same skill needed to build token neighborhood graphs for nearest-neighbor search in embedding spaces.
By Alex Chen, Founder, InterviewChamp.AI · Last verified
Source citations
Public interview reports confirming this problem appears in Hugging Face loops.
- Glassdoor (2025-12)— Cited in Hugging Face SWE onsite reports as a BFS graph problem with NLP framing.
- Blind (2025-09)— Hugging Face interview threads note Word Ladder as an occasionally-asked hard problem for roles touching text search or embedding systems.
Problem
A transformation sequence from word beginWord to word endWord using a dictionary wordList is a sequence of words beginWord -> s1 -> s2 -> ... -> sk such that: every adjacent pair of words differs by a single letter, every si for 1 <= i <= k is in wordList, and sk == endWord. Given two words, beginWord and endWord, and a dictionary wordList, return the number of words in the shortest transformation sequence from beginWord to endWord, or 0 if no such sequence exists.
Constraints
1 <= beginWord.length <= 10endWord.length == beginWord.length1 <= wordList.length <= 5000wordList[i].length == beginWord.lengthbeginWord, endWord, and wordList[i] consist of lowercase English letters.beginWord != endWordAll the words in wordList are unique.
Examples
Example 1
beginWord = "hit", endWord = "cog", wordList = ["hot","dot","dog","lot","log","cog"]5Explanation: hit -> hot -> dot -> dog -> cog (5 words, 4 transformations).
Example 2
beginWord = "hit", endWord = "cog", wordList = ["hot","dot","dog","lot","log"]0Explanation: endWord "cog" is not in wordList — no valid sequence.
Approaches
1. BFS with wildcard pattern preprocessing
Pre-build a map from wildcard patterns (e.g., '*ot') to words matching that pattern. BFS from beginWord; for each word, generate all wildcard patterns, look up neighbors, and enqueue unvisited ones.
- Time
- O(M² * N) where M=word length, N=wordList size
- Space
- O(M² * N)
function ladderLength(beginWord, endWord, wordList) {
const wordSet = new Set(wordList);
if (!wordSet.has(endWord)) return 0;
// Build pattern map
const patternMap = new Map();
for (const word of [beginWord, ...wordList]) {
for (let i = 0; i < word.length; i++) {
const pattern = word.slice(0, i) + '*' + word.slice(i + 1);
if (!patternMap.has(pattern)) patternMap.set(pattern, []);
patternMap.get(pattern).push(word);
}
}
const visited = new Set([beginWord]);
const queue = [[beginWord, 1]];
while (queue.length) {
const [word, level] = queue.shift();
for (let i = 0; i < word.length; i++) {
const pattern = word.slice(0, i) + '*' + word.slice(i + 1);
for (const neighbor of (patternMap.get(pattern) || [])) {
if (neighbor === endWord) return level + 1;
if (!visited.has(neighbor)) {
visited.add(neighbor);
queue.push([neighbor, level + 1]);
}
}
}
}
return 0;
}Tradeoff: O(M² * N) time for preprocessing + BFS. The pattern map avoids O(N) per-neighbor comparison. This is the standard optimized solution.
2. Bidirectional BFS
Run BFS simultaneously from beginWord and endWord, expanding the smaller frontier at each step. Stop when the frontiers meet.
- Time
- O(M² * N) but typically faster in practice
- Space
- O(M² * N)
function ladderLength(beginWord, endWord, wordList) {
const wordSet = new Set(wordList);
if (!wordSet.has(endWord)) return 0;
let frontBegin = new Set([beginWord]);
let frontEnd = new Set([endWord]);
const visited = new Set([beginWord, endWord]);
let steps = 1;
while (frontBegin.size && frontEnd.size) {
if (frontBegin.size > frontEnd.size) [frontBegin, frontEnd] = [frontEnd, frontBegin];
const nextFront = new Set();
for (const word of frontBegin) {
for (let i = 0; i < word.length; i++) {
for (let c = 97; c <= 122; c++) {
const newWord = word.slice(0, i) + String.fromCharCode(c) + word.slice(i + 1);
if (frontEnd.has(newWord)) return steps + 1;
if (wordSet.has(newWord) && !visited.has(newWord)) {
visited.add(newWord);
nextFront.add(newWord);
}
}
}
}
frontBegin = nextFront;
steps++;
}
return 0;
}Tradeoff: Bidirectional BFS reduces the search space from O(b^d) to O(b^(d/2)) where b is branching factor and d is depth. Significantly faster in practice for long transformation chains.
Hugging Face-specific tips
Hugging Face will ask: 'How would you build the graph efficiently without O(N²) edge computation?' The wildcard pattern map is the key answer. Then connect to NLP: 'Building a token neighborhood graph for nearest-neighbor search in embedding space uses a similar approach — precompute locality-sensitive hash buckets to avoid computing all pairwise distances.' Also explain bidirectional BFS clearly — it's a major optimization that demonstrates algorithmic depth.
Common mistakes
- Checking each word against all other words to find neighbors — O(N² * M) and too slow.
- Not checking if endWord is in wordList before starting BFS — return 0 early.
- Marking a word as visited when dequeuing instead of when enqueuing — causes duplicate processing.
- Forgetting to include beginWord in the pattern map — it can be a neighbor of wordList words.
Follow-up questions
An interviewer at Hugging Face may pivot to one of these next:
- Word Ladder II (LC 126) — return all shortest transformation sequences; use BFS + backtracking.
- How would you find the edit distance between two words? (Levenshtein distance — dynamic programming)
- How does this relate to nearest-neighbor search in token embedding spaces?
Solve it now
Free. No sign-up. Python and JavaScript run instantly in your browser.
FAQ
Why BFS and not DFS for this problem?
BFS guarantees the shortest path in an unweighted graph. DFS finds A path but not necessarily the shortest one.
What does the wildcard pattern map accomplish?
It groups words by their one-substitution patterns, allowing O(M) neighbor lookup instead of O(N * M) per-word comparison.
Why does bidirectional BFS improve performance?
Unidirectional BFS explores O(b^d) nodes. Bidirectional explores O(b^(d/2)) from each end. For large d this is a dramatic reduction.
Practice these live with InterviewChamp.AI
Drill Word Ladder and other Hugging Face interview questions under real-loop conditions with instant feedback on your reasoning, complexity claims, and code.
Practice these live with InterviewChamp.AI →