127. Word Ladder
hardAsked at GleanGlean uses Word Ladder to test BFS on an implicit graph — a critical skill when traversing semantic neighborhoods in a word-embedding space or computing edit-distance hops between query terms for query expansion.
By Alex Chen, Founder, InterviewChamp.AI · Last verified
Source citations
Public interview reports confirming this problem appears in Glean loops.
- Glassdoor (2025-12)— Glean candidates report Word Ladder appearing in hard rounds where interviewers want to test BFS on non-trivial graph construction.
- Blind (2025-08)— Listed in Glean interview prep threads as a hard problem that tests both graph-building intuition and BFS efficiency.
Problem
A transformation sequence from word beginWord to word endWord using a dictionary wordList is a sequence beginWord → s1 → s2 → ... → sk such that every adjacent pair of words differs by exactly one letter, every si for 1 <= i <= k is in wordList, and sk == endWord. Given two words beginWord and endWord and a dictionary wordList, return the number of words in the shortest transformation sequence from beginWord to endWord, or 0 if no such sequence exists.
Constraints
1 <= beginWord.length <= 10endWord.length == beginWord.length1 <= wordList.length <= 5000wordList[i].length == beginWord.lengthbeginWord, endWord, and wordList[i] consist of lowercase English letters.beginWord != endWordAll the words in wordList are unique.
Examples
Example 1
beginWord = "hit", endWord = "cog", wordList = ["hot","dot","dog","lot","log","cog"]5Explanation: hit → hot → dot → dog → cog (5 words in sequence).
Example 2
beginWord = "hit", endWord = "cog", wordList = ["hot","dot","dog","lot","log"]0Explanation: 'cog' is not in wordList — no valid transformation sequence exists.
Approaches
1. BFS on one-letter-apart graph
Model each word as a graph node. Two words are connected if they differ by exactly one letter. BFS from beginWord finds the shortest path to endWord. Use wildcard patterns (e.g., *ot) to efficiently group words by each possible one-letter gap.
- Time
- O(M² * N) where M = word length, N = wordList size
- Space
- O(M² * N)
function ladderLength(beginWord, endWord, wordList) {
const wordSet = new Set(wordList);
if (!wordSet.has(endWord)) return 0;
// Build adjacency via wildcard patterns
const patternMap = new Map(); // '*ot' -> ['hot','dot','lot']
for (const word of [beginWord, ...wordList]) {
for (let i = 0; i < word.length; i++) {
const pattern = word.slice(0, i) + '*' + word.slice(i + 1);
if (!patternMap.has(pattern)) patternMap.set(pattern, []);
patternMap.get(pattern).push(word);
}
}
const visited = new Set([beginWord]);
const queue = [[beginWord, 1]];
while (queue.length > 0) {
const [word, level] = queue.shift();
for (let i = 0; i < word.length; i++) {
const pattern = word.slice(0, i) + '*' + word.slice(i + 1);
for (const neighbor of (patternMap.get(pattern) || [])) {
if (neighbor === endWord) return level + 1;
if (!visited.has(neighbor)) {
visited.add(neighbor);
queue.push([neighbor, level + 1]);
}
}
}
}
return 0;
}Tradeoff: O(M² * N) time. The wildcard pre-grouping avoids O(M * N) per BFS step by looking up neighbors in O(M) instead of comparing against all N words. This is the production-grade approach.
2. Bidirectional BFS
Run BFS simultaneously from both beginWord and endWord. Expand the smaller frontier at each step. When the two frontiers meet, sum the levels. This reduces the search space from O(b^d) to O(b^(d/2)).
- Time
- O(M² * N) — same asymptotic, smaller constant
- Space
- O(M² * N)
function ladderLength(beginWord, endWord, wordList) {
const wordSet = new Set(wordList);
if (!wordSet.has(endWord)) return 0;
function getNeighbors(word) {
const nbrs = [];
for (let i = 0; i < word.length; i++) {
for (let c = 97; c <= 122; c++) {
const next = word.slice(0, i) + String.fromCharCode(c) + word.slice(i + 1);
if (wordSet.has(next)) nbrs.push(next);
}
}
return nbrs;
}
let frontA = new Map([[beginWord, 1]]);
let frontB = new Map([[endWord, 1]]);
while (frontA.size > 0 && frontB.size > 0) {
if (frontA.size > frontB.size) [frontA, frontB] = [frontB, frontA];
const nextFront = new Map();
for (const [word, level] of frontA) {
for (const nbr of getNeighbors(word)) {
if (frontB.has(nbr)) return level + frontB.get(nbr);
if (!frontA.has(nbr)) nextFront.set(nbr, level + 1);
}
}
frontA = nextFront;
}
return 0;
}Tradeoff: Same asymptotic complexity but often much faster in practice on wide word graphs. Worth mentioning to Glean interviewers as it demonstrates understanding of search-space reduction.
Glean-specific tips
Frame the graph explicitly before coding: 'Each word is a node. Two nodes share an edge if the words differ by exactly one letter. I need the shortest path from beginWord to endWord — BFS.' Glean values graph abstraction. Mention the wildcard trick: 'Pre-grouping by pattern reduces neighbor lookup from O(N*M) to O(M) per node.' If you have time, mention bidirectional BFS and explain why it shrinks the frontier — Glean interviewers appreciate search-optimization awareness.
Common mistakes
- Not checking if endWord is in wordList — return 0 immediately if it is not.
- Forgetting to mark words as visited before enqueuing — causes duplicate visits and TLE.
- Comparing each word against all N dictionary words to find neighbors (O(M * N) per step) instead of using wildcard patterns.
- Counting the sequence length incorrectly — the problem asks for the number of words in the sequence (including beginWord and endWord), so start the BFS level at 1.
Follow-up questions
An interviewer at Glean may pivot to one of these next:
- Word Ladder II (LC 126) — return all shortest transformation sequences, not just the length.
- How would you handle a weighted word graph where some transformations are cheaper than others?
- Design Query Expansion: Given a query word, return all words reachable within K one-letter-change hops.
Solve it now
Free. No sign-up. Python and JavaScript run instantly in your browser.
FAQ
Why use BFS and not DFS for shortest path?
BFS explores level by level — the first time it reaches the target, the path length is guaranteed to be minimal. DFS finds any path, not necessarily the shortest.
How does the wildcard pre-grouping help?
Instead of comparing each word against all dictionary words (O(N*M) per node), you build a map from wildcard patterns to word lists once. Finding all one-letter neighbors of a word then costs O(M) pattern lookups — each O(1) in the map.
Practice these live with InterviewChamp.AI
Drill Word Ladder and other Glean interview questions under real-loop conditions with instant feedback on your reasoning, complexity claims, and code.
Practice these live with InterviewChamp.AI →