127. Word Ladder
hardAsked at BroadcomFind the shortest transformation sequence from one word to another, changing one letter at a time. Broadcom asks this because it is shortest-path BFS on an implicit graph — the same technique used to find minimum-hop routing paths and convergence analysis in network topology recalculation after a link failure.
By Alex Chen, Founder, InterviewChamp.AI · Last verified
Source citations
Public interview reports confirming this problem appears in Broadcom loops.
- Glassdoor (2025-11)— Reported in Broadcom SWE senior onsite rounds as a BFS graph-modelling problem.
- Blind (2025-09)— Broadcom threads cite Word Ladder as a hard problem that tests the ability to model implicit graphs and run BFS efficiently.
Problem
A transformation sequence from word beginWord to word endWord using a dictionary wordList is a sequence of words beginWord -> s1 -> s2 -> ... -> sk such that: every adjacent pair of words differs by a single letter, every si for 1 <= i <= k is in wordList. Given two words beginWord, endWord, and a dictionary wordList, return the number of words in the shortest transformation sequence from beginWord to endWord, or 0 if no such sequence exists.
Constraints
1 <= beginWord.length <= 10endWord.length == beginWord.length1 <= wordList.length <= 5000wordList[i].length == beginWord.lengthbeginWord, endWord, and wordList[i] consist of lowercase English letters.beginWord != endWordAll the words in wordList are unique.
Examples
Example 1
beginWord = "hit", endWord = "cog", wordList = ["hot","dot","dog","lot","log","cog"]5Explanation: hit → hot → dot → dog → cog (5 words).
Example 2
beginWord = "hit", endWord = "cog", wordList = ["hot","dot","dog","lot","log"]0Explanation: endWord 'cog' is not in wordList — no transformation possible.
Approaches
1. BFS on implicit graph
Model words as graph nodes; two words are adjacent if they differ by one letter. BFS from beginWord finds the shortest path to endWord. For each word in the queue, try replacing each character with every letter a-z; if the resulting word is in the dictionary, add it to the queue.
- Time
- O(N * L * 26) where N = dictionary size, L = word length
- Space
- O(N * L)
function ladderLength(beginWord, endWord, wordList) {
const wordSet = new Set(wordList);
if (!wordSet.has(endWord)) return 0;
const queue = [[beginWord, 1]];
const visited = new Set([beginWord]);
let head = 0;
while (head < queue.length) {
const [word, steps] = queue[head++];
for (let i = 0; i < word.length; i++) {
for (let c = 0; c < 26; c++) {
const newWord = word.slice(0, i) + String.fromCharCode(97 + c) + word.slice(i + 1);
if (newWord === endWord) return steps + 1;
if (wordSet.has(newWord) && !visited.has(newWord)) {
visited.add(newWord);
queue.push([newWord, steps + 1]);
}
}
}
}
return 0;
}Tradeoff: O(N*L*26) time. BFS guarantees the shortest path. The key optimisation is generating neighbours by mutation rather than comparing all pairs of words, reducing the neighbour-finding step from O(N*L) to O(L*26).
2. Bidirectional BFS
Run BFS simultaneously from beginWord and endWord. At each step, expand the smaller frontier. When the two frontiers intersect, the path length is found. Reduces the search space from O(b^d) to O(b^(d/2)).
- Time
- O(N * L * 26)
- Space
- O(N * L)
function ladderLength(beginWord, endWord, wordList) {
const wordSet = new Set(wordList);
if (!wordSet.has(endWord)) return 0;
let frontA = new Set([beginWord]);
let frontB = new Set([endWord]);
let visited = new Set([beginWord, endWord]);
let steps = 1;
while (frontA.size > 0 && frontB.size > 0) {
if (frontA.size > frontB.size) [frontA, frontB] = [frontB, frontA];
const next = new Set();
for (const word of frontA) {
for (let i = 0; i < word.length; i++) {
for (let c = 0; c < 26; c++) {
const nw = word.slice(0, i) + String.fromCharCode(97 + c) + word.slice(i + 1);
if (frontB.has(nw)) return steps + 1;
if (wordSet.has(nw) && !visited.has(nw)) {
visited.add(nw); next.add(nw);
}
}
}
}
frontA = next;
steps++;
}
return 0;
}Tradeoff: Same asymptotic complexity but significantly faster in practice by cutting the BFS tree depth in half. This is the premium answer at Broadcom that demonstrates advanced BFS optimisation thinking.
Broadcom-specific tips
At Broadcom, model the problem explicitly before coding: 'Words are nodes; two words sharing an edge differ by exactly one character — this is shortest-path BFS on an implicit graph.' Connect to networking: 'Minimum-hop path computation in a network graph after link-state convergence uses the same BFS principle.' Bidirectional BFS is the follow-up answer Broadcom expects from senior candidates. Walk through why expanding the smaller frontier first is optimal.
Common mistakes
- Checking all pairs of words for single-character difference: O(N²*L) — extremely slow for large dictionaries.
- Forgetting to check if endWord is in wordList before starting BFS — return 0 immediately.
- Not marking a word as visited when enqueuing it (only when dequeuing) — causes duplicate enqueues.
- Counting steps instead of words — the problem asks for number of words in the sequence, so start at 1 and add 1 for each step.
Follow-up questions
An interviewer at Broadcom may pivot to one of these next:
- Word Ladder II (LC 126) — return all shortest transformation sequences.
- How would you preprocess the dictionary to speed up neighbour lookup (generic pattern buckets e.g., *ot for hot, dot, lot)?
- How does bidirectional BFS generalise to A* search for weighted graphs?
Solve it now
Free. No sign-up. Python and JavaScript run instantly in your browser.
FAQ
Why generate neighbours by character substitution rather than comparing all word pairs?
Comparing all pairs is O(N²*L). Generating by substitution — trying all 26 chars in each position — is O(L*26) per word, total O(N*L*26). For N=5000, L=10, this is 5000*10*26 = 1.3M operations vs 5000²*10 = 250M comparisons.
Why does bidirectional BFS halve the depth?
If the answer has depth d and branching factor b, standard BFS visits O(b^d) nodes. Bidirectional BFS meets in the middle at depth d/2, visiting O(2*b^(d/2)) — exponentially fewer nodes.
What is the pattern-bucket optimisation for neighbour lookup?
Preprocess each word into L generic patterns (replace position i with '*'). Group words by pattern. Looking up neighbours of a word becomes: generate its L patterns, fetch the groups. O(N*L) preprocessing, O(L) neighbour lookup.
Practice these live with InterviewChamp.AI
Drill Word Ladder and other Broadcom interview questions under real-loop conditions with instant feedback on your reasoning, complexity claims, and code.
Practice these live with InterviewChamp.AI →