126. Word Ladder II

Q: Why delete words layer-by-layer instead of per-word?

Because two paths in the same BFS layer might both need to use the same intermediate word. Deleting it after the first path uses it would block the second. Layer-by-layer deletion preserves all shortest-path options.

Q: Why use a parent map instead of storing paths directly?

Because the parent map is O(N) memory (one entry per word) while the path set can be exponentially larger. DFS reconstruction only materializes paths at the end — far cheaper than maintaining them during BFS.

Q: Why is Word Ladder II so much harder than Word Ladder I?

Word Ladder I only needs a distance, so plain BFS suffices. Returning ALL shortest sequences forces you to keep enough structure to rebuild every optimal path without storing exponentially many partial paths in the queue — the BFS-then-DFS split exists precisely to dodge that blow-up.

hardAsked at LinkedIn

Given begin and end words and a dictionary, return ALL shortest transformation sequences. LinkedIn asks this for the hardest BFS slot — the trick is to do BFS for distances first, then DFS to reconstruct paths only following shorter-distance neighbors.

By Sam K., Founder, InterviewChamp.AI · Last verified 2026-05-19

Source citations

Public interview reports confirming this problem appears in LinkedIn loops.

Glassdoor (2026-Q1)— LinkedIn staff IC onsite reports list Word Ladder II as the hardest graph problem in the loop.
Blind (2025-10)— LinkedIn writeups describe the 'BFS for layers then DFS for paths' separation as the explicit signal.

Problem

A transformation sequence from word beginWord to word endWord using a dictionary wordList is a sequence of words beginWord -> s1 -> s2 -> ... -> sk such that: every adjacent pair of words differs by a single letter, every s_i for 1 <= i <= k is in wordList. Note that beginWord does not need to be in wordList. sk == endWord. Given two words, beginWord and endWord, and a dictionary wordList, return all the shortest transformation sequences from beginWord to endWord, or an empty list if no such sequence exists. Each sequence should be returned as a list of the words [beginWord, s1, s2, ..., sk].

Constraints

1 <= beginWord.length <= 5
endWord.length == beginWord.length
1 <= wordList.length <= 500
wordList[i].length == beginWord.length
beginWord, endWord, and wordList[i] consist of lowercase English letters.
beginWord != endWord
All the words in wordList are unique.
The sum of all shortest transformation sequences does not exceed 10^5.

Examples

Example 1

Input

beginWord = "hit", endWord = "cog", wordList = ["hot","dot","dog","lot","log","cog"]

Output

[["hit","hot","dot","dog","cog"],["hit","hot","lot","log","cog"]]

Explanation: Both sequences are 5 words long and both must be returned — the two branches diverge at 'hot' and rejoin at 'cog', which is why per-word visited-marking breaks this problem.

Example 2

Input

beginWord = "hit", endWord = "cog", wordList = ["hot","dot","dog","lot","log"]

Output

[]

Explanation: endWord 'cog' is not in the word list, so no valid sequence can terminate — return an empty list rather than the near-miss paths.

Approaches

1. Naive BFS retaining full path per state

BFS where each queue entry is the full path. When we reach endWord, save the path. Continue until the queue exhausts the current shortest layer.

Time: Exponential paths in worst case
Space: Exponential

function findLaddersNaive(beginWord, endWord, wordList) {
  const dict = new Set(wordList);
  if (!dict.has(endWord)) return [];
  let queue = [[beginWord]], found = [], minLen = Infinity;
  while (queue.length) {
    if (queue[0].length > minLen) break;
    const next = [];
    const seenThisLayer = new Set();
    for (const path of queue) {
      const w = path[path.length - 1];
      if (w === endWord) { found.push(path); minLen = path.length; continue; }
      for (let i = 0; i < w.length; i++) {
        for (let c = 97; c < 123; c++) {
          const nw = w.slice(0, i) + String.fromCharCode(c) + w.slice(i + 1);
          if (dict.has(nw) && !path.includes(nw)) {
            next.push([...path, nw]);
            seenThisLayer.add(nw);
          }
        }
      }
    }
    for (const w of seenThisLayer) dict.delete(w);
    queue = next;
  }
  return found;
}

Tradeoff: Correct but storing full paths in the queue blows memory. Many invocations TLE/MLE in LC. Use only as the conceptual baseline.

2. BFS for layer distances + DFS to reconstruct (optimal)

Phase 1: BFS records each word's distance from beginWord, deleting visited words layer by layer to enforce shortest-path-only. Phase 2: DFS from beginWord, advancing only to neighbors whose distance is exactly current + 1.

Time: O(N * L^2 + P * L) where N is dict size, L word length, P number of paths
Space: O(N + P)

function findLadders(beginWord, endWord, wordList) {
  const dict = new Set(wordList);
  if (!dict.has(endWord)) return [];
  const L = beginWord.length;
  const parents = new Map();
  let current = new Set([beginWord]);
  let found = false;
  while (current.size && !found) {
    const next = new Set();
    for (const w of current) dict.delete(w);
    for (const w of current) {
      for (let i = 0; i < L; i++) {
        for (let c = 97; c < 123; c++) {
          const nw = w.slice(0, i) + String.fromCharCode(c) + w.slice(i + 1);
          if (dict.has(nw)) {
            if (nw === endWord) found = true;
            if (!parents.has(nw)) parents.set(nw, []);
            parents.get(nw).push(w);
            next.add(nw);
          }
        }
      }
    }
    current = next;
  }
  const out = [];
  function dfs(w, path) {
    if (w === beginWord) { out.push([beginWord, ...path].slice().reverse().reverse()); return; }
    const ps = parents.get(w) || [];
    for (const p of ps) dfs(p, [w, ...path]);
  }
  if (found) dfs(endWord, []);
  return out.map(p => p[0] === beginWord ? p : [beginWord, ...p]);
}

Tradeoff: Two phases keep memory bounded by graph size (not path count) during BFS, then expand only the relevant paths during DFS. Deleting visited words layer-by-layer is critical — it enforces 'only follow shortest paths'.

LinkedIn-specific tips

LinkedIn interviewers specifically grade whether you separate path-discovery from path-enumeration. Articulate: 'BFS finds the shortest distance to each word; DFS reconstructs all shortest paths using a parent map.' Without this split, naive BFS with full paths in the queue blows memory. They'll also probe: 'How do you ensure you only follow SHORTEST paths during DFS?' (Answer: each parent has distance current - 1.)

Common mistakes

Keeping full paths in the BFS queue — memory explodes.
Forgetting to delete visited words layer-by-layer (not per-word) — incorrectly excludes neighbors that another path in the same layer also needs.
Not handling the case where endWord is unreachable — should return [].

Follow-up questions

An interviewer at LinkedIn may pivot to one of these next:

Word Ladder (LC 127) — return only the length.
Bidirectional BFS for further speedup.
What if you needed paths of EXACTLY length k? (Generalize parent map with per-level entries.)

Solve it now

Free. No sign-up. Python and JavaScript run instantly in your browser.

Output

Press Run or Cmd+Enter to execute

FAQ

Why delete words layer-by-layer instead of per-word?

Because two paths in the same BFS layer might both need to use the same intermediate word. Deleting it after the first path uses it would block the second. Layer-by-layer deletion preserves all shortest-path options.

Why use a parent map instead of storing paths directly?

Because the parent map is O(N) memory (one entry per word) while the path set can be exponentially larger. DFS reconstruction only materializes paths at the end — far cheaper than maintaining them during BFS.

Why is Word Ladder II so much harder than Word Ladder I?

Word Ladder I only needs a distance, so plain BFS suffices. Returning ALL shortest sequences forces you to keep enough structure to rebuild every optimal path without storing exponentially many partial paths in the queue — the BFS-then-DFS split exists precisely to dodge that blow-up.

Free learning resources

Curated free links for this problem.

126. Word Ladder II

Source citations

Problem

Constraints

Examples

Approaches

1. Naive BFS retaining full path per state

2. BFS for layer distances + DFS to reconstruct (optimal)

LinkedIn-specific tips

Common mistakes

Follow-up questions

Solve it now

Editor Settings

FAQ

Free learning resources

More LinkedIn coding interview questions