Skip to main content

126. Word Ladder II

hardAsked at LinkedIn

Given begin and end words and a dictionary, return ALL shortest transformation sequences. LinkedIn asks this for the hardest BFS slot — the trick is to do BFS for distances first, then DFS to reconstruct paths only following shorter-distance neighbors.

By Alex Chen, Founder, InterviewChamp.AI · Last verified

Source citations

Public interview reports confirming this problem appears in LinkedIn loops.

  • Glassdoor (2026-Q1)LinkedIn staff IC onsite reports list Word Ladder II as the hardest graph problem in the loop.
  • Blind (2025-10)LinkedIn writeups describe the 'BFS for layers then DFS for paths' separation as the explicit signal.

Problem

A transformation sequence from word beginWord to word endWord using a dictionary wordList is a sequence of words beginWord -> s1 -> s2 -> ... -> sk such that: every adjacent pair of words differs by a single letter, every s_i for 1 <= i <= k is in wordList. Note that beginWord does not need to be in wordList. sk == endWord. Given two words, beginWord and endWord, and a dictionary wordList, return all the shortest transformation sequences from beginWord to endWord, or an empty list if no such sequence exists. Each sequence should be returned as a list of the words [beginWord, s1, s2, ..., sk].

Constraints

  • 1 <= beginWord.length <= 5
  • endWord.length == beginWord.length
  • 1 <= wordList.length <= 500
  • wordList[i].length == beginWord.length
  • beginWord, endWord, and wordList[i] consist of lowercase English letters.
  • beginWord != endWord
  • All the words in wordList are unique.
  • The sum of all shortest transformation sequences does not exceed 10^5.

Examples

Example 1

Input
beginWord = "hit", endWord = "cog", wordList = ["hot","dot","dog","lot","log","cog"]
Output
[["hit","hot","dot","dog","cog"],["hit","hot","lot","log","cog"]]

Example 2

Input
beginWord = "hit", endWord = "cog", wordList = ["hot","dot","dog","lot","log"]
Output
[]

Approaches

1. Naive BFS retaining full path per state

BFS where each queue entry is the full path. When we reach endWord, save the path. Continue until the queue exhausts the current shortest layer.

Time
Exponential paths in worst case
Space
Exponential
function findLaddersNaive(beginWord, endWord, wordList) {
  const dict = new Set(wordList);
  if (!dict.has(endWord)) return [];
  let queue = [[beginWord]], found = [], minLen = Infinity;
  while (queue.length) {
    if (queue[0].length > minLen) break;
    const next = [];
    const seenThisLayer = new Set();
    for (const path of queue) {
      const w = path[path.length - 1];
      if (w === endWord) { found.push(path); minLen = path.length; continue; }
      for (let i = 0; i < w.length; i++) {
        for (let c = 97; c < 123; c++) {
          const nw = w.slice(0, i) + String.fromCharCode(c) + w.slice(i + 1);
          if (dict.has(nw) && !path.includes(nw)) {
            next.push([...path, nw]);
            seenThisLayer.add(nw);
          }
        }
      }
    }
    for (const w of seenThisLayer) dict.delete(w);
    queue = next;
  }
  return found;
}

Tradeoff: Correct but storing full paths in the queue blows memory. Many invocations TLE/MLE in LC. Use only as the conceptual baseline.

2. BFS for layer distances + DFS to reconstruct (optimal)

Phase 1: BFS records each word's distance from beginWord, deleting visited words layer by layer to enforce shortest-path-only. Phase 2: DFS from beginWord, advancing only to neighbors whose distance is exactly current + 1.

Time
O(N * L^2 + P * L) where N is dict size, L word length, P number of paths
Space
O(N + P)
function findLadders(beginWord, endWord, wordList) {
  const dict = new Set(wordList);
  if (!dict.has(endWord)) return [];
  const L = beginWord.length;
  const parents = new Map();
  let current = new Set([beginWord]);
  let found = false;
  while (current.size && !found) {
    const next = new Set();
    for (const w of current) dict.delete(w);
    for (const w of current) {
      for (let i = 0; i < L; i++) {
        for (let c = 97; c < 123; c++) {
          const nw = w.slice(0, i) + String.fromCharCode(c) + w.slice(i + 1);
          if (dict.has(nw)) {
            if (nw === endWord) found = true;
            if (!parents.has(nw)) parents.set(nw, []);
            parents.get(nw).push(w);
            next.add(nw);
          }
        }
      }
    }
    current = next;
  }
  const out = [];
  function dfs(w, path) {
    if (w === beginWord) { out.push([beginWord, ...path].slice().reverse().reverse()); return; }
    const ps = parents.get(w) || [];
    for (const p of ps) dfs(p, [w, ...path]);
  }
  if (found) dfs(endWord, []);
  return out.map(p => p[0] === beginWord ? p : [beginWord, ...p]);
}

Tradeoff: Two phases keep memory bounded by graph size (not path count) during BFS, then expand only the relevant paths during DFS. Deleting visited words layer-by-layer is critical — it enforces 'only follow shortest paths'.

LinkedIn-specific tips

LinkedIn interviewers specifically grade whether you separate path-discovery from path-enumeration. Articulate: 'BFS finds the shortest distance to each word; DFS reconstructs all shortest paths using a parent map.' Without this split, naive BFS with full paths in the queue blows memory. They'll also probe: 'How do you ensure you only follow SHORTEST paths during DFS?' (Answer: each parent has distance current - 1.)

Common mistakes

  • Keeping full paths in the BFS queue — memory explodes.
  • Forgetting to delete visited words layer-by-layer (not per-word) — incorrectly excludes neighbors that another path in the same layer also needs.
  • Not handling the case where endWord is unreachable — should return [].

Follow-up questions

An interviewer at LinkedIn may pivot to one of these next:

  • Word Ladder (LC 127) — return only the length.
  • Bidirectional BFS for further speedup.
  • What if you needed paths of EXACTLY length k? (Generalize parent map with per-level entries.)

Solve it now

Free. No sign-up. Python and JavaScript run instantly in your browser.

Output

Press Run or Cmd+Enter to execute

FAQ

Why delete words layer-by-layer instead of per-word?

Because two paths in the same BFS layer might both need to use the same intermediate word. Deleting it after the first path uses it would block the second. Layer-by-layer deletion preserves all shortest-path options.

Why use a parent map instead of storing paths directly?

Because the parent map is O(N) memory (one entry per word) while the path set can be exponentially larger. DFS reconstruction only materializes paths at the end — far cheaper than maintaining them during BFS.

Free learning resources

Curated free links for this problem.

Practice these live with InterviewChamp.AI

Drill Word Ladder II and other LinkedIn interview questions under real-loop conditions with instant feedback on your reasoning, complexity claims, and code.

Practice these live with InterviewChamp.AI →