Skip to main content

139. Word Break

mediumAsked at Elastic

Determine if a string can be segmented into words from a dictionary. Elastic interviews this because token segmentation is at the core of Elasticsearch's text analysis pipeline — understanding which substrings can be valid tokens is exactly what analyzers compute before indexing.

By Alex Chen, Founder, InterviewChamp.AI · Last verified

Source citations

Public interview reports confirming this problem appears in Elastic loops.

  • Glassdoor (2026-Q1)Elastic SWE candidates report string DP problems appearing in mid-level rounds with explicit text-analysis framing.
  • Blind (2025-11)Word Break listed in Elastic interview threads as a DP problem that directly maps to their search tokenization domain.

Problem

Given a string s and a dictionary of strings wordDict, return true if s can be segmented into a space-separated sequence of one or more dictionary words. Note that the same word in the dictionary may be reused multiple times in the segmentation.

Constraints

  • 1 <= s.length <= 300
  • 1 <= wordDict.length <= 1000
  • 1 <= wordDict[i].length <= 20
  • s and wordDict[i] consist of only lowercase English letters.
  • All the words in wordDict are unique.

Examples

Example 1

Input
s = "leetcode", wordDict = ["leet","code"]
Output
true

Explanation: 'leetcode' = 'leet' + 'code'.

Example 2

Input
s = "applepenapple", wordDict = ["apple","pen"]
Output
true

Explanation: 'apple' + 'pen' + 'apple'. 'apple' is reused.

Example 3

Input
s = "catsandog", wordDict = ["cats","dog","sand","and","cat"]
Output
false

Explanation: No segmentation covers the entire string.

Approaches

1. Bottom-up DP with Set lookup

Create a boolean DP array where dp[i] = true means s[0..i-1] can be segmented. For each position i, check every word in the dictionary: if dp[i - word.length] is true and s.substring(i - word.length, i) equals the word, then dp[i] = true.

Time
O(n * m * k) where m=wordDict size, k=max word length
Space
O(n)
function wordBreak(s, wordDict) {
  const wordSet = new Set(wordDict);
  const dp = new Array(s.length + 1).fill(false);
  dp[0] = true; // empty prefix is always valid
  for (let i = 1; i <= s.length; i++) {
    for (const word of wordSet) {
      const start = i - word.length;
      if (start >= 0 && dp[start] && s.substring(start, i) === word) {
        dp[i] = true;
        break;
      }
    }
  }
  return dp[s.length];
}

Tradeoff: O(n * m * k) time. Straightforward DP. Converting wordDict to a Set first allows O(1) membership tests if you iterate substrings instead of words.

2. Bottom-up DP with substring lookup (Set variant)

Iterate all positions i and j < i. If dp[j] is true and s.substring(j, i) is in the word set, set dp[i] = true.

Time
O(n² * k) for substring creation
Space
O(n + total word chars)
function wordBreak(s, wordDict) {
  const wordSet = new Set(wordDict);
  const n = s.length;
  const dp = new Array(n + 1).fill(false);
  dp[0] = true;
  for (let i = 1; i <= n; i++) {
    for (let j = 0; j < i; j++) {
      if (dp[j] && wordSet.has(s.substring(j, i))) {
        dp[i] = true;
        break;
      }
    }
  }
  return dp[n];
}

Tradeoff: O(n²) substring checks with O(k) each for comparison and hashing. Often faster in practice than iterating all words when n is small. Standard interview answer.

Elastic-specific tips

Start with the DP recurrence stated clearly: 'dp[i] = true if there exists a split point j < i such that dp[j] is true and s[j..i-1] is in the dictionary.' Elastic interviewers value precision in recurrence definitions. Then connect to tokenization: 'Elasticsearch's standard analyzer solves a similar problem — it finds valid token boundaries in a stream of characters using grammar rules rather than a dictionary.'

Common mistakes

  • Initializing dp[0] = false — the empty prefix is always a valid segmentation; dp[0] must be true.
  • Substring indexing off by one — s.substring(j, i) gives s[j..i-1], which is the segment from j to i.
  • Not converting wordDict to a Set — O(m) membership test per inner loop iteration instead of O(1).
  • Breaking early before updating dp[i] — set dp[i] = true before break, not after.

Follow-up questions

An interviewer at Elastic may pivot to one of these next:

  • Word Break II (LC 140) — return all valid sentence segmentations (backtracking + memoization).
  • How would you handle a very large dictionary (millions of words) — use a Trie for prefix pruning.
  • How does Elasticsearch's compound word token filter split compound words using a dictionary?

Solve it now

Free. No sign-up. Python and JavaScript run instantly in your browser.

Output

Press Run or Cmd+Enter to execute

FAQ

Why is dp[0] true?

It represents the empty string, which trivially satisfies the segmentation condition. Without dp[0] = true, no valid segmentation starting from index 0 would ever be detected.

Can memoized recursion be used instead?

Yes — recursive(s, start) = any(recursive(s, start+len) for word in dict if s[start:start+len] == word). Memoize on start index. Same O(n * m * k) complexity.

Practice these live with InterviewChamp.AI

Drill Word Break and other Elastic interview questions under real-loop conditions with instant feedback on your reasoning, complexity claims, and code.

Practice these live with InterviewChamp.AI →