139. Word Break
mediumAsked at ElasticDetermine if a string can be segmented into words from a dictionary. Elastic interviews this because token segmentation is at the core of Elasticsearch's text analysis pipeline — understanding which substrings can be valid tokens is exactly what analyzers compute before indexing.
By Alex Chen, Founder, InterviewChamp.AI · Last verified
Source citations
Public interview reports confirming this problem appears in Elastic loops.
- Glassdoor (2026-Q1)— Elastic SWE candidates report string DP problems appearing in mid-level rounds with explicit text-analysis framing.
- Blind (2025-11)— Word Break listed in Elastic interview threads as a DP problem that directly maps to their search tokenization domain.
Problem
Given a string s and a dictionary of strings wordDict, return true if s can be segmented into a space-separated sequence of one or more dictionary words. Note that the same word in the dictionary may be reused multiple times in the segmentation.
Constraints
1 <= s.length <= 3001 <= wordDict.length <= 10001 <= wordDict[i].length <= 20s and wordDict[i] consist of only lowercase English letters.All the words in wordDict are unique.
Examples
Example 1
s = "leetcode", wordDict = ["leet","code"]trueExplanation: 'leetcode' = 'leet' + 'code'.
Example 2
s = "applepenapple", wordDict = ["apple","pen"]trueExplanation: 'apple' + 'pen' + 'apple'. 'apple' is reused.
Example 3
s = "catsandog", wordDict = ["cats","dog","sand","and","cat"]falseExplanation: No segmentation covers the entire string.
Approaches
1. Bottom-up DP with Set lookup
Create a boolean DP array where dp[i] = true means s[0..i-1] can be segmented. For each position i, check every word in the dictionary: if dp[i - word.length] is true and s.substring(i - word.length, i) equals the word, then dp[i] = true.
- Time
- O(n * m * k) where m=wordDict size, k=max word length
- Space
- O(n)
function wordBreak(s, wordDict) {
const wordSet = new Set(wordDict);
const dp = new Array(s.length + 1).fill(false);
dp[0] = true; // empty prefix is always valid
for (let i = 1; i <= s.length; i++) {
for (const word of wordSet) {
const start = i - word.length;
if (start >= 0 && dp[start] && s.substring(start, i) === word) {
dp[i] = true;
break;
}
}
}
return dp[s.length];
}Tradeoff: O(n * m * k) time. Straightforward DP. Converting wordDict to a Set first allows O(1) membership tests if you iterate substrings instead of words.
2. Bottom-up DP with substring lookup (Set variant)
Iterate all positions i and j < i. If dp[j] is true and s.substring(j, i) is in the word set, set dp[i] = true.
- Time
- O(n² * k) for substring creation
- Space
- O(n + total word chars)
function wordBreak(s, wordDict) {
const wordSet = new Set(wordDict);
const n = s.length;
const dp = new Array(n + 1).fill(false);
dp[0] = true;
for (let i = 1; i <= n; i++) {
for (let j = 0; j < i; j++) {
if (dp[j] && wordSet.has(s.substring(j, i))) {
dp[i] = true;
break;
}
}
}
return dp[n];
}Tradeoff: O(n²) substring checks with O(k) each for comparison and hashing. Often faster in practice than iterating all words when n is small. Standard interview answer.
Elastic-specific tips
Start with the DP recurrence stated clearly: 'dp[i] = true if there exists a split point j < i such that dp[j] is true and s[j..i-1] is in the dictionary.' Elastic interviewers value precision in recurrence definitions. Then connect to tokenization: 'Elasticsearch's standard analyzer solves a similar problem — it finds valid token boundaries in a stream of characters using grammar rules rather than a dictionary.'
Common mistakes
- Initializing dp[0] = false — the empty prefix is always a valid segmentation; dp[0] must be true.
- Substring indexing off by one — s.substring(j, i) gives s[j..i-1], which is the segment from j to i.
- Not converting wordDict to a Set — O(m) membership test per inner loop iteration instead of O(1).
- Breaking early before updating dp[i] — set dp[i] = true before break, not after.
Follow-up questions
An interviewer at Elastic may pivot to one of these next:
- Word Break II (LC 140) — return all valid sentence segmentations (backtracking + memoization).
- How would you handle a very large dictionary (millions of words) — use a Trie for prefix pruning.
- How does Elasticsearch's compound word token filter split compound words using a dictionary?
Solve it now
Free. No sign-up. Python and JavaScript run instantly in your browser.
FAQ
Why is dp[0] true?
It represents the empty string, which trivially satisfies the segmentation condition. Without dp[0] = true, no valid segmentation starting from index 0 would ever be detected.
Can memoized recursion be used instead?
Yes — recursive(s, start) = any(recursive(s, start+len) for word in dict if s[start:start+len] == word). Memoize on start index. Same O(n * m * k) complexity.
Practice these live with InterviewChamp.AI
Drill Word Break and other Elastic interview questions under real-loop conditions with instant feedback on your reasoning, complexity claims, and code.
Practice these live with InterviewChamp.AI →