KMP String Matching Pattern
The Knuth-Morris-Pratt (KMP) algorithm finds all occurrences of a pattern inside a text in O(n + m) time without ever backing up the text pointer. It does this by precomputing a failure function — an array that, for each prefix of the pattern, stores the length of the longest proper prefix that is also a suffix — and then using that table to skip redundant character comparisons on a mismatch.
By Alex Chen, Founder, InterviewChamp.AI · Last verified
Complexity
- Time
- O(n + m) where n is text length and m is pattern length
- Space
- O(m) for the failure function
When to use KMP String Matching
- You need to find every occurrence of a pattern P inside a text T and naive O(n * m) substring search is too slow.
- The problem asks 'is X a substring of Y' or 'how many times does X appear in Y' where both can be large.
- You need to compute the longest proper prefix that is also a suffix (LPS / failure function) of a string — this shows up in palindrome and rotation problems too.
- Rolling hash (Rabin-Karp) is unacceptable because the problem forbids randomized collisions or requires deterministic correctness.
- Problems involve string rotation, repeated patterns, or shortest palindrome construction where the LPS array gives the answer directly.
Template
function computeLPS(pattern) {
const m = pattern.length;
const lps = new Array(m).fill(0);
let length = 0;
let i = 1;
while (i < m) {
if (pattern[i] === pattern[length]) {
length++;
lps[i] = length;
i++;
} else if (length > 0) {
length = lps[length - 1];
} else {
lps[i] = 0;
i++;
}
}
return lps;
}
function kmpSearch(text, pattern) {
if (pattern.length === 0) return [0];
const lps = computeLPS(pattern);
const matches = [];
let i = 0;
let j = 0;
while (i < text.length) {
if (text[i] === pattern[j]) {
i++;
j++;
if (j === pattern.length) {
matches.push(i - j);
j = lps[j - 1];
}
} else if (j > 0) {
j = lps[j - 1];
} else {
i++;
}
}
return matches;
}Example LeetCode problems
| # | Problem | Difficulty | Frequency |
|---|---|---|---|
| 28 | Find the Index of the First Occurrence in a String | easy | foundational |
| 214 | Shortest Palindrome | hard | classic |
| 459 | Repeated Substring Pattern | easy | frequently asked |
| 1392 | Longest Happy Prefix | hard | company favorite |
| 1567 | Maximum Length of Subarray With Positive Product | medium | less common |
| 2223 | Sum of Scores of Built Strings | hard | less common |
| 3006 | Find Beautiful Indices in the Given Array I | medium | company favorite |
| 796 | Rotate String | easy | frequently asked |
Common mistakes
- Resetting length to 0 on every mismatch in the LPS build instead of falling back to lps[length - 1], which destroys the O(m) build complexity and may even compute a wrong table.
- Backing up the text pointer i after a mismatch — KMP's whole point is that i only ever moves forward; backing up turns it into naive search.
- Confusing 'longest prefix that is also a suffix' with 'longest border' and including the full string itself, producing an LPS array that is too long by one.
- Returning from the search after the first match when the problem actually wants all matches — remember to set j = lps[j - 1] and continue scanning instead of returning early.
- Failing to handle the empty-pattern edge case, which under naive semantics matches at every position including length n + 1; pick a convention up front and stick with it.
Frequently asked questions
What does the failure function (LPS) array actually represent?
For each index i in the pattern, lps[i] is the length of the longest proper prefix of pattern[0..i] that is also a suffix of pattern[0..i]. 'Proper' means strictly shorter than the full string — the entire prefix is never counted as a suffix of itself. This table tells you exactly how much you can shift the pattern after a mismatch without missing a possible match.
When should I use KMP versus Rabin-Karp?
KMP gives deterministic O(n + m) and never has hash collisions, which is what you want when correctness is non-negotiable or the alphabet is small. Rabin-Karp is easier to extend to multi-pattern search and 2D matching, and its expected complexity is also O(n + m) — but worst-case is O(n * m) without double-hashing, and adversarial inputs can trigger collisions.
Can KMP find overlapping matches?
Yes, by setting j = lps[j - 1] after recording a match instead of resetting j to 0. This causes the scan to consider the next position that shares a suffix with the matched prefix, which is exactly the next possible overlap start.
What is the time complexity of building the LPS array?
O(m). At each iteration either i increases or length decreases, and length never goes below zero. Since i goes from 1 to m and length is bounded by i, the total work is at most 2m, which is O(m). The amortized argument is the same one that proves the search phase runs in O(n).
Solve it now
Free. No sign-up. Python and JavaScript run instantly in your browser.
Practice these live with InterviewChamp.AI
Drill the KMP String Matching pattern under real interview conditions with instant feedback on your reasoning, complexity claims, and code.
Practice these live with InterviewChamp.AI →