642. Design Search Autocomplete System
hardAsked at GleanThis is arguably the most Glean-relevant problem that exists — it asks you to build the autocomplete system that is Glean's core product differentiator, combining a Trie with a frequency-ranked top-K retrieval engine.
By Alex Chen, Founder, InterviewChamp.AI · Last verified
Source citations
Public interview reports confirming this problem appears in Glean loops.
- Glassdoor (2026-Q1)— Glean interviewers have asked this problem verbatim in multiple onsite reports — it is considered the highest-signal design-coding hybrid in their question bank.
- Blind (2025-12)— Ranked #1 most Glean-specific problem in community prep threads. Candidates with strong Trie + ranking knowledge consistently report receiving offers.
Problem
Design a search autocomplete system for a search engine. Users may input a sentence (at least one word and end with a special character '#'). Given a string array sentences and an integer array times representing historical queries and their respective frequencies, implement the AutocompleteSystem class: AutocompleteSystem(sentences, times) initializes the object with historical data. List<String> input(char c) given a character c, if c == '#', stores the current input as a new historical sentence and returns an empty list. Otherwise, returns the top 3 historical sentences with the same prefix as the current input, sorted by frequency (descending). Ties are broken lexicographically.
Constraints
n == sentences.length == times.length1 <= n <= 1001 <= sentences[i].length <= 1001 <= times[i] <= 50c is a lowercase English letter, '#', or a space character.Each call to input will have the characters typed so far form a prefix of a non-empty sentence.At most 5000 calls to input.
Examples
Example 1
AutocompleteSystem(['i love you','island','iroman','i love leetcode'],[5,3,2,2]); input('i'); input(' '); input('a'); input('#')[["i love you","island","i love leetcode"],["i love you","i love leetcode"],[],[]]Explanation: After 'i', return top 3 sentences starting with 'i' by frequency. After 'i ', return sentences starting with 'i '. After 'i a', no match. '#' saves 'i a' with frequency 1.
Approaches
1. Trie with frequency map at each node
Build a Trie where each node stores a map of sentence → frequency for all sentences passing through it. On each input character, traverse the Trie and return the top 3 sentences by frequency from the current node's map. On '#', insert the accumulated prefix as a new sentence.
- Time
- O(p * s * log 3) per input where p = prefix length, s = matching sentences
- Space
- O(total characters across all sentences)
class TrieNode {
constructor() {
this.children = new Map();
this.sentences = new Map(); // sentence -> frequency
}
}
class AutocompleteSystem {
constructor(sentences, times) {
this.root = new TrieNode();
this.curr = this.root;
this.prefix = '';
for (let i = 0; i < sentences.length; i++) {
this._insert(sentences[i], times[i]);
}
}
_insert(sentence, freq) {
let node = this.root;
for (const ch of sentence) {
if (!node.children.has(ch)) node.children.set(ch, new TrieNode());
node = node.children.get(ch);
node.sentences.set(sentence, (node.sentences.get(sentence) || 0) + freq);
}
}
input(c) {
if (c === '#') {
this._insert(this.prefix, 1);
this.prefix = '';
this.curr = this.root;
return [];
}
this.prefix += c;
if (this.curr !== null && this.curr.children.has(c)) {
this.curr = this.curr.children.get(c);
} else {
this.curr = null; // prefix not in trie
}
if (this.curr === null) return [];
// get top 3 by frequency desc, then lex asc
return [...this.curr.sentences.entries()]
.sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]))
.slice(0, 3)
.map(([s]) => s);
}
}Tradeoff: Each Trie node stores a frequency map for all matching sentences. Lookup is fast (O(1) Trie traversal), but sorting all matching sentences per input character is O(s log s). For the given constraints (n ≤ 100) this is efficient enough.
2. Trie + pre-sorted top-K at each node
Instead of storing all frequencies and sorting on query, maintain a pre-sorted top-3 list at each TrieNode. On insert, update and re-sort affected nodes. On query, just return the stored top-3 directly.
- Time
- O(p * n) per insert (update all nodes), O(1) per query after Trie traversal
- Space
- O(total characters * 3)
// Sketch — full code builds on approach 1
// Each TrieNode stores: top3 = [] (sorted list of up to 3 [freq, sentence] pairs)
// insert: traverse and update each node's top3 if the new sentence changes rankings
// input: return this.curr.top3.map(([_, s]) => s)
// This is the production-optimal variant — mention it as the follow-up optimization
console.log('See approach 1 for the full implementation; this is the optimization direction.');Tradeoff: O(1) query after Trie walk — optimal for high-query-rate systems. More complex insert logic. Describe this as the production optimization after presenting approach 1.
Glean-specific tips
Open with: 'This is literally Glean's search bar.' Interviewers often smile and say 'yes.' Then lay out the component breakdown before any code: (1) Trie for prefix navigation, (2) frequency map at each node for ranking, (3) top-3 extraction with tie-breaking. Glean specifically values the tie-breaking rule (lexicographic ascending for same frequency) — handle it in your sort comparator and mention it proactively. Be prepared to discuss the trade-off between storing all sentences per node vs. pre-computing top-K at each node.
Common mistakes
- Forgetting to reset this.curr to root on '#' — subsequent inputs would continue from the wrong node.
- Not handling the case where this.curr is null (prefix not found in Trie) — return [] instead of crashing.
- Incorrect tie-breaking — when frequencies are equal, sort lexicographically ascending (a < b); do not sort by insertion order.
- Inserting the completed sentence with the wrong frequency on '#' — always add 1 to the existing frequency; don't replace it.
Follow-up questions
An interviewer at Glean may pivot to one of these next:
- What if you need to support deletion (remove a sentence from autocomplete history)?
- How would you scale this to millions of historical queries across multiple users?
- Design a fuzzy autocomplete that returns results even if the prefix has a typo (edit distance 1).
Solve it now
Free. No sign-up. Python and JavaScript run instantly in your browser.
FAQ
Why store frequencies at every Trie node rather than only at terminal nodes?
Querying at a prefix requires knowing all sentences that start with that prefix and their frequencies. Storing the full frequency map at each intermediate node makes prefix-query O(1) at the cost of O(sentences) extra memory per node.
How do you handle spaces in the prefix?
Treat space as just another character in the Trie — include it as a key in the children map. The problem explicitly states c can be a space character, so your Trie must handle it without special-casing.
Practice these live with InterviewChamp.AI
Drill Design Search Autocomplete System and other Glean interview questions under real-loop conditions with instant feedback on your reasoning, complexity claims, and code.
Practice these live with InterviewChamp.AI →