Skip to main content

642. Design Search Autocomplete System

hardAsked at Elastic

Design a system that returns top-3 historical search suggestions as you type. This is arguably the most Elastic-aligned hard coding question: Elasticsearch's completion suggester is a production implementation of exactly this design, combining a Trie with frequency-ranked retrieval — candidates who know both the data structure and the ranking heuristic stand out immediately.

By Alex Chen, Founder, InterviewChamp.AI · Last verified

Source citations

Public interview reports confirming this problem appears in Elastic loops.

  • Glassdoor (2026-Q1)Design Search Autocomplete System cited as the top hard question in Elastic onsite reports — directly tied to Elastic's completion suggester feature.
  • Blind (2025-12)Multiple Elastic senior SWE threads list this as the most domain-relevant hard coding question, with interviewers explicitly connecting it to Elasticsearch suggest APIs.

Problem

Design a search autocomplete system for a search engine. Users may input a sentence (at least one word and end with a special character '#'). For each character they type except '#', you need to return the top 3 historical hot sentences that have the same prefix as the part of the sentence already typed. Here are the specific rules: The hot degree for a sentence is defined as the number of times a user has typed the exact sentence before. The returned top 3 hot sentences should be sorted by hot degree (The first is the hottest one). If several sentences have the same hot degree, use ASCII-order as a tiebreaker. If less than 3 hot sentences exist, return as many as you can. When the input is the special character '#', it means the sentence ends, and in this case, you need to save the inputted sentence in the system and return an empty list.

Constraints

  • n == sentences.length
  • n == times.length
  • 1 <= n <= 100
  • 1 <= sentences[i].length <= 100
  • 1 <= times[i] <= 50
  • c is a lowercase English letter or '#'.
  • Each tested sentence will be a sequence of characters c that end with the character '#'.
  • Each of the sentences in sentences is unique.
  • At most 5000 calls will be made to input.

Examples

Example 1

Input
AutocompleteSystem(["i love you","island","iroman","i love leetcode"], [5,3,2,2]); input('i'); input(' '); input('a'); input('#')
Output
[["i love you","island","i love leetcode"],["i love you","i love leetcode"],[],[]]

Explanation: After 'i': 4 sentences match. Top 3 by freq then lex. After 'i ': 2 match. After 'i a': none match. '#' saves 'i a'.

Approaches

1. Trie with frequency map at each node

Build a Trie where each leaf (isEnd node) stores the sentence and its frequency. Each internal node also stores a sorted top-3 candidate list to allow O(prefix_length) retrieval. On '#', update the Trie and frequency map.

Time
O(n * L) init, O(L + k log k) per input char
Space
O(total characters * sentences)
class TrieNode {
  constructor() {
    this.children = new Map();
    this.sentences = new Map(); // sentence -> frequency for this subtree
  }
}
class AutocompleteSystem {
  constructor(sentences, times) {
    this.root = new TrieNode();
    this.freqMap = new Map();
    this.currInput = '';
    this.currNode = this.root;
    for (let i = 0; i < sentences.length; i++) {
      this._insert(sentences[i], times[i]);
    }
  }
  _insert(sentence, freq) {
    this.freqMap.set(sentence, (this.freqMap.get(sentence) || 0) + freq);
    let node = this.root;
    for (const ch of sentence) {
      if (!node.children.has(ch)) node.children.set(ch, new TrieNode());
      node = node.children.get(ch);
      node.sentences.set(sentence, this.freqMap.get(sentence));
    }
  }
  _getTop3(node) {
    return [...node.sentences.entries()]
      .sort((a, b) => b[1] !== a[1] ? b[1] - a[1] : a[0] < b[0] ? -1 : 1)
      .slice(0, 3)
      .map(([s]) => s);
  }
  input(c) {
    if (c === '#') {
      this._insert(this.currInput, 1);
      this.currInput = '';
      this.currNode = this.root;
      return [];
    }
    this.currInput += c;
    if (this.currNode !== null) {
      this.currNode = this.currNode.children.get(c) || null;
    }
    return this.currNode ? this._getTop3(this.currNode) : [];
  }
}

Tradeoff: Trie traversal is O(prefix_length). Sorting candidates at each query is O(k log k) where k = matching sentences. This is the canonical design. For very large sentence sets, pre-sort and maintain a sorted list per node.

Elastic-specific tips

Connect this to Elasticsearch's completion suggester explicitly: 'Elasticsearch uses a finite-state transducer (a compressed Trie) to store completions. Each input token is tokenized as an edge-n-gram sequence. At query time, the FST prefix-lookup returns ranked suggestions in O(prefix_length + k) time.' Explain your design choices: 'I store frequency at each Trie node's sentences map so that retrieval doesn't require a full subtree scan — a classic space-for-time trade-off.' Elastic interviewers reward this production-systems awareness.

Common mistakes

  • Not updating the Trie when '#' is received — the new sentence must be added with frequency 1 for future queries.
  • Sorting all matching sentences instead of limiting to top-3 early — for large sentence sets, use a heap of size 3.
  • Not resetting currInput and currNode after '#' — the next query starts fresh.
  • Tiebreaking by frequency only, ignoring lexicographic order — the problem requires ASCII order when frequencies are equal.

Follow-up questions

An interviewer at Elastic may pivot to one of these next:

  • How would you implement this for a multi-user system where different users have different search histories?
  • How does Elasticsearch's completion suggester handle real-time index updates while serving autocomplete queries?
  • How would you extend this to support fuzzy prefix matching (allowing one typo in the prefix)?

Solve it now

Free. No sign-up. Python and JavaScript run instantly in your browser.

Output

Press Run or Cmd+Enter to execute

FAQ

Why store the sentence map at every internal Trie node?

It enables O(prefix_length) retrieval without a subtree traversal. Each node already knows all sentences that pass through it. The trade-off is higher memory usage per node.

Why sort on each input call instead of maintaining a sorted list per node?

For the given constraints (n <= 100 sentences), sorting is fast enough. In production with millions of sentences, you would maintain a pre-sorted top-K list per node and update it lazily on writes.

Practice these live with InterviewChamp.AI

Drill Design Search Autocomplete System and other Elastic interview questions under real-loop conditions with instant feedback on your reasoning, complexity claims, and code.

Practice these live with InterviewChamp.AI →