642. Design Search Autocomplete System

Q: Why store the frequency map at every intermediate Trie node?

Each node represents a prefix; the frequency map must contain all sentences that share that prefix. Storing only at leaf nodes would require traversing the entire subtree on every input character.

Q: How would you optimise for very frequent queries?

Pre-sort the top-k completions at each node during construction and update them incrementally on '#' inputs using a bounded sorted list. This reduces per-input retrieval from O(S log S) to O(k).

Q: What data structure would you use for the frequency map in a production system?

A skip list or sorted array of (frequency, sentence) pairs supports O(log S) frequency update and O(k) top-k retrieval. For extreme scale, an approximate top-k structure (count-min sketch + heap) reduces memory.

hardAsked at Cohere

Design a real-time autocomplete system that suggests the top-3 most frequently typed sentences matching the current prefix. Cohere asks this design-heavy hard problem because prefix-indexed retrieval with frequency ranking is the core of their enterprise search and query-suggestion products.

By Sam K., Founder, InterviewChamp.AI · Last verified 2026-05-22

Source citations

Public interview reports confirming this problem appears in Cohere loops.

Glassdoor (2026-Q1)— Cohere senior SWE candidates report Design Search Autocomplete as a hard system-design-meets-coding problem asked for senior roles.
Blind (2025-11)— Cohere Blind threads list this as a high-signal hard problem combining Trie design with frequency ranking.

Problem

Design a search autocomplete system for a search engine. Users may input a sentence (at least one word and end with a special character '#'). For each character they type except '#', you need to return the top 3 historical hot sentences that have the same prefix as the part of the sentence already typed. Here are the specific rules: The hot degree for a sentence is defined as the number of times a user typed the exactly same sentence before. The returned top 3 hot sentences should be sorted by hot degree (The first is the hottest one). If several sentences have the same hot degree, use ASCII-code order (smaller one appears first). If less than 3 hot sentences exist, return as many as you can. When the user types '#', it means he finishes the input of a sentence. You should return an empty list. You should then record the sentence as a new historical entry (incrementing its frequency by 1) and reset the current partial input.

Constraints

n == sentences.length == times.length
1 <= n <= 100
1 <= sentences[i].length <= 100
1 <= times[i] <= 50
1 <= data.length <= 5000
data[i] is a lowercase English letter or '#'.
It is guaranteed that the input is always valid.

Examples

Example 1

Input

AutocompleteSystem(["i love you","island","iroman","i love leetcode"],[5,3,2,2]); input('i'); input(' '); input('a'); input('#')

Output

[["i love you","island","i love leetcode"],["i love you","i love leetcode"],[],[]]

Explanation: After 'i', top 3 match by freq. After 'i ', only 'i love' sentences match. After 'i a', no match. '#' records the new sentence.

Approaches

1. Trie with frequency map at each node

Build a Trie where each node stores a frequency map of complete sentences that pass through it. On input('#') add the current sentence to the Trie and reset. On each character, descend the Trie and return the top-3 from the current node's frequency map.

Time: O(n·L) build, O(L·S) per input where L=sentence length, S=sentences at node
Space: O(n·L·S)

class TrieNode {
  constructor() {
    this.children = {};
    this.freq = new Map(); // sentence -> frequency
  }
}
class AutocompleteSystem {
  constructor(sentences, times) {
    this.root = new TrieNode();
    this.curr = this.root;
    this.input = '';
    for (let i = 0; i < sentences.length; i++) {
      this._add(sentences[i], times[i]);
    }
  }
  _add(sentence, count) {
    let node = this.root;
    for (const ch of sentence) {
      if (!node.children[ch]) node.children[ch] = new TrieNode();
      node = node.children[ch];
      node.freq.set(sentence, (node.freq.get(sentence) || 0) + count);
    }
  }
  _topK(node, k = 3) {
    return [...node.freq.entries()]
      .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]))
      .slice(0, k)
      .map(([s]) => s);
  }
  input(c) {
    if (c === '#') {
      this._add(this.input, 1);
      this.input = '';
      this.curr = this.root;
      return [];
    }
    this.input += c;
    if (!this.curr || !this.curr.children[c]) {
      this.curr = null;
      return [];
    }
    this.curr = this.curr.children[c];
    return this._topK(this.curr);
  }
}

Tradeoff: Intuitive Trie design. Each node stores all sentences reachable from it, which allows O(S log S) top-k retrieval. Space is O(n·L·S) because each sentence is stored at every node along its path — acceptable for the given constraints.

Cohere-specific tips

This is a system-design-meets-coding problem that Cohere uses to assess enterprise search instincts. Lead with a high-level design: 'I will use a Trie for prefix lookup and store a frequency-ranked list of completions at each node.' Then discuss trade-offs: 'For a production system with millions of queries, I would pre-sort completions at each Trie node and maintain a max-heap of size k instead of sorting on every input character.' Mention that Cohere's enterprise search products face exactly this problem — the Trie can be sharded across nodes, and frequency updates can be batched to avoid write contention.

Common mistakes

Not resetting curr to the root on '#' — the next input starts a fresh query.
Setting curr to null too eagerly — once the prefix has no match, all subsequent characters also have no match, but the '#' reset logic must still fire.
Forgetting lexicographic tiebreaking — sort by frequency descending, then by sentence ascending.
Storing frequencies at the Trie root only — each intermediate node must carry the frequency map to support arbitrary prefix queries.

Follow-up questions

An interviewer at Cohere may pivot to one of these next:

How would you design this for 1 billion historical queries with millisecond p99 latency?
How would you add personalisation — weighting recent queries more heavily?
Design a distributed autocomplete system where the Trie is partitioned across multiple servers.

Solve it now

Free. No sign-up. Python and JavaScript run instantly in your browser.