208. Implement Trie (Prefix Tree)
mediumAsked at Hugging FaceBuild a prefix tree that supports insert, search, and startsWith. Hugging Face uses this because a Trie is the canonical data structure for prefix-based token lookup in tokenizer vocabularies — understanding it deeply signals you can reason about the internals of text processing infrastructure.
By Alex Chen, Founder, InterviewChamp.AI · Last verified
Source citations
Public interview reports confirming this problem appears in Hugging Face loops.
- Glassdoor (2026-Q1)— Multiple Hugging Face onsite reports cite Trie implementation as a high-signal medium problem for NLP infrastructure roles.
- Blind (2025-12)— Hugging Face interview threads identify Trie as a near-mandatory question for any role touching tokenization or search infrastructure.
Problem
A trie (pronounced as 'try') or prefix tree is a tree data structure used to efficiently store and retrieve keys in a dataset of strings. There are various applications of this data structure, such as autocomplete and spellchecker. Implement the Trie class: Trie() initializes the trie object. void insert(String word) inserts the string word into the trie. boolean search(String word) returns true if the string word is in the trie (i.e., was inserted before), and false otherwise. boolean startsWith(String prefix) returns true if there is a previously inserted string that has the prefix prefix, and false otherwise.
Constraints
1 <= word.length, prefix.length <= 2000word and prefix consist only of lowercase English letters.At most 3 * 10^4 calls in total will be made to insert, search, and startsWith.
Examples
Example 1
Trie trie = new Trie(); trie.insert("apple"); trie.search("apple"); trie.search("app"); trie.startsWith("app"); trie.insert("app"); trie.search("app");[null, true, false, true, null, true]Explanation: "apple" is inserted. search("app") is false (not inserted as a complete word). startsWith("app") is true (prefix of "apple"). After inserting "app", search("app") becomes true.
Approaches
1. Array-based children (26 slots)
Each node has an array of 26 children (one per lowercase letter) and an isEnd flag. Insert walks the path, creating nodes as needed. Search and startsWith walk without creating.
- Time
- O(L) per operation where L is word length
- Space
- O(alphabet_size * total_characters)
class TrieNode {
constructor() {
this.children = new Array(26).fill(null);
this.isEnd = false;
}
}
class Trie {
constructor() {
this.root = new TrieNode();
}
_charIdx(ch) { return ch.charCodeAt(0) - 97; }
insert(word) {
let node = this.root;
for (const ch of word) {
const idx = this._charIdx(ch);
if (!node.children[idx]) node.children[idx] = new TrieNode();
node = node.children[idx];
}
node.isEnd = true;
}
_traverse(str) {
let node = this.root;
for (const ch of str) {
const idx = this._charIdx(ch);
if (!node.children[idx]) return null;
node = node.children[idx];
}
return node;
}
search(word) {
const node = this._traverse(word);
return node !== null && node.isEnd;
}
startsWith(prefix) {
return this._traverse(prefix) !== null;
}
}Tradeoff: O(L) per operation, O(26 * N) space where N is the total number of characters inserted. Fast in practice. The fixed-size array makes child lookup O(1) but wastes space for sparse alphabets.
2. Map-based children (sparse)
Use a Map instead of a 26-element array for children. More memory-efficient for sparse alphabets or non-ASCII inputs.
- Time
- O(L) per operation
- Space
- O(N) where N is total inserted characters
class TrieNode {
constructor() {
this.children = new Map();
this.isEnd = false;
}
}
class Trie {
constructor() {
this.root = new TrieNode();
}
insert(word) {
let node = this.root;
for (const ch of word) {
if (!node.children.has(ch)) node.children.set(ch, new TrieNode());
node = node.children.get(ch);
}
node.isEnd = true;
}
_traverse(str) {
let node = this.root;
for (const ch of str) {
if (!node.children.has(ch)) return null;
node = node.children.get(ch);
}
return node;
}
search(word) {
const node = this._traverse(word);
return node !== null && node.isEnd;
}
startsWith(prefix) {
return this._traverse(prefix) !== null;
}
}Tradeoff: Better space for sparse alphabets or Unicode. Slightly slower constant factor due to Map overhead. This is the approach to reach for when the character set is not known to be limited to 26 letters.
Hugging Face-specific tips
Hugging Face will almost certainly ask: 'How does this relate to tokenizer vocabulary lookup?' Explain: 'A vocabulary trie lets you find the longest matching token prefix at each position in O(L) time — this is exactly how byte-pair encoding and unigram tokenizers scan input text. The startsWith operation is used to prune the search when no vocabulary token starts with the current prefix.' This connection is central to the Hugging Face domain and will set you apart from candidates who treat this as a pure algorithms problem.
Common mistakes
- Setting isEnd during insert on every node, not just the last — only the terminal node should be marked.
- Returning true in search when the node exists but isEnd is false — prefix found ≠ word found.
- Forgetting that startsWith only needs the node to exist, not isEnd to be true.
- Using 0-indexed char arithmetic but forgetting to verify the input is lowercase ASCII — add a guard if the problem might have mixed case.
Follow-up questions
An interviewer at Hugging Face may pivot to one of these next:
- Design Search Autocomplete System (LC 642) — extend the Trie to rank completions by frequency.
- Replace Words (LC 648) — use a Trie to find the shortest root for each word in a sentence.
- How would you serialize and deserialize a Trie for storage in a key-value store?
Solve it now
Free. No sign-up. Python and JavaScript run instantly in your browser.
FAQ
When should I use the array approach vs. the Map approach?
Array (26 slots) is faster and simpler for strictly lowercase English input. Map is more flexible for Unicode, mixed-case, or sparse alphabets. Clarify the input character set with the interviewer first.
What is the space complexity of a Trie vs. storing all strings in a hash set?
A Trie shares prefixes, so it's more space-efficient when strings share long common prefixes (e.g., a vocabulary with many similar tokens). A hash set stores each string independently.
How does a Trie support deletion?
Mark isEnd = false for the target word. Optionally prune nodes bottom-up if they have no children and isEnd is false. This is a common follow-up not covered in LC 208.
Practice these live with InterviewChamp.AI
Drill Implement Trie (Prefix Tree) and other Hugging Face interview questions under real-loop conditions with instant feedback on your reasoning, complexity claims, and code.
Practice these live with InterviewChamp.AI →