146. LRU Cache

Q: Why doubly-linked and not singly-linked?

Removing an arbitrary node in O(1) requires a pointer to its predecessor. A singly-linked list forces O(n) traversal to find the previous node.
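A minimal sketch of the difference. The helper names (`unlinkDoubly`, `unlinkSingly`, `buildList`) are hypothetical and exist only for this illustration. With `prev` pointers the unlink touches just the node's two neighbours; without them, the list has to be walked from the head to find the predecessor.

```javascript
// Doubly-linked: the node already knows its predecessor, so unlinking is O(1).
// Assumes sentinel nodes on both ends, so prev and next are never null.
function unlinkDoubly(node) {
  node.prev.next = node.next;
  node.next.prev = node.prev;
}

// Singly-linked style: only `next` is used, so we walk from the head until we
// reach the predecessor. Returns the number of hops to make the cost visible.
function unlinkSingly(head, node) {
  let hops = 0;
  let cur = head;
  while (cur.next !== node) {
    cur = cur.next;
    hops++;
  }
  cur.next = node.next;
  return hops;
}

// Builds head -> ...labels -> tail with both prev and next pointers set.
function buildList(labels) {
  const head = { label: "head", prev: null, next: null };
  const nodes = [];
  let last = head;
  for (const label of labels) {
    const n = { label, prev: last, next: null };
    last.next = n;
    last = n;
    nodes.push(n);
  }
  const tail = { label: "tail", prev: last, next: null };
  last.next = tail;
  return { head, tail, nodes };
}

const listA = buildList(["a", "b", "c"]);
unlinkDoubly(listA.nodes[2]); // removes "c" with two pointer writes

const listB = buildList(["a", "b", "c"]);
const hops = unlinkSingly(listB.head, listB.nodes[2]); // walks head -> a -> b first
```

The number of hops in the second call grows with the node's distance from the head, which is what breaks the O(1) requirement.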

Q: Why dummy head and tail?

They eliminate null-checks for inserting at the head or removing from the tail. Every pointer rewire becomes uniform — no special cases.
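To make the contrast concrete, here is a sketch of the same two operations with and without sentinels. The class names `PlainList` and `SentinelList` are hypothetical and are not used elsewhere on this page.

```javascript
// Without sentinels: head and tail may be null, so the rewiring branches.
class PlainList {
  constructor() {
    this.head = null;
    this.tail = null;
  }
  insertFront(node) {
    node.prev = null;
    node.next = this.head;
    if (this.head) this.head.prev = node;
    else this.tail = node; // empty-list case
    this.head = node;
  }
  removeLast() {
    const node = this.tail;
    if (!node) return null; // empty-list case
    this.tail = node.prev;
    if (this.tail) this.tail.next = null;
    else this.head = null; // single-element case
    return node;
  }
}

// With sentinels: head and tail always exist, so the rewiring never branches.
class SentinelList {
  constructor() {
    this.head = { prev: null, next: null };
    this.tail = { prev: this.head, next: null };
    this.head.next = this.tail;
  }
  insertFront(node) {
    node.prev = this.head;
    node.next = this.head.next;
    this.head.next.prev = node;
    this.head.next = node;
  }
  removeLast() {
    const node = this.tail.prev;
    if (node === this.head) return null; // only check left: is the list empty?
    node.prev.next = this.tail;
    this.tail.prev = node.prev;
    return node;
  }
}
```

Both classes behave the same from the outside; the sentinel version simply has fewer paths to get wrong under interview pressure.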

Q: Is the JS Map solution acceptable at Hugging Face?

Generally yes, provided you state explicitly that you are relying on the ECMAScript guarantee that a Map iterates in insertion order. Then offer the DLL version if they want a language-agnostic solution.

Medium · Asked at Hugging Face

Design a cache that evicts the least-recently-used entry when full. Hugging Face uses this because it mirrors a real problem they solve at scale — caching tokenizer outputs and hosted model inference results where stale entries must be evicted under memory pressure to serve millions of API requests efficiently.

By Sam K., Founder, InterviewChamp.AI · Last verified 2026-05-22

Source citations

Public interview reports confirming this problem appears in Hugging Face loops.

Glassdoor (2026-Q1): Multiple Hugging Face SWE onsite reports cite LRU Cache as a core medium design problem in backend and infrastructure rounds.
Blind (2025-12): Hugging Face threads identify LRU Cache as a high-signal medium that directly maps to their inference caching infrastructure.

Problem

Design a data structure that follows the constraints of a Least Recently Used (LRU) cache. Implement the LRUCache class:

LRUCache(capacity) initializes the LRU cache with a positive size capacity.
int get(int key) returns the value of the key if it exists, otherwise -1.
void put(int key, int value) updates the value if the key exists, otherwise inserts the key-value pair. When the number of keys exceeds the capacity, evict the least recently used key.

Both get and put must run in O(1) average time complexity.

Constraints

1 <= capacity <= 3000
0 <= key <= 10^4
0 <= value <= 10^5
At most 2 * 10^5 calls will be made to get and put.

Examples

Example 1

Input

LRUCache(2); put(1,1); put(2,2); get(1); put(3,3); get(2); put(4,4); get(1); get(3); get(4)

Output

[null,null,null,1,null,-1,null,-1,3,4]

Explanation: get(1) makes key 1 the most recent, so put(3,3) evicts key 2 (LRU). Key 1 is not touched again before put(4,4), so by then it is the least recently used: put(4,4) evicts key 1, and the following get(1) returns -1.

Approaches

1. JS Map (insertion-order)

JS Map preserves insertion order. On get or put, delete and re-insert to move the key to the end (most recent). Evict the first key (least recent) when over capacity.

Time: O(1) average in mainstream engines
Space: O(capacity)

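class LRUCache {
  constructor(capacity) {
    this.capacity = capacity;
    this.cache = new Map(); // iteration order = least recent -> most recent
  }
  get(key) {
    if (!this.cache.has(key)) return -1;
    const val = this.cache.get(key);
    // Delete and re-insert so the key moves to the end (most recent).
    this.cache.delete(key);
    this.cache.set(key, val);
    return val;
  }
  put(key, value) {
    // set() alone would keep an existing key's old position, so delete first.
    if (this.cache.has(key)) this.cache.delete(key);
    this.cache.set(key, value);
    if (this.cache.size > this.capacity) {
      // The first key in iteration order is the least recently used.
      this.cache.delete(this.cache.keys().next().value);
    }
  }
}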

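Tradeoff: Concise, but it relies on language-specific behavior. The ECMAScript spec guarantees insertion-order iteration for Map, but it only requires access times that are sublinear on average; the O(1) average cost comes from the hash-table implementations that engines use in practice. Always state this explicitly in the interview.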
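The ordering behavior this approach depends on can be checked directly. The snippet below is a small demonstration, not part of the solution; the variable names are illustrative.

```javascript
const m = new Map([["a", 1], ["b", 2], ["c", 3]]);

// Overwriting an existing key with set() keeps its original position.
m.set("a", 10);
const afterSet = [...m.keys()]; // ["a", "b", "c"]

// Deleting and re-inserting moves the key to the end (most recent).
m.delete("a");
m.set("a", 10);
const afterReinsert = [...m.keys()]; // ["b", "c", "a"]

// The first key in iteration order is the eviction candidate.
const oldest = m.keys().next().value; // "b"
```

This is why put() deletes an existing key before calling set(): without the delete, an updated key would keep its old position and could be evicted even though it was just written.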

2. Hash map + doubly-linked list (canonical)

A Map gives O(1) key → node lookup. A doubly-linked list with dummy head and tail gives O(1) move-to-front and evict-from-tail. The map stores key → node pointers.

Time: O(1) get and put
Space: O(capacity)

// List node. The key is stored so eviction can delete the matching map entry.
class Node {
  constructor(key, val) {
    this.key = key;
    this.val = val;
    this.prev = this.next = null;
  }
}
class LRUCache {
  constructor(capacity) {
    this.capacity = capacity;
    this.map = new Map(); // key -> node
    // Dummy head and tail: real nodes always sit between them.
    this.head = new Node(0, 0);
    this.tail = new Node(0, 0);
    this.head.next = this.tail;
    this.tail.prev = this.head;
  }
  // Unlink a node from the list in O(1).
  _remove(node) {
    node.prev.next = node.next;
    node.next.prev = node.prev;
  }
  // Insert a node right after the dummy head (most recent position).
  _insertFront(node) {
    node.next = this.head.next;
    node.prev = this.head;
    this.head.next.prev = node;
    this.head.next = node;
  }
  get(key) {
    if (!this.map.has(key)) return -1;
    const node = this.map.get(key);
    // A hit counts as a use: move the node to the front.
    this._remove(node);
    this._insertFront(node);
    return node.val;
  }
  put(key, value) {
    // Existing key: unlink the old node; the new one replaces it in the map.
    if (this.map.has(key)) this._remove(this.map.get(key));
    const node = new Node(key, value);
    this._insertFront(node);
    this.map.set(key, node);
    if (this.map.size > this.capacity) {
      // The node just before the dummy tail is the least recently used.
      const lru = this.tail.prev;
      this._remove(lru);
      this.map.delete(lru.key);
    }
  }
}

Tradeoff: Strict O(1) for all operations, no reliance on language-specific behavior. This is the canonical solution Hugging Face expects for infrastructure roles. The doubly-linked list is the key insight — singly-linked requires O(n) to remove an arbitrary node.

Hugging Face-specific tips

Hugging Face will likely ask: 'How does this apply to your inference caching design?' Connect it directly: 'A production inference cache for hosted models can use this same structure: the key is a hash of the input prompt, the value is the cached output embedding or response, and LRU eviction under memory pressure keeps the most recently used results warm.' Note that LRU tracks recency, not frequency; keeping the most frequently used results is what LFU does. Also mention TTL-based invalidation as a follow-up. Always explain why you need a doubly-linked list (O(1) removal of any node) before writing the code.
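If the TTL follow-up comes up, one possible shape is sketched below on top of the Map-based approach. This is an assumption-laden illustration, not a description of any production system: the class name `TtlLruCache`, the `ttlMs` parameter, and the injectable `now` clock are all hypothetical, and expiry is handled lazily on read.

```javascript
// Hypothetical TTL-aware variant of the Map-based LRU cache.
class TtlLruCache {
  constructor(capacity, ttlMs, now = () => Date.now()) {
    this.capacity = capacity;
    this.ttlMs = ttlMs;
    this.now = now; // injectable clock, which makes expiry testable
    this.cache = new Map(); // key -> { value, expiresAt }
  }
  get(key) {
    const entry = this.cache.get(key);
    if (entry === undefined) return -1;
    this.cache.delete(key);
    if (entry.expiresAt <= this.now()) return -1; // expired: dropped lazily on read
    this.cache.set(key, entry); // still fresh: mark as most recent
    return entry.value;
  }
  put(key, value) {
    this.cache.delete(key);
    this.cache.set(key, { value, expiresAt: this.now() + this.ttlMs });
    if (this.cache.size > this.capacity) {
      this.cache.delete(this.cache.keys().next().value); // evict least recent
    }
  }
}

// Usage with a fake clock so the example is deterministic.
let clock = 0;
const ttlCache = new TtlLruCache(2, 100, () => clock);
ttlCache.put("a", 1);
clock = 50;
const fresh = ttlCache.get("a"); // read before the deadline at t = 100
clock = 100;
const stale = ttlCache.get("a"); // read at the deadline, entry is expired
```

In this sketch a read does not extend the deadline; only put() resets it. Whether reads should refresh the TTL, and whether expired entries should also be swept in the background, are design choices worth raising with the interviewer.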

Common mistakes

Using a singly-linked list — O(n) node removal without a predecessor pointer.
Forgetting to delete the evicted node from the map — the map and list go out of sync.
Not moving a node to the front on get() — a cache hit counts as recent use.
Not using dummy head and tail nodes — every edge case (empty list, single element) requires special-casing without them.

Follow-up questions

An interviewer at Hugging Face may pivot to one of these next:

LFU Cache (LC 460) — evict the least-frequently-used item; requires frequency buckets.
How would you make this cache thread-safe for concurrent requests?
How would you distribute this cache across multiple servers while maintaining LRU semantics globally?
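For the LFU follow-up, the frequency-bucket idea can be sketched as follows. This is one possible layout under stated assumptions, not a reference solution: the field names (`entries`, `buckets`, `minFreq`) are illustrative, and ties within a frequency are broken by recency using the insertion order of a Set.

```javascript
class LFUCache {
  constructor(capacity) {
    this.capacity = capacity;
    this.entries = new Map(); // key -> { value, freq }
    this.buckets = new Map(); // freq -> Set of keys, oldest first
    this.minFreq = 0;
  }
  // Move a key from its current frequency bucket to the next one.
  _touch(key, entry) {
    const bucket = this.buckets.get(entry.freq);
    bucket.delete(key);
    if (bucket.size === 0) {
      this.buckets.delete(entry.freq);
      if (this.minFreq === entry.freq) this.minFreq++;
    }
    entry.freq++;
    if (!this.buckets.has(entry.freq)) this.buckets.set(entry.freq, new Set());
    this.buckets.get(entry.freq).add(key);
  }
  get(key) {
    const entry = this.entries.get(key);
    if (entry === undefined) return -1;
    this._touch(key, entry);
    return entry.value;
  }
  put(key, value) {
    if (this.capacity <= 0) return;
    const entry = this.entries.get(key);
    if (entry !== undefined) {
      entry.value = value;
      this._touch(key, entry);
      return;
    }
    if (this.entries.size >= this.capacity) {
      // Evict the oldest key in the lowest-frequency bucket.
      const bucket = this.buckets.get(this.minFreq);
      const victim = bucket.values().next().value;
      bucket.delete(victim);
      if (bucket.size === 0) this.buckets.delete(this.minFreq);
      this.entries.delete(victim);
    }
    this.entries.set(key, { value, freq: 1 });
    if (!this.buckets.has(1)) this.buckets.set(1, new Set());
    this.buckets.get(1).add(key);
    this.minFreq = 1; // a new key always has the lowest frequency
  }
}
```

Each bucket is itself a small LRU, which is why the LRU solution is usually asked first.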
