Skip to main content

217. Contains Duplicate

easyAsked at Hugging Face

Return true if any value appears at least twice in an array. Hugging Face uses this as a hash-set baseline — the same deduplication logic that filters repeated tokens, removes duplicate dataset examples, and deduplicates model card identifiers in the Hub registry.

By Alex Chen, Founder, InterviewChamp.AI · Last verified

Source citations

Public interview reports confirming this problem appears in Hugging Face loops.

  • Glassdoor (2025-11)Reported as a quick hash-set warm-up in Hugging Face phone screen feedback.
  • Blind (2025-08)Hugging Face threads list Contains Duplicate as a first-principles hash-set question used for initial screening.

Problem

Given an integer array nums, return true if any value appears at least twice in the array, and return false if every element is distinct.

Constraints

  • 1 <= nums.length <= 10^5
  • −10^9 <= nums[i] <= 10^9

Examples

Example 1

Input
nums = [1,2,3,1]
Output
true

Explanation: 1 appears at index 0 and index 3.

Example 2

Input
nums = [1,2,3,4]
Output
false

Explanation: All values are distinct.

Example 3

Input
nums = [1,1,1,3,3,4,3,2,4,2]
Output
true

Explanation: Multiple values repeat.

Approaches

1. Hash Set (single pass)

Add each number to a Set as you iterate. If you encounter a number already in the Set, return true immediately.

Time
O(n)
Space
O(n)
function containsDuplicate(nums) {
  const seen = new Set();
  for (const num of nums) {
    if (seen.has(num)) return true;
    seen.add(num);
  }
  return false;
}

Tradeoff: O(n) time, O(n) space. Early exit as soon as the first duplicate is found. This is the canonical answer.

2. Sort then compare neighbors

Sort the array. Duplicates will be adjacent — scan for any nums[i] === nums[i+1].

Time
O(n log n)
Space
O(1) or O(log n) for sort stack
function containsDuplicate(nums) {
  nums.sort((a, b) => a - b);
  for (let i = 0; i < nums.length - 1; i++) {
    if (nums[i] === nums[i + 1]) return true;
  }
  return false;
}

Tradeoff: O(n log n) time, O(1) extra space (modifies input). Acceptable only if the array is allowed to be mutated and space is at a premium. Always discuss the trade-off explicitly.

Hugging Face-specific tips

Hugging Face interviewers appreciate connecting this to real ML pipeline problems: 'Dataset deduplication uses the same hash-set approach — we fingerprint each training example and skip any that have been seen before. At Hugging Face scale, that means distributed Bloom filters instead of an in-memory Set, but the core idea is identical.' This framing demonstrates that you think about the production implications of simple primitives.

Common mistakes

  • Comparing Set.size to nums.length after building the full Set — this works but loses the early-exit benefit.
  • Using a plain object as the set and forgetting that numeric keys stringify — use a Set for correctness.
  • Mutating the array with sort when the caller might not expect it — mention this trade-off.
  • O(n²) brute-force comparison — always pivot to the hash-set approach.

Follow-up questions

An interviewer at Hugging Face may pivot to one of these next:

  • Contains Duplicate II (LC 219) — duplicate within k indices of each other; use a sliding-window Set.
  • Contains Duplicate III (LC 220) — values within t of each other and indices within k; use a sorted set.
  • How would you scale duplicate detection across a distributed dataset of billions of examples?

Solve it now

Free. No sign-up. Python and JavaScript run instantly in your browser.

Output

Press Run or Cmd+Enter to execute

FAQ

Is new Set(nums).size < nums.length a valid one-liner?

Yes — it's O(n) time and space and is readable. The only downside is it processes all elements even after the first duplicate. Mention this if asked about efficiency.

Why not just sort and compare adjacent elements?

Sorting is O(n log n) vs O(n) for the hash set, and it mutates the input array. The hash set is strictly superior unless memory is a hard constraint.

What if the array is already sorted?

Then the adjacent-comparison approach is O(n) with O(1) space — the best option in that specific case.

Practice these live with InterviewChamp.AI

Drill Contains Duplicate and other Hugging Face interview questions under real-loop conditions with instant feedback on your reasoning, complexity claims, and code.

Practice these live with InterviewChamp.AI →