217. Contains Duplicate
easyAsked at Hugging FaceReturn true if any value appears at least twice in an array. Hugging Face uses this as a hash-set baseline — the same deduplication logic that filters repeated tokens, removes duplicate dataset examples, and deduplicates model card identifiers in the Hub registry.
By Alex Chen, Founder, InterviewChamp.AI · Last verified
Source citations
Public interview reports confirming this problem appears in Hugging Face loops.
- Glassdoor (2025-11)— Reported as a quick hash-set warm-up in Hugging Face phone screen feedback.
- Blind (2025-08)— Hugging Face threads list Contains Duplicate as a first-principles hash-set question used for initial screening.
Problem
Given an integer array nums, return true if any value appears at least twice in the array, and return false if every element is distinct.
Constraints
1 <= nums.length <= 10^5−10^9 <= nums[i] <= 10^9
Examples
Example 1
nums = [1,2,3,1]trueExplanation: 1 appears at index 0 and index 3.
Example 2
nums = [1,2,3,4]falseExplanation: All values are distinct.
Example 3
nums = [1,1,1,3,3,4,3,2,4,2]trueExplanation: Multiple values repeat.
Approaches
1. Hash Set (single pass)
Add each number to a Set as you iterate. If you encounter a number already in the Set, return true immediately.
- Time
- O(n)
- Space
- O(n)
function containsDuplicate(nums) {
const seen = new Set();
for (const num of nums) {
if (seen.has(num)) return true;
seen.add(num);
}
return false;
}Tradeoff: O(n) time, O(n) space. Early exit as soon as the first duplicate is found. This is the canonical answer.
2. Sort then compare neighbors
Sort the array. Duplicates will be adjacent — scan for any nums[i] === nums[i+1].
- Time
- O(n log n)
- Space
- O(1) or O(log n) for sort stack
function containsDuplicate(nums) {
nums.sort((a, b) => a - b);
for (let i = 0; i < nums.length - 1; i++) {
if (nums[i] === nums[i + 1]) return true;
}
return false;
}Tradeoff: O(n log n) time, O(1) extra space (modifies input). Acceptable only if the array is allowed to be mutated and space is at a premium. Always discuss the trade-off explicitly.
Hugging Face-specific tips
Hugging Face interviewers appreciate connecting this to real ML pipeline problems: 'Dataset deduplication uses the same hash-set approach — we fingerprint each training example and skip any that have been seen before. At Hugging Face scale, that means distributed Bloom filters instead of an in-memory Set, but the core idea is identical.' This framing demonstrates that you think about the production implications of simple primitives.
Common mistakes
- Comparing Set.size to nums.length after building the full Set — this works but loses the early-exit benefit.
- Using a plain object as the set and forgetting that numeric keys stringify — use a Set for correctness.
- Mutating the array with sort when the caller might not expect it — mention this trade-off.
- O(n²) brute-force comparison — always pivot to the hash-set approach.
Follow-up questions
An interviewer at Hugging Face may pivot to one of these next:
- Contains Duplicate II (LC 219) — duplicate within k indices of each other; use a sliding-window Set.
- Contains Duplicate III (LC 220) — values within t of each other and indices within k; use a sorted set.
- How would you scale duplicate detection across a distributed dataset of billions of examples?
Solve it now
Free. No sign-up. Python and JavaScript run instantly in your browser.
FAQ
Is new Set(nums).size < nums.length a valid one-liner?
Yes — it's O(n) time and space and is readable. The only downside is it processes all elements even after the first duplicate. Mention this if asked about efficiency.
Why not just sort and compare adjacent elements?
Sorting is O(n log n) vs O(n) for the hash set, and it mutates the input array. The hash set is strictly superior unless memory is a hard constraint.
What if the array is already sorted?
Then the adjacent-comparison approach is O(n) with O(1) space — the best option in that specific case.
Practice these live with InterviewChamp.AI
Drill Contains Duplicate and other Hugging Face interview questions under real-loop conditions with instant feedback on your reasoning, complexity claims, and code.
Practice these live with InterviewChamp.AI →