30. Contains Duplicate
easyAsked at DatadogDetermine if any value appears at least twice in an array. Datadog uses this as the hashset warmup before count-distinct and HyperLogLog discussions for high-cardinality tag streams.
By Alex Chen, Founder, InterviewChamp.AI · Last verified
Source citations
Public interview reports confirming this problem appears in Datadog loops.
- Glassdoor (2026-Q1)— Datadog phone screen — leads to HyperLogLog follow-ups.
- Blind (2025-10)— Datadog tagged.
Problem
Given an integer array nums, return true if any value appears at least twice in the array, and return false if every element is distinct.
Constraints
1 <= nums.length <= 10^5-10^9 <= nums[i] <= 10^9
Examples
Example 1
nums = [1,2,3,1]trueExample 2
nums = [1,2,3,4]falseExample 3
nums = [1,1,1,3,3,4,3,2,4,2]trueApproaches
1. Sort + scan
Sort; if any two adjacent values are equal, return true.
- Time
- O(n log n)
- Space
- O(1) or O(n) depending on sort
function containsDuplicate(nums) {
nums = nums.slice().sort((a, b) => a - b);
for (let i = 1; i < nums.length; i++) {
if (nums[i] === nums[i - 1]) return true;
}
return false;
}Tradeoff: Sort cost dominates. Fine for small inputs; beaten by hashset for n = 10^5.
2. Hashset early-exit (optimal)
Add to a Set; if Set.add returns false (or if you check before adding), there's a duplicate.
- Time
- O(n) average
- Space
- O(n)
function containsDuplicate(nums) {
const seen = new Set();
for (const x of nums) {
if (seen.has(x)) return true;
seen.add(x);
}
return false;
}Tradeoff: O(n) expected. The early exit is critical for high-throughput stream variants.
Datadog-specific tips
Datadog will follow up with: 'What if you have 10 billion items and can't fit them all in memory?' Discuss probabilistic data structures: Bloom filter for set-membership with false positives; HyperLogLog for cardinality estimation. They expect you to know these by name.
Common mistakes
- Using Array.from(new Set(nums)).length !== nums.length — works but no early exit.
- Comparing nums.length to new Set(nums).size — same issue.
- Forgetting that the Set approach is O(n) AVERAGE, not worst case (hash collisions).
Follow-up questions
An interviewer at Datadog may pivot to one of these next:
- Contains Duplicate II (LC 219) — at most k indices apart.
- Contains Duplicate III (LC 220) — values within a range AND indices within k.
- HyperLogLog — approximate cardinality with O(1) memory per stream.
Solve it now
Free. No sign-up. Python and JavaScript run instantly in your browser.
FAQ
When would you prefer sort over hashset?
When the input is already nearly sorted, or memory is tight enough to forbid O(n) extra space, or you're in a system where hash collisions are adversarial.
What about Bloom filters?
For false-positive-tolerant duplicate detection at huge scale — they answer 'definitely not seen' or 'probably seen' in O(1) memory per element. Datadog uses these for log dedup.
Practice these live with InterviewChamp.AI
Drill Contains Duplicate and other Datadog interview questions under real-loop conditions with instant feedback on your reasoning, complexity claims, and code.
Practice these live with InterviewChamp.AI →