4. Remove Duplicates from Sorted Array

Easy · Asked at Databricks

Modify a sorted array in-place to remove duplicates and return the new length. Databricks uses this to test the two-pointer (read head / write head) pattern, which recurs in distributed dedup operators that run over sorted data.

By Sam K., Founder, InterviewChamp.AI · Last verified 2026-05-20

Source citations

Public interview reports confirming this problem appears in Databricks loops.

LeetCode Discuss (2025-10): Databricks SDE-II phone screen warm-up.
Glassdoor (2026-Q1): Followed by 'how does Spark's distinct() work on a sorted partition?'

Problem

Given an integer array nums sorted in non-decreasing order, remove the duplicates in-place such that each unique element appears only once. The relative order of the elements should be kept the same. Return k, the number of unique elements, after placing them in the first k slots of nums. Whatever remains beyond the first k slots does not matter.

Constraints

1 <= nums.length <= 3 * 10^4
-100 <= nums[i] <= 100
nums is sorted in non-decreasing order.

Examples

Example 1

Input

nums = [1,1,2]

Output

2, nums = [1,2,_]

Example 2

Input

nums = [0,0,1,1,1,2,2,3,3,4]

Output

5, nums = [0,1,2,3,4,_,_,_,_,_]
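The underscores in the outputs mark slots whose contents are ignored: only k and the first k entries are checked. A minimal sketch of such a check follows. The helper name checkResult is hypothetical and is not part of the problem statement.

```javascript
// Hypothetical helper: verifies a removeDuplicates-style result.
// `nums` is the array after the call, `k` is the returned length,
// `expected` holds the unique values in sorted order.
function checkResult(nums, k, expected) {
  if (k !== expected.length) return false;
  // Only the first k slots are compared; anything after them is ignored.
  for (let i = 0; i < k; i++) {
    if (nums[i] !== expected[i]) return false;
  }
  return true;
}

// Example 1: the third slot may hold any value.
console.log(checkResult([1, 2, 2], 2, [1, 2])); // true
console.log(checkResult([1, 2, 99], 2, [1, 2])); // true
console.log(checkResult([1, 1, 2], 2, [1, 2])); // false
```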

Approaches

1. Set + rebuild

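Collect the values into a Set, then write the unique values back into the front of nums. A JavaScript Set iterates in insertion order, so the sorted order is preserved.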

Time: O(n)
Space: O(n)

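function removeDuplicates(nums) {
  // A Set keeps insertion order, so the unique values stay sorted.
  const s = [...new Set(nums)];
  // Copy the unique values back into the first s.length slots.
  for (let i = 0; i < s.length; i++) nums[i] = s[i];
  return s.length;
}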

Tradeoff: Works, but it allocates O(n) extra memory and ignores the sorted-input invariant. Databricks wants the in-place version.

2. Read-pointer / write-pointer (two pointers)

Slow pointer marks the next write slot; fast pointer scans. Write only when fast sees a new value.

Time: O(n)
Space: O(1)

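function removeDuplicates(nums) {
  if (nums.length === 0) return 0;
  // slow is the index of the last unique value written so far.
  let slow = 0;
  for (let fast = 1; fast < nums.length; fast++) {
    // A value different from the last kept one is new, because the input is sorted.
    if (nums[fast] !== nums[slow]) {
      slow++;
      nums[slow] = nums[fast];
    }
  }
  // slow is an index, so the count of unique values is slow + 1.
  return slow + 1;
}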

Tradeoff: O(1) extra space. The sorted invariant means duplicates are adjacent — that's why one comparison is enough.
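To see where the two pointers diverge, the same loop can be instrumented to record each write. This is only an illustrative sketch: traceRemoveDuplicates and the shape of the records it returns are made up for this walkthrough.

```javascript
// Same two-pointer loop as approach 2, instrumented to record every write.
// `traceRemoveDuplicates` is an illustrative name, not part of the solution.
function traceRemoveDuplicates(nums) {
  const writes = [];
  if (nums.length === 0) return { k: 0, writes };
  let slow = 0;
  for (let fast = 1; fast < nums.length; fast++) {
    if (nums[fast] !== nums[slow]) {
      slow++;
      nums[slow] = nums[fast];
      writes.push({ from: fast, to: slow, value: nums[fast] });
    }
  }
  return { k: slow + 1, writes };
}

const example = [0, 0, 1, 1, 1, 2, 2, 3, 3, 4];
const result = traceRemoveDuplicates(example);
console.log(result.k); // 5
console.log(example.slice(0, result.k)); // [ 0, 1, 2, 3, 4 ]
for (const w of result.writes) {
  console.log(`nums[${w.to}] = nums[${w.from}] (${w.value})`);
}
```

On Example 2 the loop performs four writes (into slots 1 through 4), and the gap between the read and write positions grows each time a duplicate is skipped.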

Databricks-specific tips

Databricks grades the in-place version because the read/write-head pattern is exactly how their sort-distinct operator runs on a sorted partition without an extra allocation. Be ready to discuss how this scales: on a Spark DataFrame, after sortWithinPartitions, the same two-pointer dedup can run inside each partition with no additional shuffle, provided all rows with the same key already sit in the same partition. Mentioning that bonus shows you understand operator pipelining.
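The per-partition idea can be modelled in plain JavaScript as a streaming pass that remembers only the last emitted value. This is a sketch under stated assumptions, not Spark code and not a description of Spark's actual operator: the name distinctSorted and the sample partitions are hypothetical, and equal keys are assumed to be co-located in one partition already.

```javascript
// Illustrative only: a plain-JavaScript model of a streaming distinct over
// sorted input. This is not Spark code and not Spark's actual operator.
function* distinctSorted(rows) {
  let hasPrev = false;
  let prev;
  for (const row of rows) {
    // The remembered value plays the role of nums[slow];
    // the iterator position plays the role of fast.
    if (!hasPrev || row !== prev) {
      yield row;
      prev = row;
      hasPrev = true;
    }
  }
}

// Hypothetical partitions, each already sorted, with equal keys co-located.
const partitions = [
  [1, 1, 2, 3, 3],
  [7, 8, 8, 8, 9],
];
const deduped = partitions.map((p) => [...distinctSorted(p)]);
console.log(deduped); // [ [ 1, 2, 3 ], [ 7, 8, 9 ] ]
```

Because each partition is processed independently and only one previous value is held, the pass needs no buffer proportional to the partition size.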

Common mistakes

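Comparing nums[fast] to nums[fast-1] instead of nums[slow]: this happens to work because the input is sorted, but it conflates the read and write heads, and the habit breaks in variants such as LC 80, where earlier slots have already been overwritten.
Returning slow instead of slow + 1: an off-by-one, since slow is the index of the last unique element, not the count.
Allocating a new array: the problem explicitly requires in-place modification.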
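The first two items can be checked directly. The sketch below runs both comparison styles on Example 2 and then shows what the off-by-one loses. The function names dedupAgainstSlow and dedupAgainstPrev are made up for this comparison.

```javascript
// Reference version: compares against the write head.
function dedupAgainstSlow(nums) {
  if (nums.length === 0) return 0;
  let slow = 0;
  for (let fast = 1; fast < nums.length; fast++) {
    if (nums[fast] !== nums[slow]) nums[++slow] = nums[fast];
  }
  return slow + 1;
}

// Variant from the first item: compares against the previous read position.
function dedupAgainstPrev(nums) {
  if (nums.length === 0) return 0;
  let slow = 0;
  for (let fast = 1; fast < nums.length; fast++) {
    if (nums[fast] !== nums[fast - 1]) nums[++slow] = nums[fast];
  }
  return slow + 1;
}

const a = [0, 0, 1, 1, 1, 2, 2, 3, 3, 4];
const b = [...a];
const ka = dedupAgainstSlow(a);
const kb = dedupAgainstPrev(b);
console.log(ka, kb); // 5 5
console.log(a.slice(0, ka), b.slice(0, kb));

// Off-by-one from the second item: slow is the last written index, so
// using it as the length (ka - 1) drops the final unique value.
console.log(a.slice(0, ka - 1)); // [ 0, 1, 2, 3 ]
```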

Follow-up questions

An interviewer at Databricks may pivot to one of these next:

Allow at most 2 duplicates (LC 80).
Unsorted input — what's the best you can do in-place?
Distributed: dedup a Spark DataFrame after sorting within partitions.
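For the first follow-up, one common generalization keeps a value only if it differs from the element sitting limit slots behind the write head. The sketch below assumes limit is an integer of at least 1. The name removeDuplicatesAtMost is illustrative, and here slow means the next write slot rather than the last written index.

```javascript
// Sketch: keep at most `limit` copies of each value in a sorted array.
// Assumes `limit` is an integer >= 1; limit = 2 corresponds to LC 80.
function removeDuplicatesAtMost(nums, limit) {
  let slow = 0; // next write slot (also the count of kept elements)
  for (let fast = 0; fast < nums.length; fast++) {
    // The first `limit` elements are always kept. After that, compare with
    // the kept element `limit` slots back: if it is equal, `limit` copies
    // of this value are already in the output.
    if (slow < limit || nums[fast] !== nums[slow - limit]) {
      nums[slow] = nums[fast];
      slow++;
    }
  }
  return slow;
}

const sample = [1, 1, 1, 2, 2, 3];
const kept = removeDuplicatesAtMost(sample, 2);
console.log(kept); // 5
console.log(sample.slice(0, kept)); // [ 1, 1, 2, 2, 3 ]
```

The comparison has to be against nums[slow - limit], not nums[fast - limit], because earlier slots may already have been overwritten by the time the read head reaches them.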

FAQ

Why two pointers and not one?

The slow pointer marks where to WRITE; the fast pointer is where to READ. They diverge whenever you skip a duplicate, which is the whole point of in-place compaction.

Does this work on unsorted input?

No. The algorithm assumes duplicates are adjacent. Unsorted input needs a hash set or a sort first.
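A short demonstration of the failure and of the sort-first fix, using the two-pointer function from approach 2. The sample input is made up. Note that sorting changes the original element order, and that JavaScript's default sort compares values as strings, so a numeric comparator is passed.

```javascript
function removeDuplicates(nums) {
  if (nums.length === 0) return 0;
  let slow = 0;
  for (let fast = 1; fast < nums.length; fast++) {
    if (nums[fast] !== nums[slow]) {
      slow++;
      nums[slow] = nums[fast];
    }
  }
  return slow + 1;
}

// Unsorted input: no two equal values are adjacent, so nothing is removed.
const unsorted = [3, 1, 3, 2, 1];
const kWrong = removeDuplicates(unsorted);
console.log(kWrong, unsorted.slice(0, kWrong)); // 5 [ 3, 1, 3, 2, 1 ]

// Sorting in place first restores the adjacency the algorithm relies on.
const sortedFirst = [3, 1, 3, 2, 1];
sortedFirst.sort((x, y) => x - y);
const kRight = removeDuplicates(sortedFirst);
console.log(kRight, sortedFirst.slice(0, kRight)); // 3 [ 1, 2, 3 ]
```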
