Skip to main content

1. Two Sum

easyAsked at Databricks

Given an array of integers, return indices of the two numbers that add up to a target. Databricks uses this as a warm-up to see if you naturally reach for a hash map and to gauge whether you can articulate the brute-force-to-optimal tradeoff in distributed terms.

By Alex Chen, Founder, InterviewChamp.AI · Last verified

Source citations

Public interview reports confirming this problem appears in Databricks loops.

  • Glassdoor (2025-11)Databricks phone screen warm-up; expect the follow-up about scaling to a Spark DataFrame.
  • LeetCode Discuss (2026-Q1)Listed as the canonical Databricks SDE-I phone-screen opener.

Problem

Given an array of integers nums and an integer target, return indices of the two numbers such that they add up to target. You may assume that each input would have exactly one solution, and you may not use the same element twice. You can return the answer in any order.

Constraints

  • 2 <= nums.length <= 10^4
  • -10^9 <= nums[i] <= 10^9
  • Only one valid answer exists.

Examples

Example 1

Input
nums = [2,7,11,15], target = 9
Output
[0,1]

Explanation: nums[0] + nums[1] == 9.

Example 2

Input
nums = [3,2,4], target = 6
Output
[1,2]

Approaches

1. Brute force double loop

Check every pair (i, j) with i < j.

Time
O(n^2)
Space
O(1)
function twoSum(nums, target) {
  for (let i = 0; i < nums.length; i++) {
    for (let j = i + 1; j < nums.length; j++) {
      if (nums[i] + nums[j] === target) return [i, j];
    }
  }
}

Tradeoff: Quadratic. Never mention this as your final answer at Databricks — they want the hash-map insight first.

2. One-pass hash map

Walk once; for each x, check if target - x is already in the map. Otherwise store x's index.

Time
O(n)
Space
O(n)
function twoSum(nums, target) {
  const seen = new Map();
  for (let i = 0; i < nums.length; i++) {
    const need = target - nums[i];
    if (seen.has(need)) return [seen.get(need), i];
    seen.set(nums[i], i);
  }
}

Tradeoff: Linear time, single pass. The key insight is reframing a pair search as a 'have I seen the complement' lookup.

Databricks-specific tips

Databricks interviewers will pivot from this single-machine problem to 'now imagine nums is a Spark DataFrame with 10^11 rows.' Be ready to discuss why the naive self-join on (df, target - df.value) is shuffle-heavy and how a broadcast hash join or a map-side aggregation could replace it. Articulating the lazy-evaluation model (transformations stack until an action triggers a shuffle) is the bonus signal they want.

Common mistakes

  • Returning values instead of indices — read the prompt twice; LC and Databricks both want indices.
  • Using a two-pass approach (build map, then scan) when one pass is cleaner and avoids double-counting i == j.
  • Forgetting that nums[i] + nums[j] could overflow in some languages — at Databricks JVM/Scala roles this matters.

Follow-up questions

An interviewer at Databricks may pivot to one of these next:

  • What if the array is sorted? (Two pointers, O(1) space.)
  • What if you must return ALL pairs that sum to target? (Hash map of value -> list of indices.)
  • Scale to a Spark DataFrame with 100B rows — broadcast small side or self-join with salting?

Solve it now

Free. No sign-up. Python and JavaScript run instantly in your browser.

Output

Press Run or Cmd+Enter to execute

FAQ

Why a hash map and not sorting + two pointers?

Sorting is O(n log n) and you lose the original indices. The hash-map version is O(n) and preserves indices directly.

What if there are duplicates?

The one-pass map handles it naturally: when you reach the second copy of a value, the first copy's index is already stored, so you find the pair.

Practice these live with InterviewChamp.AI

Drill Two Sum and other Databricks interview questions under real-loop conditions with instant feedback on your reasoning, complexity claims, and code.

Practice these live with InterviewChamp.AI →