Skip to main content

SQL Interview Questions for 2026: 45+ Questions on Joins, Indexes, Window Functions, Transactions, and the Live Coding Round

SQL interview questions in 2026 fall into three buckets: concept recall (joins, normalization, ACID), query writing (the live coding round where you type SQL while someone watches), and optimization reasoning (why is this query slow, how would you index it). The new-grad gap is that you can read SQL fine but freeze when asked to write a window function or explain an execution plan out loud. This guide gives 45+ questions across all three, with answer outlines you can say in your own voice.

By Sam K., Founder, InterviewChamp.AI · Last updated

24 min read

What are the most common SQL interview questions in 2026?

The most common SQL interview questions in 2026 are: write a query to find the second-highest salary, get the top N rows per group, calculate a running total, find duplicate rows, explain the difference between INNER JOIN and LEFT JOIN, and explain why a slow query is slow and where you would add an index. Almost every SQL round mixes a few concept questions with at least one live query you type while someone watches. The watching part is what trips up new grads who can read SQL but have never written it under pressure.

That is the whole game. You are not graded on whether you can recognize correct SQL. You are graded on whether you can produce working SQL against an unfamiliar schema, out loud, in five minutes, while explaining your reasoning. This guide sorts the surface area into the buckets interviewers actually test, gives you the question banks for each, and shows you the recurring coding patterns that cover most of what gets asked.

What SQL interviews test in 2026

SQL interviews in 2026 test three distinct skills, and most candidates over-prepare the first one. Concept recall is whether you can explain join types, normalization, indexing, and ACID without reading off a slide. Query writing is whether you can sit at a blank editor and produce a correct query against a schema you have never seen. Optimization reasoning is whether you can look at a slow query or an execution plan and say what is wrong and how to fix it.

The distribution most new-grad candidates report from their SQL rounds in the 2025-2026 hiring cycle:

  • 30-40% concept recall (joins, normalization, ACID, WHERE vs HAVING)
  • 40-50% live query writing (the coding round on a shared screen or a take-home)
  • 15-25% optimization reasoning (slow query diagnosis, index choices, reading a plan)

The split shifts hard by role. A data analyst loop is mostly query writing with light optimization. A data engineer or backend engineer loop pushes optimization to 30% or more, because those roles own the queries that run in production at scale. A general software engineer loop often treats SQL as one section inside a broader technical round, testing the fundamentals plus one or two queries.

The 2026 hiring environment rewards the candidate who can write SQL live. Take-home assessments and proctored remote screens are still common, but the live shared-screen round has become the default filter at most mid-market and large employers, because it is the cheapest way to tell whether a resume's "proficient in SQL" line is real. A candidate who can recite the difference between a clustered and non-clustered index but freezes when asked to write a top-3-per-category query reads as someone who studied for the interview without doing the reps.

Key terms

The vocabulary below is the floor. If any of these is fuzzy, your concept-recall answers will wobble and your query-writing will be slow because you will be reasoning about mechanics instead of the problem.

Join
An operation that combines rows from two or more tables based on a related column, producing a result set that draws columns from each table.
Index
A separate data structure (usually a B-tree) that lets the database find matching rows without scanning the whole table, trading extra storage and slower writes for faster reads.
Normalization
Organizing tables to remove redundant data and prevent update anomalies, expressed as a series of normal forms (1NF, 2NF, 3NF, and beyond).
Transaction
A unit of work containing one or more statements that either all commit together or all roll back together, providing the ACID guarantees.
Window function
A function that computes a value across a set of related rows defined by an OVER clause without collapsing those rows into one, unlike an aggregate with GROUP BY.
Execution plan
The step-by-step strategy the query planner chooses to run a query, showing whether it uses an index seek or a full table scan, the join algorithm, and the estimated row counts.

Three more that show up constantly in answers. A primary key is the column (or set of columns) that uniquely identifies each row and cannot be NULL. A foreign key is a column that references the primary key of another table, enforcing referential integrity so you cannot insert an order for a customer that does not exist. Cardinality is the number of distinct values in a column, and it drives index decisions: a high-cardinality column like an email address is a good index candidate, while a low-cardinality column like a boolean flag usually is not.

SQL join interview questions (8 Q)

Joins are the most-tested concept because they are the foundation of every non-trivial query. The questions below cover the join types and the traps interviewers set.

Q1. What are the different types of joins?

INNER JOIN returns only rows with a match in both tables. LEFT JOIN returns all rows from the left table plus matches from the right, with NULLs where there is no match. RIGHT JOIN is the mirror image. FULL OUTER JOIN returns all rows from both tables, matched where possible and NULL-filled where not. CROSS JOIN returns the Cartesian product, every row of A paired with every row of B. The interview signal is knowing that a LEFT JOIN is the right tool when you want to keep rows that have no match, like customers who placed zero orders.

Q2. What is the difference between INNER JOIN and LEFT JOIN?

INNER JOIN keeps only matched rows. LEFT JOIN keeps every left-table row regardless of match. The trap interviewers love: filtering a LEFT JOIN on a right-table column in the WHERE clause. If you write the filter in WHERE, the NULL rows from unmatched left records fail the condition and disappear, silently converting your LEFT JOIN into an INNER JOIN. To keep unmatched rows, move the right-table condition into the ON clause:

-- Wrong: this drops customers with no 2026 orders
SELECT c.name, o.total
FROM customers c
LEFT JOIN orders o ON o.customer_id = c.id
WHERE o.order_year = 2026;

-- Right: keeps every customer, NULL total if no 2026 order
SELECT c.name, o.total
FROM customers c
LEFT JOIN orders o
  ON o.customer_id = c.id AND o.order_year = 2026;

Q3. What is a self-join and when do you use one?

A self-join joins a table to itself, usually with table aliases. You use it to compare rows within the same table: finding employees who earn more than their manager (join employees to employees on manager_id), or comparing this month's row to last month's. The interview-relevant point is that there is nothing special about the mechanics; the same table just appears twice with two aliases.

Q4. What is a cross join and when is it dangerous?

A CROSS JOIN produces the Cartesian product: every row of the first table paired with every row of the second. A 1,000-row table crossed with a 1,000-row table yields 1,000,000 rows. The danger is the accidental cross join, which happens when you forget the join condition. Two tables of a few hundred thousand rows each with a missing ON clause can produce billions of rows and lock up the database. Interviewers sometimes show a query missing its join predicate to see if you spot it.

Q5. What is the difference between a join and a subquery?

A join combines columns from multiple tables into one result set. A subquery is a query nested inside another, used in a WHERE, FROM, or SELECT clause, often to compute a value or a filter set. Many problems can be solved either way. The interview-relevant nuance: correlated subqueries (which reference the outer query and run once per outer row) can be slow on large tables, and rewriting them as a join often speeds things up.

Q6. How do you find rows in table A that have no match in table B?

This is the anti-join, and it appears constantly. The clean way is a LEFT JOIN with a NULL check, or NOT EXISTS:

-- Customers who have never placed an order
SELECT c.id, c.name
FROM customers c
LEFT JOIN orders o ON o.customer_id = c.id
WHERE o.id IS NULL;

NOT IN also works but breaks subtly if the subquery returns any NULLs, so NOT EXISTS or the LEFT JOIN NULL check is the safer answer to give.

Q7. What does a FULL OUTER JOIN return and when would you use it?

It returns every row from both tables, matched where a join condition holds and NULL-filled on the side that has no match. The realistic use case is reconciliation: comparing two datasets that should match, like a list of expected payments versus received payments, where you want to see both the unmatched expected rows and the unmatched received rows in one result.

Q8. What is the difference between UNION and UNION ALL?

Both stack two result sets vertically (the columns must match in count and type). UNION removes duplicate rows, which forces a sort or hash step and costs time. UNION ALL keeps all rows including duplicates and is faster. The interview signal is reaching for UNION ALL by default and only using UNION when you genuinely need duplicates removed, because the deduplication is not free.

SQL indexing interview questions (6 Q)

Indexing is where query-writing meets optimization. Backend and data-engineer loops push hard here. Analyst loops touch it lightly.

Q9. What is an index and how does it work?

An index is a separate data structure, usually a B-tree, that stores the indexed column's values in sorted order with pointers back to the table rows. Instead of scanning every row to find a match, the database walks the B-tree, which is a logarithmic-time lookup rather than a linear scan. The cost is extra storage and slower writes, because every insert, update, or delete must also maintain the index.

Q10. When does an index help and when does it hurt?

An index helps reads that filter or join on the indexed column, especially equality and range lookups on large tables. It hurts in three cases: on write-heavy tables where the maintenance cost outweighs the read benefit, on low-cardinality columns like a boolean where the index barely narrows the search, and on small tables where a full scan is already cheap. The senior answer names the trade-off explicitly: indexes speed reads and slow writes, so you add them deliberately, not by default.

Q11. What is the difference between a clustered and a non-clustered index?

A clustered index determines the physical order of the rows in the table, so there can be only one per table (the primary key is usually clustered). A non-clustered index is a separate structure pointing back to the rows, and you can have many. The interview-relevant consequence: a lookup on the clustered index is fast because the data is right there in order, while a non-clustered index lookup may need an extra step to fetch the full row.

Q12. What is a composite index and does column order matter?

A composite index covers multiple columns. Order matters a great deal. An index on (last_name, first_name) speeds queries that filter on last_name alone or on both, but not queries that filter on first_name alone, because the index is sorted by last_name first. This is the leftmost-prefix rule, and it is a favorite follow-up: interviewers ask which queries a given composite index can serve.

Q13. Why might a query with a matching index still do a full table scan?

The most common reason is wrapping a function around the indexed column, which blocks the index. WHERE YEAR(created_at) = 2026 cannot use an index on created_at because the database must compute YEAR() on every row. Rewrite it as a range:

-- Blocks the index on created_at
WHERE YEAR(created_at) = 2026;

-- Uses the index: a sargable range predicate
WHERE created_at >= '2026-01-01'
  AND created_at <  '2027-01-01';

Other causes: the planner estimates the query returns most of the table (a scan is then cheaper than many index lookups), or statistics are stale.

Q14. What is a covering index?

A covering index includes every column the query needs, so the database can answer the query from the index alone without going back to the table. If a query selects customer_id and total and filters on customer_id, an index on (customer_id, total) covers it. The benefit is skipping the row lookups entirely. The cost is a wider index that takes more space and more write maintenance.

SQL normalization interview questions (5 Q)

Normalization questions reveal whether you understand schema design, not just query syntax. They are common in data-platform and backend rounds.

Q15. What is normalization?

Normalization is organizing tables to reduce redundancy and prevent anomalies when you insert, update, or delete data. The goal is that each fact lives in exactly one place, so updating it does not require touching multiple rows. It is described in normal forms, each stricter than the last.

Q16. Explain 1NF, 2NF, and 3NF.

First normal form: every column holds a single atomic value, no repeating groups or comma-separated lists in one cell. Second normal form: the table is in 1NF and every non-key column depends on the whole primary key, not just part of a composite key. Third normal form: the table is in 2NF and no non-key column depends on another non-key column (no transitive dependencies). A clean way to say it in an interview: "every non-key column depends on the key, the whole key, and nothing but the key."

Q17. What is denormalization and why would you do it on purpose?

Denormalization is deliberately adding redundancy back into a schema to speed reads. Read-heavy analytics and reporting systems denormalize so a dashboard query does not have to join six tables every time it loads. The trade-off is that you now maintain the redundant data consistently, usually through application logic or scheduled jobs. The mature answer is that normalization and denormalization are tools for different access patterns, not a right-and-wrong choice.

Q18. What problems does normalization prevent?

Three classic anomalies. An update anomaly: a fact stored in many rows means an update must touch all of them, and missing one leaves the data inconsistent. An insertion anomaly: you cannot record a new fact because you lack other unrelated data the row requires. A deletion anomaly: deleting one row accidentally erases a fact you wanted to keep. Naming these three by name is the signal that you understand why normalization exists rather than treating it as a rule to memorize.

Q19. When is a fully normalized schema the wrong choice?

When the workload is read-heavy and join cost dominates, as in analytics warehouses and reporting layers. Star and snowflake schemas in data warehousing deliberately denormalize dimensions to keep queries fast. The right answer in a backend interview is rarely "always go to the highest normal form"; it is "normalize the transactional system, then denormalize selectively for the read paths that need it."

SQL aggregation interview questions: GROUP BY and HAVING (5 Q)

Aggregation questions test whether you understand the order SQL evaluates a query, which is where the WHERE versus HAVING confusion lives.

Q20. What is the difference between WHERE and HAVING?

WHERE filters individual rows before grouping. HAVING filters groups after a GROUP BY has aggregated them. You cannot use an aggregate like COUNT() or SUM() in WHERE because the aggregation has not happened yet at that stage. "Find customers with more than five orders" needs the count in HAVING:

SELECT customer_id, COUNT(*) AS order_count
FROM orders
GROUP BY customer_id
HAVING COUNT(*) > 5;

Q21. In what order does SQL logically evaluate a query?

The logical order is FROM, then WHERE, then GROUP BY, then HAVING, then SELECT, then ORDER BY, then LIMIT. This order explains two facts interviewers probe: why you cannot reference a SELECT alias in the WHERE clause (WHERE runs before SELECT), and why aggregates belong in HAVING not WHERE (HAVING runs after grouping). Knowing this sequence answers a whole family of "why does this error" questions.

Q22. What does COUNT(*) count versus COUNT(column)?

COUNT(*) counts all rows in the group, including rows where columns are NULL. COUNT(column) counts only rows where that specific column is not NULL. This is a frequent trap: COUNT(email) on a table where some emails are NULL returns a smaller number than COUNT(*), and confusing the two produces wrong analytics.

Q23. How do NULLs behave in aggregate functions?

Aggregates skip NULLs. SUM, AVG, MIN, and MAX ignore NULL values entirely, and AVG divides by the count of non-NULL rows, not the total row count. This matters: the average of [10, 20, NULL] is 15, not 10, because the NULL is excluded from both the sum and the divisor. Stating this explicitly signals you understand NULL semantics, which is one of the most common sources of subtle SQL bugs.

Q24. Can you filter on an aggregate without a GROUP BY?

Yes. HAVING works on a query with no GROUP BY by treating the entire result as a single group. SELECT COUNT(*) FROM orders HAVING COUNT(*) > 100 returns the count only if there are more than 100 orders. It is uncommon in practice but a fair test of whether you understand that HAVING operates on groups, and a query with no GROUP BY is one implicit group.

SQL subqueries vs CTEs interview questions (4 Q)

This pairing is a 2026 favorite because CTEs and window functions together are how modern SQL is written.

Q25. What is a CTE and how is it different from a subquery?

A CTE (common table expression) is a named temporary result set defined with a WITH clause at the top of a query, referenced by name in the main query. A subquery is an unnamed query nested inline. Functionally they overlap heavily. The differences that matter in an interview: a CTE can be referenced multiple times in the same query, reads top-to-bottom which is easier to follow for complex logic, and supports recursion, which subqueries do not.

Q26. When should you prefer a CTE?

Prefer a CTE when the same derived set is used more than once, when nesting would otherwise go three or more levels deep, or when you need recursion (walking a hierarchy like an org chart or a category tree). Prefer a plain inline subquery for a single simple filter where naming it adds ceremony without clarity. In most modern engines the planner optimizes both similarly, so the choice is about readability far more often than raw speed.

Q27. What is a recursive CTE and what is it for?

A recursive CTE references itself to walk hierarchical or graph-shaped data. The classic use is an org chart: start with the CEO, then repeatedly join to find direct reports until you reach the leaves. The structure is an anchor member (the starting rows) UNION ALL a recursive member (the rows that join back to the CTE). Interviewers ask it to separate candidates who have only written flat queries from those who have handled tree-shaped data.

Q28. What is a correlated subquery and why can it be slow?

A correlated subquery references a column from the outer query, so it must run once for every row the outer query produces. On a large outer table that can mean millions of executions. The fix is often to rewrite it as a join or as a window function, which the planner can execute in a single pass. Recognizing the correlation and knowing the rewrite is the optimization signal interviewers look for.

SQL window function interview questions (6 Q)

This is the single highest-leverage section. Top-N-per-group and running-total questions appear in most live coding rounds, and both need window functions.

Q29. What is a window function and how is it different from GROUP BY?

A window function computes a value across a set of related rows but does not collapse them. GROUP BY reduces each group to one row; a window function keeps every row and adds the computed column alongside. You define the window with OVER, optionally PARTITION BY to reset per group and ORDER BY to define order within the window. That is the core distinction interviewers want: same aggregation power, but row-level output preserved.

Q30. What is the difference between ROW_NUMBER, RANK, and DENSE_RANK?

All three assign a number to rows within an ordered window. ROW_NUMBER gives a unique sequential number with no ties. RANK gives tied rows the same number and then skips, so two rows tied at rank 1 are followed by rank 3. DENSE_RANK gives tied rows the same number and does not skip, so the tie at 1 is followed by 2. The interview-relevant choice: use ROW_NUMBER when you need exactly one row per group (like the single latest order), and RANK or DENSE_RANK when ties should share a position.

Q31. How do you get the top N rows per group?

This is the most common live SQL coding question. Use a window function to number rows within each partition, then filter:

-- Top 3 best-selling products per category
SELECT category, product, revenue
FROM (
  SELECT category, product, revenue,
         ROW_NUMBER() OVER (
           PARTITION BY category
           ORDER BY revenue DESC
         ) AS rn
  FROM product_sales
) ranked
WHERE rn <= 3;

You cannot filter on the window function in the same SELECT (it is computed after WHERE), so you wrap it in a subquery or CTE and filter on the alias. Knowing that wrapping requirement is the detail that separates people who have written this from people who have only read about it.

Q32. How do you calculate a running total?

Use SUM over a window ordered by the running dimension:

SELECT order_date, amount,
       SUM(amount) OVER (
         ORDER BY order_date
         ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) AS running_total
FROM orders;

The ROWS BETWEEN frame is the precise part. Without an explicit frame, the default frame in many engines includes peer rows with the same ORDER BY value, which can produce a result you did not intend. Stating the frame explicitly is the senior move.

Q33. What do LAG and LEAD do?

LAG returns a value from a previous row in the window; LEAD returns a value from a following row. They are how you compare a row to its neighbor without a self-join: month-over-month growth, the gap between consecutive logins, the change from the prior reading. LAG(amount, 1) gives the previous row's amount; subtract it from the current row to get the delta. Reaching for LAG/LEAD instead of a self-join signals fluency with modern SQL.

Q34. What is the difference between PARTITION BY and GROUP BY?

GROUP BY collapses each group into one output row. PARTITION BY, inside a window function's OVER clause, defines the group the window computes over but keeps every row. So AVG(salary) OVER (PARTITION BY department) shows every employee alongside their department's average salary in the same row, whereas GROUP BY department would return one row per department. Same grouping idea, different output shape.

SQL transaction and ACID interview questions (4 Q)

Transaction questions test whether you understand correctness under concurrency, which matters most for backend and platform roles.

Q35. What is ACID?

Four guarantees. Atomicity: all statements in a transaction commit together or roll back together. Consistency: a transaction takes the database from one valid state to another, respecting all constraints. Isolation: concurrent transactions do not interfere with each other, governed by the isolation level. Durability: once committed, the change survives a crash or power loss. The canonical example is a money transfer: atomicity guarantees you never debit one account without crediting the other.

Q36. What are the SQL isolation levels?

Four standard levels, from weakest to strongest: Read Uncommitted (can read uncommitted changes, allowing dirty reads), Read Committed (only reads committed data), Repeatable Read (the same query returns the same rows within a transaction), and Serializable (transactions behave as if run one at a time). Higher isolation prevents more anomalies but reduces concurrency. Most production systems default to Read Committed and step up only for code paths that need stronger guarantees.

Q37. What are dirty reads, non-repeatable reads, and phantom reads?

A dirty read is reading data another transaction wrote but has not committed (and might roll back). A non-repeatable read is re-reading a row within the same transaction and getting a different value because another transaction committed a change in between. A phantom read is re-running a range query and getting new rows that another transaction inserted. Each isolation level prevents a progressively larger set of these: Read Committed stops dirty reads, Repeatable Read stops non-repeatable reads, Serializable stops phantoms too.

Q38. What is a deadlock and how do you prevent it?

A deadlock happens when two transactions each hold a lock the other needs, so neither can proceed. The database detects it and kills one transaction so the other completes. Prevention strategies: acquire locks in a consistent order across all code paths, keep transactions short so locks are held briefly, and use the lowest isolation level that is still correct for the use case. Knowing that the database resolves deadlocks by aborting a victim transaction (which your application must be ready to retry) is the practical detail interviewers want.

The live SQL coding round

The live coding round is where most SQL interviews are decided, and it is the format new grads under-rehearse. You are given a schema, often two to four tables, and asked to write a query on a shared screen or in a collaborative editor while the interviewer watches and listens. Reading correct SQL is easy. Producing it under observation, narrating as you go, is a different muscle.

What the round actually grades, in order:

  1. Do you ask clarifying questions first? Before writing, confirm the schema: which table holds the foreign key, can the left side have zero matches, should NULLs be included, is there a tie-breaking rule for "the latest" row. Two or three good clarifying questions buy thinking time and signal that you know requirements shape the query.
  2. Do you narrate your approach before typing? Say the plan out loud: "I'll join orders to customers, group by customer, then filter with HAVING." Interviewers grade reasoning, and a silent candidate who types a correct query still scores lower than one who explains it.
  3. Can you write it correctly and incrementally? Strong candidates build up: get the join working, run it, add the grouping, run it, add the filter. They do not write 30 lines and hope. Incremental building is how you catch your own mistakes live.
  4. Can you reason about correctness and edge cases? After it runs, say what could break: "this assumes every order has a valid customer_id; if there are orphaned orders, I'd switch to a LEFT JOIN." That unprompted edge-case thinking is the senior signal even at the entry level.

The recurring prompts are a small set. Drill these six until you can write each one cold, both with a window function and with a plain subquery, because some interviewers ban window functions to make you prove you understand the mechanics:

PromptFirst-choice approachFallback if window functions are banned
Second-highest salaryDENSE_RANK() then filter = 2Correlated subquery or LIMIT 1 OFFSET 1
Top N rows per groupROW_NUMBER() partitioned, filter <= NCorrelated subquery counting greater-than rows
Running totalSUM() OVER (ORDER BY ...)Self-join summing all earlier rows
Find duplicate rowsCOUNT(*) OVER (PARTITION BY key) then filter > 1GROUP BY key HAVING COUNT(*) > 1
Rows in A with no match in BLEFT JOIN ... WHERE b.id IS NULLNOT EXISTS subquery
Compare row to prior rowLAG() over an ordered windowSelf-join on a sequence or date offset

A worked example of the second-highest-salary prompt, the version that handles ties correctly:

-- Second-highest distinct salary (ties share a rank)
WITH ranked AS (
  SELECT salary,
         DENSE_RANK() OVER (ORDER BY salary DESC) AS rnk
  FROM employees
)
SELECT DISTINCT salary
FROM ranked
WHERE rnk = 2;

Using DENSE_RANK rather than ROW_NUMBER is deliberate: if two people share the top salary, DENSE_RANK correctly treats the next distinct salary as second, while ROW_NUMBER would wrongly return the second copy of the top salary. Saying that out loud is exactly the kind of reasoning that wins the round.

This is the part you cannot fake by reading. You have to rehearse writing SQL out loud against a schema you have not seen. If you want reps without scheduling a friend for every session, you can walk into your SQL round able to say the answer out loud after drilling live prompts in a mock where a schema appears and you react to a model query in your own words.

How to prepare for SQL interview questions

A focused two-week plan, calibrated for a new grad who can read SQL but has not written it live under observation. Compress it if you are further along.

  1. Rebuild the fundamentals against a real schema (days 1-3). Load a sample customers-orders-products database. Write every join type by hand. Write GROUP BY with HAVING. Internalize the WHERE versus HAVING distinction and the logical evaluation order until it is automatic. Narrate why each query returns what it returns. The goal is production, not recognition: you should type a working join with no reference open.
  2. Master window functions (days 4-6). This is the highest-leverage topic. Drill ROW_NUMBER, RANK, DENSE_RANK with PARTITION BY and ORDER BY. Write a running total with an explicit ROWS BETWEEN frame. Write a LAG/LEAD neighbor comparison. Then write top-3-per-category cold. Clear that bar and you have cleared the highest bar most rounds set.
  3. Drill the six recurring coding patterns (days 7-9). Second-highest value, find duplicates, top N per group, running total, anti-join, self-join comparison. Write each twice, once with a window function and once with a subquery. Time yourself: under five minutes per query, schema-reading included.
  4. Learn the optimization story (days 10-12). Understand what an index is, when it helps and when it hurts. Read one execution plan and learn to spot a full table scan versus an index seek. Memorize the three most common slow-query causes: a missing index on a join or filter column, a function wrapped around an indexed column, and an accidental cross join. This is what separates backend and data-engineer candidates from analysts.
  5. Rehearse the live coding round out loud (days 13-14). Take an unfamiliar schema, set a timer, and write queries while narrating, exactly as on a shared screen. Lead with clarifying questions, then build incrementally, then call out edge cases. Do at least eight prompts. The first run is awkward; by the eighth, narrating-while-typing is natural, which is the actual skill the round grades.
  6. Run two timed mock SQL rounds (end of week 2). Forty-five minutes each. First mock: concept recall plus three query prompts. Second mock: one extended schema with a top-N-per-group question and an optimization discussion. Narrate everything. A live interview assistant you can start for a $3 trial can pose a fresh schema on demand and give you a quiet model answer to react to, so a full rep costs the price of a coffee instead of a friend's evening.

The non-negotiable step is step five. Interviewers can tell within the first minute whether you have written SQL live before or only studied it. There is no shortcut for that calibration except doing the reps.

Common mistakes

The mistakes that sink the most new-grad SQL rounds in the 2025-2026 hiring cycle, each paired with the fix:

  • Filtering a LEFT JOIN in WHERE and silently dropping the unmatched rows. The fix: put right-table conditions in the ON clause when you need to preserve NULL rows, and double-check by asking yourself whether a row with no match should survive.
  • Putting an aggregate in WHERE instead of HAVING. The fix: remember the logical order. Aggregation happens after WHERE, so any condition on COUNT, SUM, or AVG belongs in HAVING. Stating the evaluation order out loud prevents this live.
  • Trying to filter on a window function in the same SELECT. The fix: window functions are computed after WHERE, so wrap the query in a CTE or subquery and filter on the aliased column. This is the number-one bug in live top-N-per-group attempts.
  • Reciting index definitions but never naming the trade-off. The fix: every indexing answer should end with "indexes speed reads and slow writes," then name when you would and would not add one. Definitions without trade-offs read as memorization.
  • Writing the whole query silently, then hoping it runs. The fix: narrate the plan first, build incrementally, run after each piece. A correct silent query scores lower than a correct narrated one, and an incremental build catches your own mistakes before the interviewer does.

Related guides


About the author: Sam K. is the founder of InterviewChamp.AI, building AI interview prep for the new-grad CS market and writing about the modern interview gauntlet from the inside.

Related guides

Interview Process

DevOps Interview Questions for 2026: 45+ Questions Across CI/CD, IaC, Observability, Kubernetes, Deployments, SRE, and Incident Response

DevOps interview questions in 2026 cluster into six areas: CI/CD pipelines, infrastructure-as-code, monitoring and observability, containers and orchestration, deployment strategies (blue-green and canary), and the SRE plus incident-response thinking that ties them together. The new-grad gap is reasoning about production trade-offs you've never owned on call. This guide gives 45+ questions across those areas, the culture questions interviewers slip in, and the honest framing that works when you've never carried a pager.

Sam K. ·

Read more →
Interview Process

Docker Interview Questions for 2026: 40+ Q's Across Images vs Containers, the Layer Cache, Dockerfiles, Volumes, Networking, Compose, and the Orchestration Intro New Grads Get Wrong

Docker interview questions in 2026 split between definition recall (what's an image vs a container, what's a layer) and the build-and-debug scenarios that separate the candidate who ran `docker run` once from the one who actually shipped a containerized app. Expect questions on the layer cache, Dockerfile authoring, volumes, networking, docker-compose, multi-stage builds, containers vs VMs, and a closing intro to orchestration. This guide gives 40+ questions with answer outlines and a three-week prep plan that doesn't ask you to fake production experience.

Sam K. ·

Read more →
Interview Process

JavaScript Interview Questions for 2026: 45+ Questions on Closures, the Event Loop, Promises, this-Binding, and the Live-Coding Tasks Interviewers Actually Ask

JavaScript interview questions in 2026 cluster into three buckets: language internals (closures, the event loop, prototypal inheritance, this-binding, hoisting), async reasoning (promises, async/await, microtasks), and live coding (debounce, throttle, deep clone, array transforms). The new-grad trap is reciting MDN definitions you can't apply when the follow-up asks why your code logs that value. This guide gives 45+ questions with answer outlines, the live-coding tasks you'll actually face, and how to sound like you've shipped JavaScript instead of memorized it.

Sam K. ·

Read more →

Frequently asked questions

What SQL interview questions should I expect in 2026?
Expect three buckets. Concept recall (joins, normalization, ACID, the difference between WHERE and HAVING) makes up roughly 30-40% and screens out people who only memorized syntax. The live SQL coding round (write a query against a given schema while someone watches) is 40-50% and is where most rounds are decided. Optimization reasoning (why is this query slow, where would you add an index, read this execution plan) is 15-25% and rises sharply for data-engineer and backend roles. Entry-level analyst loops lean toward query writing; backend and data-platform loops lean toward optimization.
What are the most common SQL coding interview questions?
The recurring patterns are: find the second-highest or Nth-highest value, find duplicate rows, get the top N rows per group, calculate a running total or moving average, find rows in table A with no match in table B, and write a self-join to compare a row to a prior row. Most live SQL coding questions are a variation on one of these six. If you can write each of them cold using a window function and using a plain subquery, you cover the majority of what gets asked in the 2025-2026 hiring cycle.
What is the difference between an INNER JOIN and a LEFT JOIN?
An INNER JOIN returns only rows that have a match in both tables. A LEFT JOIN returns every row from the left table plus the matching rows from the right table, filling the right-side columns with NULL where there is no match. The interview-relevant trap is filtering a LEFT JOIN in the WHERE clause: putting a condition on the right table in WHERE silently turns the LEFT JOIN back into an INNER JOIN because NULL rows fail the condition. Put right-table conditions in the ON clause instead when you want to keep unmatched left rows.
What is database normalization and why do interviewers ask about it?
Normalization is organizing tables to reduce redundancy and prevent update anomalies, usually described in normal forms. First normal form removes repeating groups, second removes partial dependencies on a composite key, and third removes transitive dependencies (non-key columns depending on other non-key columns). Interviewers ask because it reveals whether you understand why a schema is shaped the way it is. The honest senior answer also covers denormalization: read-heavy analytics systems deliberately denormalize to avoid expensive joins, so the right answer is rarely 'always normalize to the highest form.'
What are window functions and when do you use them?
A window function performs a calculation across a set of rows related to the current row without collapsing them into one row, unlike GROUP BY which collapses groups into a single row each. You use the OVER clause to define the window, optionally with PARTITION BY to reset per group and ORDER BY to define ordering. Common uses: ranking rows (ROW_NUMBER, RANK, DENSE_RANK), running totals (SUM OVER an ordered window), and comparing a row to a neighbor (LAG, LEAD). They are the single highest-leverage SQL topic to drill because top-N-per-group and running-total questions appear in most live coding rounds.
What is ACID in databases?
ACID is four guarantees a transaction provides. Atomicity means all statements in the transaction succeed or none do. Consistency means the transaction moves the database from one valid state to another, respecting constraints. Isolation means concurrent transactions do not interfere, governed by isolation levels. Durability means once committed, the change survives a crash. Interviewers ask ACID to test whether you understand why you wrap a multi-step money transfer in a transaction: without atomicity, a crash mid-transfer could debit one account without crediting the other.
What is the difference between WHERE and HAVING in SQL?
WHERE filters individual rows before any grouping happens. HAVING filters groups after a GROUP BY has aggregated them. You cannot use an aggregate like COUNT or SUM in a WHERE clause because the aggregation has not happened yet, but you can in HAVING. A common interview question is 'find customers with more than five orders' which requires GROUP BY customer then HAVING COUNT bigger than five. Putting that count in WHERE is the classic mistake that signals shaky fundamentals.
When should I use a CTE instead of a subquery?
A CTE (common table expression, the WITH clause) and a subquery often produce the same result, so the choice is usually about readability and reuse. Use a CTE when the same derived result is referenced more than once, when you are nesting logic three or more levels deep, or when you need a recursive query, which only CTEs support. Use an inline subquery for a single, simple, one-off filter. Modern query planners optimize both similarly in most engines, so the decision is rarely about raw speed for typical workloads.
How do I prepare for a SQL interview as a new grad?
Spend two weeks. Week one: rebuild the fundamentals (join types, GROUP BY versus HAVING, normalization, ACID) and write each by hand against a sample schema. Week two: drill the six recurring coding patterns and the optimization story (indexes, execution plans, common slow-query causes). The non-negotiable step is writing SQL out loud against a real schema while narrating your logic, because the live coding round tests whether you can produce a query under observation, not whether you can recognize a correct one in a multiple-choice quiz.
Do SQL interviews allow you to look things up?
It depends on the format. Take-home SQL assessments and many remote screens let you use documentation and a real database to test your query. Live coding rounds, especially on a shared screen, usually expect you to write a working query without looking up exact syntax, though most interviewers will give you a function name if you describe what you want. The safest preparation is to know the core syntax cold (joins, GROUP BY, window functions, CTEs) so you are not dependent on lookups when the format does not allow them.