(editing) Data Science Interview Prep - SQL
Interview Questions & Answers
SQL
Questions from:
Medium “Google Data Scientist Interview Questions Step by Step” by John. H
GitHub “Data Science Interview Questions & Answers” by Yossef Hosni
How would you calculate the median in SQL?
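- There are several ways to calculate the median in SQL. One approach combines COUNT with a window ranking function. First, count the total number of values in the column with COUNT, then number each value in sorted order with ROW_NUMBER (using an ORDER BY clause). ROW_NUMBER is safer than RANK here because RANK gives tied values the same rank and can skip the middle position entirely. Finally, select the value at the middle position: position (n + 1) / 2 when the count n is odd, or the average of the values at positions n / 2 and n / 2 + 1 when n is even.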
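A minimal sketch of the count-and-rank approach, run through Python's built-in sqlite3 module so it is self-contained. It uses ROW_NUMBER rather than RANK so that tied values still get distinct positions, and it averages the two middle rows when the count is even. The table name `scores`, the column `value`, and the helper `sql_median` are hypothetical, and the query assumes a SQLite build with window-function support (3.25 or newer).

```python
import sqlite3


def sql_median(values):
    """Compute the median of `values` with a SQL query (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table and column names, used only for illustration.
    conn.execute("CREATE TABLE scores (value REAL)")
    conn.executemany("INSERT INTO scores (value) VALUES (?)", [(v,) for v in values])

    query = """
    WITH ranked AS (
        SELECT value,
               ROW_NUMBER() OVER (ORDER BY value) AS rn,  -- position in sorted order
               COUNT(*) OVER () AS n                      -- total number of rows
        FROM scores
    )
    SELECT AVG(value)
    FROM ranked
    -- Integer division: both expressions pick the same row when n is odd,
    -- and the two middle rows when n is even.
    WHERE rn IN ((n + 1) / 2, (n + 2) / 2)
    """
    median = conn.execute(query).fetchone()[0]
    conn.close()
    return median


even_median = sql_median([3, 1, 4, 1, 5, 9])  # sorted: 1, 1, 3, 4, 5, 9
odd_median = sql_median([5, 1, 3])            # sorted: 1, 3, 5
```

The WHERE clause handles odd and even counts with one expression, so the query does not need a separate branch for each case.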
How would you handle missing values in a dataset using SQL?
- One approach to handling missing values (NULLs) is to use a CASE expression with an IS NULL check. This lets you replace missing values with another value, such as zero or the average of the column's non-missing values. For a simple substitution, COALESCE(column, replacement) is a shorter equivalent.
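As a rough illustration of the CASE approach, the sketch below fills NULLs once with zero and once with the column average, again using Python's sqlite3 module to run the SQL. The `orders` table and its columns are assumptions made up for the example. Note that AVG ignores NULLs, so the subquery returns the mean of the non-missing values only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: one row has a missing (NULL) amount.
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders (id, amount) VALUES (?, ?)",
    [(1, 10.0), (2, None), (3, 30.0)],
)

fill_query = """
SELECT id,
       -- Replace NULL with zero.
       CASE WHEN amount IS NULL THEN 0.0 ELSE amount END AS amount_zero_filled,
       -- Replace NULL with the average of the non-missing values.
       CASE WHEN amount IS NULL
            THEN (SELECT AVG(amount) FROM orders)
            ELSE amount
       END AS amount_mean_filled
FROM orders
ORDER BY id
"""
filled_rows = conn.execute(fill_query).fetchall()
conn.close()
```

Here the query only changes what is returned. Persisting the replacement would need an UPDATE instead.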
What’s the difference between using a UNION and a UNION ALL in SQL?
- The main difference between using a UNION and a UNION ALL in SQL is that UNION automatically removes duplicate records from the results, whereas UNION ALL includes all duplicates.
By default, UNION performs a distinct operation on the results, which can be helpful when you want to ensure that all returned rows are unique. On the other hand, UNION ALL does not perform any duplicate removal, making it faster in cases where you know the datasets do not overlap or when duplicate records are needed in the result set.
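A small sketch makes the difference concrete. It assumes two hypothetical single-column tables, `a` and `b`, that share the value 2, and runs both queries through Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables that overlap on the value 2.
conn.execute("CREATE TABLE a (x INTEGER)")
conn.execute("CREATE TABLE b (x INTEGER)")
conn.executemany("INSERT INTO a (x) VALUES (?)", [(1,), (2,)])
conn.executemany("INSERT INTO b (x) VALUES (?)", [(2,), (3,)])

# UNION removes the duplicate 2.
union_rows = conn.execute(
    "SELECT x FROM a UNION SELECT x FROM b ORDER BY x"
).fetchall()

# UNION ALL keeps every row from both inputs, including the duplicate 2.
union_all_rows = conn.execute(
    "SELECT x FROM a UNION ALL SELECT x FROM b ORDER BY x"
).fetchall()
conn.close()
```

With these inputs, `union_rows` has three rows and `union_all_rows` has four, since the shared value appears twice in the latter.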