Be interview ready by practicing these top SQL Join interview questions. Answers and approach included.
Joins in SQL are commands that we can use to pull data or combine data from different tables based on a common column. To join two tables (table_1, table_2) with a common key present as “id” using an SQL Join, we can use the SQL Join Query:
SELECT * FROM table_1, table_2 WHERE table_1.id = table_2.id;
SQL joins are important because:
There are 4 different types of SQL Joins:
Inner Join would combine two tables based on the common values that are present on both tables. We can run an inner join on the table_1 and table_2 using the common column “id” by: SELECT * FROM table_1 INNER JOIN table_2 ON table_1.id = table_2.id;
Left Join would combine the records of one table (left table) and any matching records that are present on the other table (right table). Here’s how you would run a Left SQL Join on table_1 as left table and table_2 as right table. “id” as the common key.
SELECT * FROM table_1 LEFT JOIN table_2 ON table_1.id = table_2.id;
Right JOIN would do the opposite of what we expected from the Left SQL Join. Based on the common column, only matching values are merged from the left table. Here’s how you would run a Right SQL Join on table_1 as left table and table_2 as right table - with “id” as the common key.
SELECT * FROM table_1 RIGHT JOIN table_2 ON table_1.id = table_2.id;
Full JOIN or a full outer Join returns all the rows from the tables we join. For example, if we merge table_1 and table_2 even with zero matching id values - it will return all rows from table_1 and table_2. Here’s how we can run a Full SQL Join.
SELECT * FROM table_1 FULL OUTER JOIN table_2 ON table_1.id = table_2.id;
Merge SQL Joins combine records from different tables and sort them based on the key columns. For example, if we were merging table_1 and table_2 based on the column “id”, the output will be sorted by “id” values.
Left Join retains the values in the left table, while only pulling matched values from the right table. Right Join, however, retains the values in the right table, while only pulling values from the left table.
table_1
id | values |
---|---|
1 | 100 |
2 | 312 |
3 | 157 |
4 | 199 |
table_2
id | values |
---|---|
1 | 57 |
2 | 98 |
8 | 357 |
112 | 1206 |
If we Left JOIN table_1 as left table and table_2 as right table with “id” as the key: SELECT * FROM table_1 LEFT JOIN table_2 ON table_1.id = table_2.id;
Our output, in this case, will be:
id | values_L | values_R |
---|---|---|
1 | 57 | 57 |
2 | 98 | 98 |
3 | 357 | |
4 | 1206 |
If we Right Join table_1 as left table and table_2 as right table with “id” as the key: SELECT * FROM table_1 RIGHT JOIN table_2 ON table_1.id = table_2.id;
Our output, in this case, will be:
id | values_L | values_R |
---|---|---|
1 | 57 | 57 |
2 | 98 | 98 |
8 | 357 | |
112 | 1206 |
The difference between Joins and Nested queries are:
Here’s an example of performance difference between and statement usage between SQL Join and Nested queries:
Example - SQL Join SELECT store.p_name FROM store JOIN pricing ON pricing.p_id = store.id JOIN discount ON discount.p_id = pricing.p_id WHERE year = 2019 GROUP BY pricing.price ORDER BY pricing.price
Example - Nested queries SELECT store.p_name FROM store WHERE store.id IN (SELECT pricing.p_id FROM pricing WHERE pricing.p_id IN (SELECT discount.p_id FROM discount WHERE year = 2004)) ORDER BY pricing.price
The SQL Join syntax took ~ 25 seconds to process, while the nested queries took ~ 2.3 seconds to process.
table_1
id | sales |
---|---|
1 | 57 |
2 | 98 |
3 | 58 |
4 | 79 |
table_2
id | pricing |
---|---|
1 | 10 |
2 | 4 |
3 | 15 |
4 | 12 |
table_3
id | discount |
---|---|
1 | 57 |
2 | 98 |
3 | 35 |
4 | 12 |
To join these tables, we can run this SQL Join query:
SELECT t1.id, t1.sales, t2.pricing, t3.discount FROM table_1 as t1 JOIN table_2 as t2 ON t2.id = t1.id JOIN table_3 as t3 ON t3.id = t2.id;
We can use a Self Join to join a table to itself. We can use table_1 to understand how we can use SQL Self Join to get the category of the product with the table.
id | office_location | code | hq |
---|---|---|---|
1 | Houston | 12 | 1 |
2 | Dallas | 14 | 1 |
3 | New Jersey | 20 | 3 |
SELECT office.id, office.office_location, office.code, hq.hq as hqName FROM table_1 office JOIN table_1 hq ON employee.ManagerId = manager.Id
The output of this self SQL Join is:
id | office_location | code | hq |
---|---|---|---|
1 | Houston | 12 | Houston |
2 | Dallas | 14 | Houston |
3 | New Jersey | 20 | New Jersey |
Define a foreign key in your tables that you can use to perform SELECT operations with SQL Joins.
To best structure data to support many-to-many relationships focus on the classic design pattern of separating entities with relations.
Relational databases are designed to support join or query overhead so separating by entities serves as the best starting point.
A Natural SQL Join is join that creates a table by using the common column and data type. Here’s an example of what will happen if we run Natural Join on table_1 and table_2
table_1
month | size |
---|---|
Aug | 1223 |
Sep | 997 |
table_2
month | cost |
---|---|
Aug | 80 |
Sep | 46 |
Natural Join query on table_1 and table_2: SELECT * FROM table_1 NATURAL JOIN table_2;
The output
month | size | cost |
---|---|---|
Aug | 1223 | 80 |
Sep | 997 | 46 |
If you are ever asked this question in an interview, make sure to keep in mind that the interviewer is trying to understand how you would deal with complex SQL queries. The interviewer is trying to understand how you would deal with potential failure points.
In an ideal scenario, a candidate would start by stating that they would test as they write the query. This would help the interviewer understand that the candidate is carefully looking for SQL Join expansion, making sure that the Join works as intended and that there are unaccounted exclusions.
Denormalization improves the issue with expensive joins. We can apply these 3 different types of denormalization:
To review your code quality, you may be asked to improve the readability of an existing SQL Join query.
Here’s an example SQL Join query - our goal is to make it a bit more readable. FROM price s INNER JOIN stock C INNER JOIN location A ON C.a_Id = A.a_Id ON s.p_Id = C.c_Id AND s.s_code<>A.a_code
Readable SQL Join query: FROM price s INNER JOIN (SELECT * FROM stock C INNER JOIN location A ON C.a_Id = A.a_Id ) ca ON s.c_Id = ca.c_Id AND s.s_code<>ca.a_code
Note that the difference between both SQL Joins isn’t too big, but to write maintainable SQL, we recommend that you creatively think about how to improve the readability of your SQL joins.
Right SQL Joins are rarely used. And, even when used they are generally written when there’s an existing SQL Join query written in order to avoid reordering an existing SQL Join query.
To experience professionals, you might be judged on how you select SQL Join operations. Here’s an example from MAANG SQL interviews. The candidate was asked to select one out of these two Joins:
The choice here should be purely based on the modern syntax. The first one is an “explicit Join” and it naturally leads to a lower number of mistakes.
Copyright ©2024 Workstory Inc.