Similarities & Differences: Left Join vs. Left Outer Join
Categories
Today, we have a clash of titans: LEFT JOIN vs. LEFT OUTER JOIN! Are they different at all? Why answer ‘yes’ or ‘no’ when we can write an article on that?
JOIN is an SQL clause that is crucial when you want to work with more than one table. When would you want to do that? Errrm, always? The database logic is such that stuffing all data into one table is against the rules of database normalization. If you don’t need JOIN to query it, it’s not a database. It’s an Excel sheet.
Now that we’re on the same page with its importance let’s see which JOINs are there in SQL.
Types of JOINs You Must Know
JOINs allow you to use more than one table in one query. Joining tables generally in SQL means ‘merging’ data from two or more tables and using data from multiple tables.
Why do we mention two or more tables? Because it’s possible to join 3 or more tables in SQL. And it doesn’t stop there! The SQL JOINs allow you to join an endless number of tables.
The four main types of SQL JOINs you should know in the middle of the night and suffering from fever are:
- JOIN
- LEFT JOIN
- RIGHT JOIN
- FULL JOIN
JOIN is a type of join that returns only values that are the same in both tables you’re joining.
When talking about LEFT JOIN and RIGHT JOIN, it’s always important to visualize the tables you’re joining.
LEFT JOIN will return all the records from the left table and only the matching rows from the right table.
RIGHT JOIN will do exactly the opposite. It outputs all the records from the right table and the matching rows from the left one.
Don’t worry if you don’t get this left, right, left, right. You’ll see what we mean when we get to the example.
FULL JOIN doesn’t discriminate – it returns all records from all tables. Left, right, it doesn’t matter.
One additional JOIN we should mention is a CROSS JOIN. It returns a Cartesian product, which combines each row from one table with each from the second table. One of the main reasons you should know it is so you can avoid it. Or at least be careful when using it on large datasets because it could cost you dearly.
Some sources mention self-join as a separate JOIN, but it’s not technically a distinct type of JOIN. It is simply a table joined with itself. You can use any of the above join types to self-join a table.
LEFT JOIN: Syntax and Usage
The syntax for every JOIN is the same. The only thing that changes is the JOIN keyword you will be using.
LEFT JOIN Syntax
The syntax for the LEFT JOIN is:
SELECT column_name
FROM table1
LEFT JOIN table2 ON table1.column_name = table2.column_name;
The first line of code selects the columns from the tables you want to join.
You specify the first table in the FROM clause. Then you use the keyword that will call the join type you want. In this example, it’s LEFT JOIN.
But what does ‘left’ mean here? If you visualize tables side by side, table1 will be the left table, and table2 will be the right.
The tables are joined on a matching condition between them. In other words, on the column the two tables have in common. To state this condition, use the keyword ON. After that, specify which column from the first table should be equal to which column from the second table.
LEFT JOIN Usage
This type of JOIN is used when you want to show all data from the left table and only the matching ones from the right. In a way, you’re filtering data from the right table.If there are some rows in the left table that couldn’t be found in the right table, those non-existing values will be shown as NULL.
LEFT OUTER JOIN: Syntax and Usage
Now, let’s see the syntax for this JOIN type.
LEFT OUTER JOIN Syntax
Here’s how the syntax changes.
SELECT column_name
FROM table1
LEFT OUTER JOIN table2 ON table1.column_name = table2.column_name;
There’s not much change here. Almost everything is exactly the same! Except there is LEFT OUTER JOIN instead of LEFT JOIN.
LEFT OUTER JOIN Usage
This type of JOIN is used the same way as LEFT JOIN. It, too, outputs all data from the left table and only the matching rows from the right. When there are no matching rows in the right table, it will show NULL.
LEFT JOIN vs. LEFT OUTER JOIN
So, what is the difference between Left Join and Left Outer Join? The definitions sound the same, so let’s try to see the difference in an example.
SQL LEFT JOIN Example
We’ll use the Airbnb question to show you how LEFT JOIN works. For an explanation of other JOIN types, you should refer to SQL JOIN Interview Questions.
Interview Question Date: May 2020
Display the average number of times a user performed a search which led to a successful booking and the average number of times a user performed a search but did not lead to a booking. The output should have a column named action with values 'does not book' and 'books' as well as a 2nd column named average_searches with the average number of searches per action. Consider that the booking did not happen if the booking date is null. Be aware that search is connected to the booking only if their check-in dates match.
Link to the question: https://platform.stratascratch.com/coding/10124-bookings-vs-non-bookings
Dataset
The question gives you two tables. The first one is airbnb_contacts.
It’s a table of all contacts between the (potential) guest and (potential) host.
The second table is airbnb_searches. It contains data about the users’ searches of accommodation.
Solution
The first step is to write a query that will give us the unique guest ID, check-in date, and booking time. We want only searches where the time of booking in the table airbnb_contacts is not null to get all the searches that resulted in booking.
SELECT DISTINCT id_guest, ds_checkin, ts_booking_at
FROM airbnb_contacts
WHERE ts_booking_at IS NOT NULL;
id_guest | ds_checkin | ts_booking_at |
---|---|---|
fad1a097-a511-4f44-a603-6a271c1f159e | 2014-10-05 | 2014-10-05 11:21:05 |
2889fccc-37ab-4a66-8d64-41b31314c7fc | 2014-10-31 | 2014-10-09 11:16:40 |
bdaf2e68-86dd-40d9-a5a1-9cc95ea25d91 | 2014-10-16 | 2014-10-12 10:11:00 |
7e309181-e61e-426b-baef-dd031d5660d3 | 2014-10-27 | 2014-10-08 12:37:05 |
d418a1ab-b181-40a7-90fe-7216e40dc354 | 2014-10-22 | 2014-08-28 17:33:01 |
Now, this SELECT statement will become a subquery in the LEFT JOIN. This is because the question asks you to show searches that became booking and those that didn’t. Remember, LEFT JOIN shows all data from the left table. In our case, table airbnb_searches is the left one; we want all searches from it.
Then, we join it with the query we wrote above, and it becomes our right table. On what condition do we join them? The first condition is that the user becomes a guest – therefore, id_user needs to equal id_guest.
The other joining condition is that check-in dates are the same in both tables; this is stated in the question.
We also gave aliases to both tables, so we don’t need to write their full names whenever we reference them.
SELECT *
FROM airbnb_searches s
LEFT JOIN
(SELECT DISTINCT id_guest,
ds_checkin,
ts_booking_at
FROM airbnb_contacts
WHERE ts_booking_at IS NOT NULL) c ON s.id_user = c.id_guest
AND s.ds_checkin = c.ds_checkin;
ds | id_user | ds_checkin | ds_checkout | n_searches | n_nights | n_guests_min | n_guests_max | origin_country | filter_price_min | filter_price_max | filter_room_types | filter_neighborhoods | id_guest | ds_checkin | ts_booking_at |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2014-10-11 | 67aece73-e112-4e9e-9e05-8a2a94b003b9 | 5 | 1 | 1 | IT | 0 | 99 | ,Private room | |||||||
2014-10-01 | 6cbb33d1-6ecc-4f74-8b6a-a43d07d484b6 | 2014-10-04 | 2014-10-07 | 11 | 3 | 3 | 3 | ES | 0 | 567 | ,Entire home/apt,Entire home/apt,Private room,Entire home/apt,Private room,Shared room | ||||
2014-10-03 | aa9cf5bf-5667-4212-8018-1cb8beee530e | 2014-11-14 | 2014-11-16 | 17 | 2 | 2 | 2 | GB | 0 | 171 | ,Entire home/apt | aa9cf5bf-5667-4212-8018-1cb8beee530e | 2014-11-14 | 2014-10-03 17:51:58 | |
2014-10-09 | 3e6c2466-74fe-44c0-a6f3-dda79755d30a | 2015-02-26 | 2015-03-02 | 9 | 4 | 1 | 4 | GB | 0 | 240 | ,Entire home/apt | ||||
2014-10-13 | a09bf912-b21d-4859-b194-8512c30695f6 | 2014-10-18 | 2014-10-22 | 7 | 4 | 1 | 2 | GB | ,Entire home/apt |
The next step is to give data some labels. We do that using the CASE WHEN statement. The data will be marked as ‘books’ when the booking time is not null, and the column ds_checkin is the same in both joined tables.
If the condition is not met, the data will be labeled as ‘does not book’. The column is named as action.
SELECT CASE
WHEN c.ts_booking_at IS NOT NULL
AND c.ds_checkin = s.ds_checkin THEN 'books'
ELSE 'does not book'
END AS action
FROM airbnb_searches s
LEFT JOIN
(SELECT DISTINCT id_guest,
ds_checkin,
ts_booking_at
FROM airbnb_contacts
WHERE ts_booking_at IS NOT NULL) c ON s.id_user = c.id_guest
AND s.ds_checkin = c.ds_checkin;
action |
---|
does not book |
does not book |
books |
does not book |
does not book |
We now only need to find the averages for each search category. We do that by using AVG() on the column n_searches and GROUP BY the column action.
Output
The query returns this output.
action | average_searches |
---|---|
books | 23.33 |
does not book | 22.01 |
If we used JOIN instead of LEFT JOIN, you would get the wrong average in the ‘does not book’ line.
action | average_searches |
---|---|
books | 23.33 |
Why? Because it would include only users that become guests. All those who only searched for accommodation would be excluded from the calculation.
And now, the LEFT OUTER JOIN example!
SQL LEFT OUTER JOIN Example
Let’s look at this Microsoft question.
Link to the question: https://platform.stratascratch.com/coding/10081-find-the-number-of-employees-who-received-the-bonus-and-who-didnt
Dataset
The dataset consists of two tables: employee and bonus.
The employee data is given below.
id | first_name | last_name | age | sex | employee_title | department | salary | target | bonus | city | address | manager_id | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
5 | Max | George | 26 | M | Sales | Sales | 1300 | 200 | 150 | Max@company.com | California | 2638 Richards Avenue | 1 |
13 | Katty | Bond | 56 | F | Manager | Management | 150000 | 0 | 300 | Katty@company.com | Arizona | 1 | |
11 | Richerd | Gear | 57 | M | Manager | Management | 250000 | 0 | 300 | Richerd@company.com | Alabama | 1 | |
10 | Jennifer | Dion | 34 | F | Sales | Sales | 1000 | 200 | 150 | Jennifer@company.com | Alabama | 13 | |
19 | George | Joe | 50 | M | Manager | Management | 100000 | 0 | 300 | George@company.com | Florida | 1003 Wyatt Street | 1 |
The data shows the list of all Microsoft employees with all the relevant information.
The bonus table is a list of employees that got the bonus.
worker_ref_id | bonus_amount | bonus_date |
---|---|---|
1 | 5000 | 2020-02-16 |
2 | 3000 | 2011-06-16 |
3 | 4000 | 2020-02-16 |
1 | 4500 | 2020-02-16 |
2 | 3500 | 2011-06-16 |
As you can see, there are not many of them.
Solution
The task is to find all employees who received the bonus, and then those who didn’t. We need both tables to do that.
In other words, this means we need all the employees from the employee table. Then we’ll use the bonus table to determine whether the employee received a bonus or not. This calls for the LEFT OUTER JOIN.
SELECT e.id,
CASE
WHEN bonus_date IS NULL THEN 0
ELSE 1
END AS has_bonus
FROM employee e
LEFT OUTER JOIN bonus b ON e.id = b.worker_ref_id;
The tables are joined on the columns id and worker_ref_id. We also use the CASE WHEN statement to give employees a value of 0 or 1, depending on whether they received a bonus or not.
This is the query’s (partial) output.
Now that we have determined which employees have received bonuses and which don’t, we should count them.
The above query will become a subquery. The main query groups data by the has_bonus column of a subquery and uses the COUNT() function to count occurrences. No point in counting employees several times, so we should use a DISTINCT clause.
SELECT has_bonus,
COUNT(DISTINCT base.id) AS n_employees
FROM
(SELECT e.id,
CASE
WHEN bonus_date IS NULL THEN 0
ELSE 1
END AS has_bonus
FROM employee e
LEFT OUTER JOIN bonus b ON e.id = b.worker_ref_id) AS base
GROUP BY has_bonus;
This is it! Run the query to see the output.
Output
And the output is shown below.
There are 27 employees without the bonus and only three that received the bonus.
Now, the question is would this work with JOIN? If we replaced the LEFT OUTER JOIN with it, here’s what the output would be.
As in the previous question, using JOIN would mean you’d miss the whole category of data: employees that didn’t receive the bonus.
Now, wait a minute! If LEFT JOIN and LEFT OUTER JOIN are used to avoid the same mistake, how are they different?
What’s the difference between SQL LEFT JOIN and LEFT OUTER JOIN, Then?
Let’s try to answer the question by replacing LEFT OUTER JOIN with LEFT JOIN.
This is the same code as above, but with LEFT JOIN.
SELECT has_bonus,
COUNT(DISTINCT base.id) AS n_employees
FROM
(SELECT e.id,
CASE
WHEN bonus_date IS NULL THEN 0
ELSE 1
END AS has_bonus
FROM employee e
LEFT JOIN bonus b ON e.id = b.worker_ref_id) AS base
GROUP BY has_bonus;
Take a look at the output!
The output is exactly the same! So, what is the big answer to the big question? Is LEFT JOIN different from LEFT OUTER JOIN?
The answer is: NO! There is not the slightest difference between LEFT JOIN and LEFT OUTER JOIN. Except that in the first case, the keyword OUTER is omitted.
The thing is, SQL accepts both syntaxes. That’s why it’s allowed to write LEFT JOIN instead of LEFT OUTER JOIN. But both join ‘types’ are outer join. And they both output all data from the left table and only the matching data from the right table.
Some of you may ask if there’s a difference in performance of LEFT JOIN vs LEFT OUTER JOIN. Nope. Still no difference. No matter how you write it, both joins will return data in equal time.
Which Should I Use: LEFT JOIN or LEFT OUTER JOIN?
Result- and performance-wise, it really doesn’t matter.
However, if you’re a beginner SQL user, we recommend using a full name: LEFT OUTER JOIN. That way, you’ll explicitly state that your JOIN is outer join. You won't have to think whenever you return to your code; the word OUTER will be there. Until you master joins, this way, you efficiently practice distinguishing between an inner and outer join and which to use.
If you’re a more experienced user, you’ll probably appreciate the idea of writing less code whenever it’s possible. In this case, use LEFT JOIN. You already know well that it’s an outer join.
Summary
In today’s article “LEFT JOIN vs LEFT OUTER JOIN”, we learned that there’s no difference between LEFT JOIN and LEFT OUTER JOIN. Which one you choose to use virtually boils down to a personal preference.
Whatever you like more, make sure you know fully what you’re doing and why you’re using LEFT (OUTER) JOIN instead of some other join type. Becoming really sure of that requires writing a lot of code. And making a lot of mistakes. The coding questions we have on StrataScratch don’t punish your mistakes but offer a great opportunity to solve an ocean of problems involving LEFT JOIN.
OK, and LEFT OUTER JOIN, too!