Simplifying SQL Queries with Aliases: A How-To Guide
Categories
SQL aliases are a simple yet efficient way of simplifying your SQL queries. In this guide, we’ll show you all the benefits and provide examples of their use.
Today, we’re going to talk about one of the techniques for simplifying SQL queries. If you’re writing complex SQL code, it can sometimes be hard to read or understand it, even if you wrote it yourself. Using aliases is something SQL professionals use to write code quicker and make it more understandable.
It’s a simple technique that you’re going to absolutely love!
What is SQL Alias?
In SQL, an alias is a temporary name usually given by a user to a column or a table. This name does not change the actual name of a column or a table, as it is used only when a query is executed.
Aliases can also be assigned to subqueries and CASE statements.
SQL Alias Syntax
The keyword for assigning aliases is AS. However, the aliases can be assigned even without the keyword. I’d recommend always using AS; it will make your code more readable and easier to recognize if there’s an alias.
Column Alias Syntax
Here’s how you assign the alias to a column.
SELECT column AS alias_name
FROM table;
Without AS, it looks like this.
SELECT column alias_name
FROM table;
Both codes would work, but the first version is more readable.
Table Alias Syntax
The syntax for the table alias is basically the same, only applied to a table instead of a column.
SELECT column
FROM table AS alias;
or
SELECT column
FROM table alias;
Subquery Alias Syntax
SELECT column
FROM (SELECT column_1,
column_2,
…
FROM table) AS alias;
You can do the same without AS.
CASE Statement Alias Syntax
You assign an alias to a CASE statement in the same manner.
SELECT column,
CASE
WHEN condition THEN result
ELSE default_result
END AS alias;
Again, you can do that without AS.
While these are all examples of using aliases, they have slightly different purposes.
The primary purpose of the column and CASE aliases is to give the output column a simpler and more descriptive name. The point should be that from the column alias, it is easy to discern what data in the column represents.
Regarding table aliases, they are usually used to shorten the table’s name. In complex queries, the tables can be referenced multiple times. Writing a table’s full name many times can be tiring, especially if the table’s name is long. By using aliases, you’ll write less code and, hence, do it quicker. Additionally, the table can be joined with itself, which can’t be done without giving ‘both’ tables different aliases.
Giving a subquery an alias makes it possible to reference its result in the main query. This is mandatory; without it, the code won’t execute.
Benefits of Using SQL Aliases
In a way, I already talked about the benefits of using SQL aliases. But let’s list them explicitly and talk briefly about them.
1. Improved Code Readability
With complex data tasks, SQL code can become very complex. This complexity results in hundreds of code lines written, sometimes. With that, code becomes hard to understand even for experts. One way to help yourself (and others!) understand the code is to use aliases.
Giving understandable and descriptive names to tables and columns adds at least a little bit of natural language element to programming languages. You must say that understanding the purpose of a certain column is much easier when it’s named, say, average_salary_by_department rather than 01_column_amount.
Also, when you give aliases to subqueries, the code is split into logical chunks, making it easier to comprehend what each subquery does.
2. Simplified Code
In complex queries, there are usually many JOIN statements, which include many tables. Some tables can be used in several joins, which means repeating the table names each time you join them. This can be very mundane and time-consuming, especially when the tables’ names are long.
Assigning a simpler alias (sometimes even consisting of only one letter) makes your code less elaborate and easier to write. Imagine that you have to write the table name data_dump_analytics_end_of_month tens of times whenever you reference it in FROM or JOIN.
Give it an alias, dd, and your fingers will thank you. Writing two characters instead of 32 (yes, I counted it!) is much easier and quicker.
3. Improving Coding Productivity
Speaking of quickness, writing shorter columns, tables, subqueries, or CASE statement names will make your query much faster.
Of course, the longer the names, the higher the possibility of making a typo. You run the query, and then it throws an error saying that the table doesn’t exist. Then, you need to review the query, find a mistake, and correct it. Then you rerun it and realize you misspelled the table name in some other code line, go back to it, and fix it. You get the picture of why SQL aliases improve your coding productivity!
4. Avoiding Naming Conflicts
In all the above cases, SQL aliases are not mandatory but help. There are also situations when using aliases is mandatory.
The most common example is when you self-join tables. This means you join a table with itself. This doesn’t work unless you give tables different aliases. One table acts as two different tables, which is achieved by giving the same table two different aliases.
You’ll also be listing several columns from ‘both’ tables. The columns are named the same in both tables, so the database won’t know which table to take a column from unless you use an alias as an ‘acting’ table name and make it clear which table and column you’re referring to.
A similar scenario can happen even if you join two different tables; they can have several columns with the same name. This often occurs with the ID columns, usually simply named id. The database doesn’t know which ID column you’re referring to unless you reference the table before the column name. Again, using a short table alias instead of the table’s long name is much easier.
Practical Examples of SQL Aliases
We’ll now leave the theory behind us and see how using aliases works in actual SQL code. For that, we’ll use SQL interview questions from our platform.
SQL Alias as a Column Name
Here’s a question by Spotify.
Interview Question Date: January 2024
You're tasked with analyzing a Spotify-like dataset that captures user listening habits. For each user, calculate the total listening time and the count of unique songs they've listened to. In the database duration values are displayed in seconds. Round the total listening duration to the nearest whole minute.
The output should contain three columns: 'user_id', 'total_listen_duration', and 'unique_song_count'.
Link to the question: https://platform.stratascratch.com/coding/10367-aggregate-listening-data
The task is to calculate the total listening time and the count of unique songs for every user.
The table used will be listening_habits.
user_id | song_id | listen_duration |
---|---|---|
101 | 5001 | 240 |
101 | 5002 | 0 |
102 | 5001 | 300 |
102 | 5003 | 0 |
101 | 5001 | 240 |
The solution uses ROUND(), SUM(), and COALESCE() to sum the total listening time by the user and round the time to the nearest whole minute.
SELECT user_id,
ROUND(SUM(COALESCE(listen_duration, 0)) / 60.0) ,
COUNT(DISTINCT song_id)
FROM listening_habits
GROUP BY user_id;
If you run the code, you get this result.
user_id | round | count |
---|---|---|
101 | 8 | 2 |
102 | 5 | 2 |
103 | 6 | 1 |
104 | 6 | 2 |
105 | 4 | 1 |
106 | 3 | 1 |
107 | 4 | 1 |
108 | 8 | 2 |
109 | 0 | 1 |
The calculated columns have names round and count. Round of what, count of what? That’s something only you know, the author of the code. Even you will forget what you meant by this in five minutes. Every time you go back to this code, you’ll probably have to analyze it again to understand what it does and what the output means. Everybody else? You send the code output to someone and wait for the questions to start flowing in: What is this? What does this column mean? Could you please translate it?
Avoiding this is simple if you use SQL aliases.
Name one column total_listen_duration, and the other unique_song_count, and everything will be perfectly clear.
SELECT user_id,
ROUND(SUM(COALESCE(listen_duration, 0)) / 60.0) AS total_listen_duration,
COUNT(DISTINCT song_id) AS unique_song_count
FROM listening_habits
GROUP BY user_id;
user_id | total_listen_duration | unique_song_count |
---|---|---|
101 | 8 | 2 |
102 | 5 | 2 |
103 | 6 | 1 |
104 | 6 | 2 |
105 | 4 | 1 |
Pretty neat!
SQL Alias as a Table Name
Let’s move on to giving aliases to the tables. The Amazon question wants you to find movies whose runtime is less than or equal to the flight's duration for flight 101.
Interview Question Date: January 2024
As a data scientist at Amazon Prime Video, you are tasked with enhancing the in-flight entertainment experience for Amazon’s airline partners. Your challenge is to develop a feature that suggests individual movies from Amazon's content database that fit within a given flight's duration. For flight 101, find movies whose runtime is less than or equal to the flight's duration.
The output should list suggested movies for the flight, including 'flight_id', 'movie_id', and 'movie_duration'."
Link to the question: https://platform.stratascratch.com/coding/10360-movie-duration-match
There are two tables, you can use. The first one is entertainment_catalog.
movie_id | title | duration |
---|---|---|
1 | The Great Adventure | 120 |
2 | Space Journey | 90 |
3 | Ocean Mystery | 60 |
4 | The Lost City | 150 |
5 | Mountain Quest | 110 |
The second table is named flight_schedule.
flight_id | flight_duration | flight_date |
---|---|---|
101 | 240 | 2024-01-01 |
102 | 180 | 2024-01-02 |
103 | 240 | 2024-01-03 |
104 | 150 | 2024-01-04 |
105 | 300 | 2024-01-05 |
The code is relatively straightforward: join the tables on the condition that the movie is equal to or shorter than the flight duration, filter data to show only flight 101, and there you go.
SELECT flight_schedule.flight_id,
entertainment_catalog.movie_id,
entertainment_catalog.duration AS movie_duration
FROM flight_schedule
JOIN entertainment_catalog ON entertainment_catalog.duration <= flight_schedule.flight_duration
WHERE flight_schedule.flight_id = 101
ORDER BY entertainment_catalog.duration;
But you need to write the full table names when joining. If you also want to make it more understandable which table each column is coming from, you’ll need to write the full names of the tables in front of each column you use. Writing entertainment_catalog or flight_schedule, let me count, eight times unnecessarily prolongs the writing of this quite simple query.
Avoid that by using aliases! If you give tables aliases fs and ec, you’ll need to write the full names only in FROM and JOIN. After that, you just reference the table by its alias, which consists of two letters, not ten or more.
See also how the code suddenly becomes more readable and seemingly less complex.
SELECT fs.flight_id,
ec.movie_id,
ec.duration AS movie_duration
FROM flight_schedule AS fs
JOIN entertainment_catalog AS ec ON ec.duration <= fs.flight_duration
WHERE fs.flight_id = 101
ORDER BY ec.duration;
flight_id | movie_id | movie_duration |
---|---|---|
101 | 3 | 60 |
101 | 9 | 75 |
101 | 8 | 85 |
101 | 2 | 90 |
101 | 6 | 95 |
SQL Alias as a CASE Statement Name
The CASE statement results in a new column added to the output, so giving it an alias has the same purpose as giving an alias to any regular or calculated column.
I’ll demonstrate this on a question by Block.
Interview Question Date: May 2023
Identify the number of employees within each department that share the same birth month. Your output should list the department, birth month, and the number of employees from that department who were born in that month. If a month has no employees born in it within a specific department, report this month as having 0 employees. The "profession" column stores the department names of each employee.
Link to the question: https://platform.stratascratch.com/coding/10355-employees-with-same-birth-month
We need to identify the number of employees within each department that share the same birth month.
We’ll work with the table named employee_list.
first_name | last_name | profession | employee_id | birthday | birth_month |
---|---|---|---|---|---|
John | Smith | Engineer | 1 | 1985-02-15 | 2 |
Sarah | Johnson | Doctor | 2 | 1970-11-13 | 11 |
Johnson | Miller | Teacher | 3 | 1988-07-08 | 7 |
Emma | Johnson | Doctor | 4 | 1968-08-04 | 8 |
Paul | Johnson | Manager | 5 | 1986-01-14 | 1 |
As per the question requirement, we need to show each month and the number of employees born that month.
In the solution, we use SQL CASE WHEN extensively. First, we extract the month from the column birthday, assign the value of 1 when the condition in CASE is met, and then count the number of occurrences, which equals the number of employees.
SELECT profession AS department,
COUNT(CASE WHEN EXTRACT(MONTH FROM birthday) = 1 THEN 1 END),
COUNT(CASE WHEN EXTRACT(MONTH FROM birthday) = 2 THEN 1 END),
COUNT(CASE WHEN EXTRACT(MONTH FROM birthday) = 3 THEN 1 END),
COUNT(CASE WHEN EXTRACT(MONTH FROM birthday) = 4 THEN 1 END),
COUNT(CASE WHEN EXTRACT(MONTH FROM birthday) = 5 THEN 1 END),
COUNT(CASE WHEN EXTRACT(MONTH FROM birthday) = 6 THEN 1 END),
COUNT(CASE WHEN EXTRACT(MONTH FROM birthday) = 7 THEN 1 END),
COUNT(CASE WHEN EXTRACT(MONTH FROM birthday) = 8 THEN 1 END),
COUNT(CASE WHEN EXTRACT(MONTH FROM birthday) = 9 THEN 1 END),
COUNT(CASE WHEN EXTRACT(MONTH FROM birthday) = 10 THEN 1 END),
COUNT(CASE WHEN EXTRACT(MONTH FROM birthday) = 11 THEN 1 END),
COUNT(CASE WHEN EXTRACT(MONTH FROM birthday) = 12 THEN 1 END)
FROM employee_list
GROUP BY profession
ORDER BY profession;
The above code gives us this.
department | count | count | count | count | count | count | count | count | count | count | count | count |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Accountant | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0 |
Doctor | 3 | 0 | 3 | 1 | 2 | 0 | 3 | 4 | 1 | 1 | 3 | 0 |
Engineer | 0 | 2 | 1 | 2 | 4 | 3 | 0 | 3 | 1 | 2 | 0 | 1 |
Lawyer | 2 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 |
Manager | 2 | 0 | 3 | 0 | 2 | 0 | 0 | 2 | 3 | 1 | 2 | 0 |
Nurse | 0 | 0 | 1 | 0 | 0 | 2 | 0 | 1 | 0 | 1 | 0 | 0 |
Software Dev. | 1 | 1 | 0 | 0 | 2 | 1 | 2 | 2 | 2 | 0 | 1 | 2 |
Teacher | 0 | 2 | 1 | 0 | 1 | 0 | 1 | 1 | 2 | 2 | 1 | 4 |
There are 12 columns named count, and you, of course, have no idea which column refers to which month. You can only assume that the columns are listed chronologically. As Under Siege 2, the influential source of philosophical thinking, teaches us: “Assumption is the mother of all fuck ups.”
You can avoid the said outcome by giving each CASE statement an alias, like this.
SELECT profession AS department,
COUNT(CASE WHEN EXTRACT(MONTH FROM birthday) = 1 THEN 1 END) AS "Month_1",
COUNT(CASE WHEN EXTRACT(MONTH FROM birthday) = 2 THEN 1 END) AS "Month_2",
COUNT(CASE WHEN EXTRACT(MONTH FROM birthday) = 3 THEN 1 END) AS "Month_3",
COUNT(CASE WHEN EXTRACT(MONTH FROM birthday) = 4 THEN 1 END) AS "Month_4",
COUNT(CASE WHEN EXTRACT(MONTH FROM birthday) = 5 THEN 1 END) AS "Month_5",
COUNT(CASE WHEN EXTRACT(MONTH FROM birthday) = 6 THEN 1 END) AS "Month_6",
COUNT(CASE WHEN EXTRACT(MONTH FROM birthday) = 7 THEN 1 END) AS "Month_7",
COUNT(CASE WHEN EXTRACT(MONTH FROM birthday) = 8 THEN 1 END) AS "Month_8",
COUNT(CASE WHEN EXTRACT(MONTH FROM birthday) = 9 THEN 1 END) AS "Month_9",
COUNT(CASE WHEN EXTRACT(MONTH FROM birthday) = 10 THEN 1 END) AS "Month_10",
COUNT(CASE WHEN EXTRACT(MONTH FROM birthday) = 11 THEN 1 END) AS "Month_11",
COUNT(CASE WHEN EXTRACT(MONTH FROM birthday) = 12 THEN 1 END) AS "Month_12"
FROM employee_list
GROUP BY profession
ORDER BY profession;
The output is now self-explanatory.
department | Month_1 | Month_2 | Month_3 | Month_4 | Month_5 | Month_6 | Month_7 | Month_8 | Month_9 | Month_10 | Month_11 | Month_12 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Accountant | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0 |
Doctor | 3 | 0 | 3 | 1 | 2 | 0 | 3 | 4 | 1 | 1 | 3 | 0 |
Engineer | 0 | 2 | 1 | 2 | 4 | 3 | 0 | 3 | 1 | 2 | 0 | 1 |
Lawyer | 2 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 |
Manager | 2 | 0 | 3 | 0 | 2 | 0 | 0 | 2 | 3 | 1 | 2 | 0 |
SQL Alias as a Subquery Name
I’ll show you how to give a subquery an alias in this final example. The question by Forbes asks you to find the three most profitable companies in the world.
Find the 3 most profitable companies in the world. Sort the result based on profits in descending order. If multiple companies have the same profit, assign them the same rank and include all tied companies in the top results. Output the result along with the corresponding company name.
Link to the question: https://platform.stratascratch.com/coding/10354-most-profitable-companies
We’re given the forbes_global_2010_2014 table.
company | sector | industry | continent | country | marketvalue | sales | profits | assets | rank |
---|---|---|---|---|---|---|---|---|---|
ICBC | Financials | Major Banks | Asia | China | 215.6 | 148.7 | 42.7 | 3124.9 | 1 |
China Construction Bank | Financials | Regional Banks | Asia | China | 174.4 | 121.3 | 34.2 | 2449.5 | 4 |
Agricultural Bank of China | Financials | Regional Banks | Asia | China | 141.1 | 136.4 | 27 | 2405.4 | 8 |
JPMorgan Chase | Financials | Major Banks | North America | United States | 229.7 | 105.7 | 17.3 | 2435.3 | 20 |
Berkshire Hathaway | Financials | Investment Services | North America | United States | 309.1 | 178.8 | 19.5 | 493.4 | 17 |
The solution has two subqueries. Giving subqueries an alias is mandatory, which we did. The first subquery is named sq1, the second is sq2.
SELECT company,
profit
FROM
(SELECT *,
RANK() OVER (ORDER BY profit DESC) AS rank
FROM
(SELECT company,
SUM(profits) AS profit
FROM forbes_global_2010_2014
GROUP BY company) AS sq1) AS sq2
WHERE rank <=3;
Of course, the code will return the expected output. However, the names we’ve given to subqueries are really stupid. Can anyone understand what sq1 and sq2 do only by reading the aliases? Not a chance!
You can make your code more readable by assigning descriptive aliases to the subqueries, for instance, sum_of_profits and company_ranking. With these names, one glance at the subquery name is enough to understand what this part of the code does.
SELECT company,
profit
FROM
(SELECT *,
RANK() OVER (ORDER BY profit DESC) AS rank
FROM
(SELECT company,
SUM(profits) AS profit
FROM forbes_global_2010_2014
GROUP BY company) sum_of_profits) company_ranking
WHERE rank <=3;
department | Month_1 | Month_2 | Month_3 | Month_4 | Month_5 | Month_6 | Month_7 | Month_8 | Month_9 | Month_10 | Month_11 | Month_12 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Accountant | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0 |
Doctor | 3 | 0 | 3 | 1 | 2 | 0 | 3 | 4 | 1 | 1 | 3 | 0 |
Engineer | 0 | 2 | 1 | 2 | 4 | 3 | 0 | 3 | 1 | 2 | 0 | 1 |
Lawyer | 2 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 |
Manager | 2 | 0 | 3 | 0 | 2 | 0 | 0 | 2 | 3 | 1 | 2 | 0 |
Tips for Using SQL Aliases Effectively
There are four guidelines I advise you to follow to use SQL aliases' full potential.
1. Use Clear and Descriptive Aliases
Aliases can make your code easier to understand. To achieve that, you must use descriptive and self-explanatory aliases.
When using giving table aliases, the goal is usually just to shorten the table name. So, shortening the table name order_analytics to o or oa is fine.
But when using aliases with column names or subqueries, it’s better to give descriptive names. In the first case, your reports will be easier to understand if the column names are clear. In the second case, you saw how descriptive subqueries’ names make it easier to read the code.
2. Be Consistent With Naming Conventions
Strive to apply the same naming convention across all the queries. It’ll make your code look tidier and more understandable, which helps if the same code is used across the team or multiple teams.
It’s not that important which naming convention you choose. It’s much more important to stick to it once you decide on a naming convention. For example, you can choose that table names will be written in lowercase, and the alias will be the first letter of the table’s each word. So, the table order becomes o, and the table employee_salary becomes es. When it comes to naming subqueries and columns, you again decide to write everything in lowercase but separate each word with an underscore. For example, the column which counts the number of employees becomes number_of_employees.
Whatever you do, stick to the convection you choose! There’s nothing worse than seeing an alias o, then Emp_Sal, and the column named NO_of_EMP. Even writing it hurts my eyes!
3. Avoid Ambiguities
This advice is especially important when joining tables.
When assigning aliases to the tables, it’s important not to create confusion by aliases being similar or the same as the already existing table by two aliases being too similar.
For example, if you’re joining tables customers and customer_contacts, don’t give them aliases c and cc. This will confuse you, I guarantee it! It’s much better to give the cust and cont aliases.
4. Use the AS Keyword
I’ve mentioned this already, but it’s worth repeating: when assigning aliases, use the keyword AS even though it’s possible to do without it. Using AS will make your code clearer and easier to recognize since the alias always follows the keyword AS.
Common Mistakes and How to Avoid Them
There are generally no fatal mistakes when using SQL aliases. However, you can make your code more confusing than it should be.
One example is non-intuitive aliases, which you can avoid by using descriptive aliases.
Another thing you should avoid is using SQL aliases too excessively. If you write a simple query that references only one table and does that only once, and the columns in SELECT do not require referencing the table, there’s really no point in giving the table an alias.
Conclusion
SQL aliases are an easy but effective way of simplifying your queries. They can result in writing your code quicker and making your code lines shorter. This also adds to code readability, which helps you whenever you come back to the same code or anyone who has to use the code you’ve written. Additionally, aliases allow you to divide complex code into logical blocks and give more intuitive names to the code output columns.
While SQL aliases are not complicated, you should practice using them so they become second nature. They are omnipresent in SQL codes written by professionals. You’ll see that in every bit more complex SQL code, many of which are in our coding interview questions section.