SQL COALESCE() Function: A Guide for PostgreSQL Users
Categories
Learn how to handle null values with ease and efficiency using the versatile COALESCE() function in PostgreSQL, with practical examples and insights.
In the world of database management, SQL plays an important role in accessing and manipulating data efficiently. Considering the mountains of data to manipulate, we are often faced with null values which refer to the absence of a value in a particular data field. Handling null values is especially important when our data manipulation techniques involve calculations, comparisons, or data retrievals. SQL Coalesce() function is specifically crafted to address this exact predicament.
What is COALESCE() Function in SQL?
SQL COALESCE() is a versatile function that allows you to evaluate multiple expressions and return the first non-null value. The manner in which it handles null values is efficient and concise. So you must know how to use COALESCE() function in your queries to handle null values better.
You can use the SQL COALESCE() function to replace multiple lines of complex conditional statements as it simplifies all the hassle.
Syntax and Parameters of SQL COALESCE Function
Let us look at the syntax of a SQL COALESCE() function in a general sense.
COALESCE(value1, value2, ..., valuen)
Here, the values viz., value1, value2,... valuen, are the columns or expressions that we wish to evaluate for null values.
Out of these values, the SQL COALESCE() function returns the first non-null value from the list. It is this functionality that is beneficial when dealing with columns that might contain null values.
The reason we called the COALESCE() function ‘versatile’ is that it provides a way for us to replace the null value with an alternate or default value. This neat trick makes our life easier when we go on to perform calculations and manipulations of those columns riddled with null values.
Here is a more practical syntax of the SQL COALESCE() function that highlights its provision for an alternate value.
COALESCE(value1, alternate_value)
As shown above, say value1 or column1 is the one that has the null values, then we can provide an alternate value with which to replace the null values so that we are left with a consistent dataset. If the value is not null, the original value is returned.
Once a non-null value has encountered, the rest of the list of expressions will NOT be evaluated.
Functions Similar to COALESCE in SQL
There are various SQL functions that are similar that are available in other formats of relational database management systems (RDBMS).
You might be familiar with the concept if you have used the NVL() function in Oracle which also accommodates the COALESCE() function. The NVL() function also takes two arguments, the former being the expression to be evaluated for null values and the latter, the replacement value. If the value is null, the replacement value is returned and if the value is not null, the original expression is returned.
NVL(expression, alternate_value)
Likewise, in SQL Server, we can find a similar function called ISNULL(). This function follows the same syntax of taking two arguments: the expression and the replacement value.
ISNULL(expression, alternate_value)
If we consider the world of MySQL, we can find the IFNULL() function that follows the same syntax and returns an alternate value when the prior expression is null.
IFNULL(expression, alternate_value)
To learn more about this function, check out our post “SQL IFNULL() Function”.
In PostgreSQL, we also have functions that perform similarly but operate differently.
For instance, the NULLIF() function takes two arguments as shown below. It returns NULL, if value1 is equal to value2. Otherwise, it returns value1.
NULLIF(value1, value2)
How to Use the SQL COALESCE() Function
By now you know that the COALESCE() function is a crucial aspect of SQL in validating and replacing null values with default values.
The SQL COALESCE() function can also be used with multiple arguments, as shown in the first syntax. It will evaluate each of those columns or expressions in the mentioned order until it reaches a non-null value to be returned.
There are some properties of the SQL COALESCE() function to keep in mind:
- The arguments must be of the same data type.
- The function can contain multiple arguments.
- It always evaluates for an integer first. So, an integer followed by a character expression yields an integer as an output.
Let’s check out an example to understand bullet point 3.
Say, you have written the following query:
SELECT COALESCE(NULL, NULL,1,’SARAH’)
The output of this query is ‘1’.
But if we have it the other way around.
SELECT COALESCE(NULL, NULL, ‘SARAH’, 1)
An error will be thrown reading, ‘Conversion failed when converting the varchar value ‘SARAH’ to data type int’. So the COALESCE() function always evaluates for an integer first.
COALESCE(): A Shortcut for CASE Expressions in SQL
The COALESCE() function is simply a syntactic shortcut for the CASE expression. Let me explain how.
Let’s consider the CASE expression shown below:
SELECT
CASE
WHEN column1 IS NOT NULL THEN column1
WHEN column2 IS NOT NULL THEN column2
WHEN column3 IS NOT NULL THEN column3
ELSE 'Alternative'
END AS result
FROM your_table;
If we wrote this using COALESCE() function, it would do the same thing but more concisely. See below:
SELECT COALESCE(column1, column2, column3, 'Alternative') AS result
FROM your_table;
SQL COALESCE() Function Examples
1. Simple Replacement
Let’s start with a simple example of the replacement of a single column with an alternate value.
Say, we are provided with a table that contains the customer contact information of a store. The columns are ID and phone number. Not all the customers have provided their phone numbers. In this scenario, we can fill the gaps in the database with ‘N/A’.
Here is the input table ‘customer_contact’:
The query would look like this:
SELECT ID, COALESCE(Phone_number, 'N/A') AS Phone_Number
FROM customer_contact;
And the output will be:
In this example, the COALESCE() function replaced the NULL values in the Phone_Numbers column with ‘N/A’ so that there is consistency in the column.
2. Multiple Columns
In case there are multiple columns with null values that you wish to handle, then the SQL COALESCE() function accommodates that as well. Let us check out the following example.
Table: product_details
In this table, both the Manufacturer and Supplier columns contain null values. Fortunately,, COALESCE() function can handle multiple columns as well. The query turns out to be as follows.
SELECT ID, Product_Name,
COALESCE(Manufacturer, Supplier, ‘N/A’) AS Provider
FROM product_details
The COALESCE() function returns the first non-null value from the Manufacturer and Supplier columns. If both values are NULL, then ‘N/A’ will replace the null value. This way, we can consolidate the provider information from multiple columns into one. In the output table below, we can see how it looks in the end.
3. Complex Calculations
Let us now, look at an example that involves a complex calculation and how the SQL COALESCE() function can be used to make such a calculation in the query.
The input table here is ‘order_details’ which has the following data.
The table contains the columns ID, product names, prices and quantities. Some entries of price and quantity information are null. In order for the user to calculate the Order_Total which is the product of price and quantity, these NULL values have to be replaced with some meaning value to avoid undefined entries in the table.
So, let us write the query as below:
SELECT ID, Product, COALESCE(Price * Quantity, 0) AS Order_Total
FROM order_details;
Therefore, the output of this query is:
Using SQL COALESCE for Data Validation
SQL COALESCE() function is utilized to ensure that valid data is retuned when comparing values or a default value is returned. This functionality is widely used in SQL all over the world. Let us look into this example where the goal is to validate the phone numbers of the emergency contacts of employees.
Say we have the following table from the database to consider: EmergencyContact
Here, the columns ‘homephone’, ‘workphone’, and ‘cellphone’ all contain NULL values. Since our requirement is to have non-null values in the phone number fields for employees that do not have an emergency contact, we will replace the ‘NULL’ values with ‘N/A’.
SELECT
firstname+''+lastname fullname,
relationship,
COALESCE(homephone, workphone, cellphone, 'NA') phone
FROM
EmergencyContact
The output of this query will give us the first non-null value out of the three phone number columns such that the employees with at least one contact number will be updated accordingly in the table.
Here is the output:
Using SQL COALESCE for String Concatenation
Now, we are about to concatenate some values from a table and let us check out how we can handle the null values while we perform this String concatenation.
In our example, we will concatenate the first_name, middle_name and the last_name columns from a Personnel table. Since not every name contains a middle name, there might be null values in the middle_name column.
In PostgreSQL or SQL Server, when you concatenate a non-null value with a null value, it only returns ‘null’ because every time it encounters null, it returns the same.
To counter this issue, we can use the COALESCE() function to replace the null values with an empty value as shown below.
Say we have the input table: Personnel
The query to replace null values from the middle_name column is as follows:
SELECT first_name +' '+COALESCE(middle_name,'') +' '+ last_name
AS full_name
FROM Personnel
The output of the query looks like this:
Using SQL COALESCE to Pivot Data
Another great function of the SQL COALESCE() function is to handle the initial null values when pivoting multiple rows into a single row. Let us check out the example below that demonstrates this function.
Let’s consider the ‘sales’ table given below.
Our aim is to pivot the data to get the total quantity sold for each product in every year so that we have separate columns for the North and South regions.
Let’s check out the query for the same.
SELECT product_id,
COALESCE(SUM(CASE WHEN region = 'North' THEN quantity END), 0) AS north_quantity,
COALESCE(SUM(CASE WHEN region = 'South' THEN quantity END), 0) AS south_quantity
FROM sales
GROUP BY product_id;
In this query, the SQL COALESCE() function is coupled with the SUM and CASE statements to pivot data. The CASE statement conditionally sums the quantity based on the region. If the region is ‘North’, then it calculates the sum of quantity for the North region; else it returns null.
And to replace the resulting null values with ‘0’, we are using the COALESCE() function.
The output of this query displays the total quantity sold for each product in each region. The output table is as follows:
Using SQL COALESCE with Computed Columns
Let us explore another use for the COALESCE() function in SQL.
In the following example, we will use the COALESCE() function to compare values from the table below.
Now, we will compute the total salary paid for each of these employees using the COALESCE() function and create a computed column.
More often than not, tables require the computation of values that depend on multiple existing columns. These computed columns rely on one or more other columns for their values. To effectively handle null values in these scenarios, the COALESCE() function is employed to create computed columns.
Here is the query:
SELECT
EMP_NO, EMP_NAME, JOB, MANAGER, JOINING_DATE, HOURLY_WAGE,
SALARY, COMMISSION, NUMSALES,
CAST(COALESCE(HOURLY_WAGE * 40 * 52, SALARY, SALARY+(COMMISSION * NUMSALES)) AS decimal(10,2)) AS TOTAL_SALARY
FROM EMP
The ‘TOTAL_SALARY’ column computed represents the total calculated salary of each employee which depends on the employee’s hourly wage, salary, commission, and the number of sales.
If ‘HOURLY_WAGE’ is a non-null value, then it is multiplied by 40 for 40 hours per week and then again by 52 which is 52 weeks per year. If the value is null, however, the COALESCE() function ensures the salary is not null and adds the commission multiplied by the number of sales.
The output of the table with the computed column is as follows:
Summary
In this article, we discussed the SQL COALESCE() function and its usage in handling null values in various specific scenarios. The versatility and adaptability of this function will be extremely beneficial for all data analysts out there in handling null values in calculation, comparisons as well as data retrievals. We explored a multitude of examples in this context to get a deeper understanding of the function.
To learn more SQL functions and interview questions in varied difficulty levels, be sure to explore our post “SQL interview questions”. Practice makes perfect, and StrataScratch will definitely nudge you toward perfection.