Introduction to IFNULL() Function in SQL

SQL IFNULL() Function


Learn all about the SQL IFNULL() function in MySQL and PostgreSQL with the example of a hard-level Interview Question asked by Google.

We are often faced with NULL values in datasets and calculations in SQL. In most cases, we will need to handle such NULL values with some operation or by feeding alternate values. In this article, we will introduce you to the SQL IFNULL function that will help you do just that.

What is SQL IFNULL()?

The first thing you need to know is that this IFNULL function is used in MySQL. Don’t worry! We will also discuss an equivalent function in PostgreSQL later in the article.

This SQL IFNULL function is critical in handling NULL values. It is considered to be a MySQL Flow Control function.

In any Flow Control function, there is an expression to be checked. The flow of data can be controlled by determining what the output should look like. In general, it can take the form of a boolean i,e, true or false, a particular column, or even a static value.

The SQL IFNULL() function can be used in the data manipulation statements such as SELECT, WHERE, GROUP BY, or ORDER BY. In this article, we will demonstrate the use of the function with the SELECT statement.

Syntax of SQL IFNULL() Function

Syntax of SQL IFNULL Function

Let’s look into the syntax of the SQL IFNULL function and try to understand how it works.

IFNULL(expression, alternate_value)

The SQL IFNULL() function takes two arguments. The first one is the expression, which checks if the value of the expression is NULL. If so, the function returns the second argument i.e. the alternate_value.

On the other hand, if the value of the expression is NOT NULL, then the same value is returned.

Let’s see this in action using simple examples first.

SELECT IFNULL(10, 0);

In this example, the first expression is NOT NULL, therefore, the output is 10 itself.

SELECT IFNULL(NULL, 5);

Here, the first expression is NULL, which means we need to substitute it with the alternate value mentioned in the second expression. The output, therefore, is 5.

The second expression doesn’t necessarily have to be an integer. It can be a string as well. Look at the example below:

SELECT IFNULL(NULL, ‘test’);

The output of this statement is ‘test’. The application of this functionality can be diverse depending on the situation. The main goal is to replace the NULL values in the dataset with an alternate value so that any operation done on those values later on, do not end in an error.

PostgreSQL Statement Equivalent to MySQL IFNULL() Function

The IFNULL() function does not exist in PostgreSQL. Knowing that PostgreSQL is also often used by data scientists, it is good to know the equivalent function that fits the language you’re working on.

PostgreSQL offers the COALESCE() function that retrieves the first NON-NULL value. In other words, COALESCE() function evaluates the arguments from left to right, till it finds the first NON-NULL value and returns it. The remaining arguments are not evaluated.

The syntax of the COALESCE() function is as follows:

COALESCE(arg1, arg2, …);

For instance, the below statement will return 5 as it is the first NON-NULL argument.

SELECT COALESCE(NULL, 5, 10);

The functionality is primarily the same as the MySQL IFNULL() function. It evaluates if the first expression is NULL, just like the IFNULL() function in MySQL. If NULL, the second expression is evaluated, and so on, until a NON-NULL value is found.

Using SQL IFNULL() to Solve the Advanced Interview Question by Google

Using SQL IFNULL to Solve the Advanced Interview Question by Google

In this example, we apply the SQL IFNULL() function to a hard-level interview question asked by Google. Let’s take a look at the question.


Tables: user_flags, flag_review

Link to the question: https://platform.stratascratch.com/coding/2105-videos-removed-on-latest-date

Watch the video tutorial for this question on our YouTube channel.

The problem asks you to find the latest date when the flags got reviewed for each unique user in the dataset. And then we are asked to find the total number of distinct videos that were removed on that date.

The output is expected to include the first and last names of each unique user, the latest date when the flags got reviewed, as well as the number of videos removed. However, only the users who have had at least one video reviewed by YouTube can be included in the output. If no videos were removed on a certain date, the output must be 0.

1. Exploring the dataset

Google has provided us with two datasets, namely, user_flags, and flag_review. Let us look into these datasets, one at a time.

Here is a preview of the user_flags table.

Table: user_flags
user_firstnameuser_lastnamevideo_idflag_id
RichardHassony6120QOlsfU0cazx3
MarkMayCt6BUPvE2sM1cn76u
GinaKormandQw4w9WgXcQ1i43zk
MarkMayCt6BUPvE2sM1n0vef
MarkMayjNQXAC9IVRw1sv6ib

This table contains the first and last names of the users, the ID of the video, and the flag ID corresponding to that video.

This table contains NULL values in each of its columns, which means that these might be a cause for concern when we begin to operate on them.

Now, let us look at the schema of the flag_review table.

Here’s a preview of the table:

Table: flag_review
flag_idreviewed_by_ytreviewed_datereviewed_outcome
0cazx3FALSE
1cn76uTRUE2022-03-15REMOVED
1i43zkTRUE2022-03-15REMOVED
1n0vefTRUE2022-03-15REMOVED
1sv6ibTRUE2022-03-15APPROVED

This table contains the flag_id column as well. It also includes ‘reviewed_by_yt’ column which seems to contain binary values. The value is 1 if the video was reviewed by YouTube and 0 if it wasn’t. There is also a ‘reviewed_date’ column and a ‘reviewed_outcome’ column that shows whether the video was in fact, removed or approved as a result of the review by YouTube.

2. Writing out the approach

It is a good practice to write down your approach before solving any coding question. This allows you to think of any possible scenarios that might need to be covered. So, we will begin solving the question by first laying down the steps.

Step 1: Merge the two tables on flag_id

To get all of the information in one table, we can merge the two tables on the common column, flag_id. This way, we can scrutinize whether the videos flagged for review were reviewed by YouTube or not, and also, find out the outcome and the date of the review.

There is another condition we need to fulfill to continue working on the merged dataset. That is to only include the users who have had at least one of their flagged videos reviewed by YouTube i.e. the reviewed_by_yt should have a value of 1 for at least one of the videos pertaining to each unique user.

Step 2: Calculate the total number of distinct videos removed on the latest reviewed date

In the next step, we will need to count the number of distinct videos from the user_flags table. Since there are NULL values in the user_flags table, we will ignore the video_ids that contain a NULL value. This is one place where we can use the SQL IFNULL() function. We will count only the video_ids that do NOT have NULL values.

And since we only need the videos that were removed as a result of the YouTube review, we will add a condition in the WHERE clause. We will perform this while we turn our query from the last step into a CTE (Common Table Expression).

Step 3: Assign 0 to the videos that were NOT removed after the YouTube review

Since the question asks us to evaluate each unique user, we will also have videos that were approved with NULL values displayed in the count column. We are expected to handle this in a way that the output of these users is displayed as 0. So, again we will employ the IFNULL() functionality in our problem. Again, to begin another subquery we will turn the subquery of step 2 into another CTE.

In this subquery, we are tying all our operations together. We will select the first and the last names of each unique user as well as the latest reviewed date from the first CTE, and the count of the videos of each user, if NOT NULL and 0, if they are NULL.

3. Coding the solution

With our approach laid out, let us begin coding the solution.

Step 1: Merge the two tables on flag_id

SELECT user_firstname,
       user_lastname,
       MAX(reviewed_date) AS latest_date
FROM user_flags u
JOIN flag_review r USING (flag_id)
WHERE reviewed_by_yt = 1
GROUP BY user_firstname,
         user_lastname

Here, we have used the MAX() function to find the latest of the ‘reviewed_date’ column. And we’ve grouped the merged table by the first and last names of each of the users who’ve had at least one of the videos reviewed by YouTube.

Let us run the query and find the output:

SQL IFNULL Example Output

Step 2: Calculate the total number of distinct videos removed on the latest reviewed date

WITH a AS
  (SELECT user_firstname,
          user_lastname,
          MAX(reviewed_date) AS latest_date
   FROM user_flags u
   JOIN flag_review r USING (flag_id)
   WHERE reviewed_by_yt = 1
   GROUP BY user_firstname,
            user_lastname)

SELECT reviewed_date,
       IFNULL(COUNT(DISTINCT video_id), 0) AS num_video
FROM flag_review r
JOIN user_flags f USING (flag_id)
WHERE reviewed_outcome = 'REMOVED'
GROUP BY reviewed_date

So, here we are placing the subquery into a CTE named ‘a’.

And then, performing an operation in the IFNULL() function’s first expression. The COUNT(DISTINCT video_id) evaluates if the count of the number of distinct video_ids is NULL or NOT NULL. If it is NOT NULL, then the value of the COUNT() function is returned. If it is NULL, then we display an alternate value, which in our case is 0.

This function, of course, only applies to dataset of filtered videos where the ‘review_outcome’ is ‘REMOVED’. This step is crucial in solving the problem as this tackles the main part of the problem, i.e. to display the date and number of removed videos.

Let us run the query to see the output:

SQL IFNULL Example Output 2

Step 3: Assign 0 to the videos that were NOT removed after the YouTube review

Let us now turn the second subquery into a CTE as well, named ‘b’.

Now we will select the desired columns from the two CTEs and also perform only final check before we validate the solution.

WITH a AS
  (SELECT user_firstname,
          user_lastname,
          MAX(reviewed_date) AS latest_date
   FROM user_flags u
   JOIN flag_review r USING (flag_id)
   WHERE reviewed_by_yt = 1
   GROUP BY user_firstname,
            user_lastname),

     b AS
  (SELECT reviewed_date,
          IFNULL(COUNT(DISTINCT video_id), 0) AS num_video
   FROM flag_review r
   JOIN user_flags f USING (flag_id)
   WHERE reviewed_outcome = 'REMOVED'
   GROUP BY reviewed_date)

SELECT a.user_firstname,
       a.user_lastname,
       a.latest_date,
       IFNULL(b.num_video, 0) AS num_video
FROM a
LEFT JOIN b ON a.latest_date = b.reviewed_date

So, we have included the user’s first and last names from the first CTE ‘a’ and also the latest reviewed date from the same CTE.

From the second CTE, we are selecting the number of distinct videos. However, we are performing yet another SQL IFNULL() function on this column.

If the number of distinct videos does not contain a value, that is it is NULL, then we are presenting 0 as the substitute value to be displayed in the output table. If it is NOT NULL, we will display the value of ‘num_video’ found in the second CTE, ‘b’.

Finally, we are performing a left join on the latest reviewed date columns of the two CTEs because we have the name of each unique user in the first CTE, and the number of distinct videos is calculated in the second for the removed videos. So, it only makes sense to perform a LEFT JOIN instead of an INNER or CROSS JOIN.

If you’re interested in learning the similarities and differences between Left Join and Left Outer Join, check out our post “Left Join vs. Left Outer Join”.

Now, let us run the query and see the final solution:

All required columns and the first 5 rows of the solution are shown

user_firstnameuser_lastnamelatest_daten_removed
William Kwan2022-03-141
DanielBell2022-03-161
GinaKorman2022-03-172
RichardHasson2022-03-180
EvelynJohnson2022-03-172

Conclusion

Wasn’t that an interesting problem? We surely think so. We hope you learned all about the SQL IFNULL() function and how it can be used anywhere from the easiest of scenarios to a complex problem like the one we solved above. Whether you are using IFNULL() in MySQL or COALESCE() function in PostgreSQL, the core concept is the same and can be used to handle NULL values anywhere. For more interesting concepts, keep visiting us here at Stratascratch!

SQL IFNULL() Function


Become a data expert. Subscribe to our newsletter.