Article
· Jan 24, 2023 4m read

5 useful SQL functions to take your SQL skills to the next level

Hi Community,

In this article, I listed 5 useful SQL functions with explanations and query examples 👇🏻

These 5 functions are

  • COALESCE
  • RANK
  • DENSE_RANK
  • ROW_NUMBER
  • Function to Get Running Totals

So Let us start with COALESCE function

#COALESCE

The COALESCE function evaluates a list of expressions in left-to-right order and returns the value of the first non-NULL expression. If all expressions evaluate to NULL, NULL is returned.

Following statement will return first not null value which is 'intersystems'

SELECT COALESCE(NULL, NULL, NULL,'intersystems', NULL,'sql')

Let us create below  table for further example

CREATE TABLE EXPENSES(
    TDATE     DATE NOT NULL,
    EXPENSE1   NUMBER NULL,
    EXPENSE2   NUMBER NULL,
    EXPENSE3   NUMBER NULL,
    TTYPE  CHAR(30) NULL)

Now let us insert some dummy data to test our function

 INSERT INTO sqluser.expenses (tdate, expense1,expense2,expense3,ttype )  
  SELECT {d'2023-01-01'}, 500,400,NULL,'Present'
  UNION ALL
  SELECT {d'2023-01-01'}, NULL,50,30,'SuperMarket'
  UNION ALL 
  SELECT {d'2023-01-01'}, NULL,NULL,30,'Clothes' 
  UNION ALL
  SELECT {d'2023-01-02'}, NULL,50,30 ,'Present'
  UNION ALL
  SELECT {d'2023-01-02'}, 300,500,NULL,'SuperMarket'
  UNION ALL 
  SELECT {d'2023-01-02'}, NULL,400,NULL,'Clothes'   
  UNION ALL
  SELECT {d'2023-01-03'}, NULL,NULL,350 ,'Present'
  UNION ALL
  SELECT {d'2023-01-03'}, 500,NULL,NULL,'SuperMarket'
  UNION ALL 
  SELECT {d'2023-01-04'}, 200,100,NULL,'Clothes'
  UNION ALL
  SELECT {d'2023-01-06'}, NULL,NULL,100,'SuperMarket'
  UNION ALL 
  SELECT {d'2023-01-06'}, NULL,100,NULL,'Clothes'  

Select the data

Now by using COALESCE function we will retrieve first not NULL value from expense1,expense2 and expense 3 columns

SELECT TDATE,
COALESCE(EXPENSE1,EXPENSE2,EXPENSE3),
TTYPE
FROM sqluser.expenses ORDER BY 2   

#RANK vs DENSE_RANK vs ROW_NUMBER functions

  • RANK()— assigns a ranking integer to each row within the same window frame, starting with 1. Ranking integers can include duplicate values if multiple rows contain the same value for the window function field.
  • ROW_NUMBER() — assigns a unique sequential integer to each row within the same window frame, starting with 1. If multiple rows contain the same value for the window function field, each row is assigned a unique sequential integer.
  • DENSE_RANK() leaves no gaps after a duplicate rank.

In SQL, there’s several ways that you can assign a rank to a row, which we’ll dive into with an example. Consider once again the same example as before, but now we want to know what is the highest expenses.

We want to know where do I spend the most money. There are different ways to do it. We can use all ROW_NUMBER() , RANK() and DENSE_RANK() . We will order the previous table using all three functions and see what are the main differences between them using the following query:

Below is our query:

The main difference between al three functions is the way they deal with ties. We will further deep-dive their differences:

  • ROW_NUMBER()returns a unique number for each row starting at 1. When there are ties, it arbitrarily assigns a number if a second criteria is not defined.
  • RANK()returns a unique number for each row starting at 1, except for when there are ties, then it will assign the same number. As well, a gap will follow a duplicate rank.
  • DENSE_RANK() leaves no gaps after a duplicate rank.

#Calculating Running Totals

The running total is probably one of the most useful window functions especially when you want to visualize growth. Using a window function with SUM(), we can calculate a cumulative aggregation.

To do so, we just need to sum a variable using the aggregator SUM() but order this function using a TDATE column. 

You can observe the corresponding query as follows:

As you can observe in the table above, now we have the accumulated aggregation of the amount of money spent as the date passes by.

 

Conclusion

SQL is great. Functions used above might be useful when dealing data analysis, data science, and any other data-related field.

This is why you should care to keep improving your SQL skills.


Thanks

Discussion (13)2
Log in or sign up to continue

I'm loving the examples, however I'm a little confused by the last running total.  Either I'm misunderstanding what it's trying to do or its not doing what I think it should.

The ACC_EXPENSES column seems to be only taking the first value for a given date and adding it into the running total instead of all the values for a given date. In the below image the date 1/1/23 has 3 rows, 300, 230 & 330, which total 860 as seen in the TOTAL_EXPENSES column. 

The ACC_EXPENSES value for 1/2/23 should be the sum of 1/1/23, (860) and the sum of 1/2/23, (880), but instead its showing 730, which is the sum of the 2 rows I've highlighted above.