Ranking Functions – the basics
This is a first post in the series about advanced query techniques which make pretty cool use of ranking functions. Today, I will present the basics.
Starting from SQL Server 2005 we have 4 ranking functions: RANK, DENSE_RANK, NTITLE, ROW_NUMBER. Let’s start with an example:
SELECT LastName, ROW_NUMBER() OVER ( ORDER BY LastName ) AS Number FROM Person.Contact
Here we numbered LastNames starting at 1 in LastName order.
Now, we want to achieve the same effect but to restart for each group (partition) with different first letter.
SELECT LastName, ROW_NUMBER() OVER ( PARTITION BY SUBSTRING( LastName, 1, 1 ) ORDER BY LastName ) AS Number FROM Person.Contact
Every ranking function supports PARTITION BY clause which divides the result set into partitions and ORDER BY which determines the order in which numbers are applied to the rows.
ROW_NUMBER – returns the sequential number of a row within a partition of a result set, starting at 1 for the first row in each partition.
RANK – returns the rank of each row within the partition of a result set. If two or more rows tie for a rank, each tied rows receives the same rank. For example, two rows with the same LastName are both ranked one. The next LastName is ranked three.
DENSE_RANK – works almost as RANK, the only exception is that the numbers do not have gaps and always have consecutive ranks.
NTILE – distributes the rows into a specified number of groups. For each row, returns the number of the group to which the row belongs. We specify the number of group we want to get as a parameter.
SELECT LastName, NTILE (4) OVER ( PARTITION BY SUBSTRING( LastName, 1, 1 ) ORDER BY LastName DESC ) AS Number FROM ( SELECT DISTINCT LastName FROM Person.Contact ) t
Each partition would be divided into four groups. If the number of rows in a partition is not divisible by group number, this will cause groups of two sizes that differ by one member. Larger groups come before smaller groups in the order specified by the OVER clause.