Dwain Camps

The Performance of the T-SQL Window Functions

17 January 2014

Window Functions in SQL greatly simplify a whole range of financial and statistical aggregations on sets of data. Because there is less SQL on the page, it is easy to assume that the performance is better too: but is it? Dwain gets out the test harness to investigate.

SQL has always had aggregate functions like SUM() and MAX(); without them, many routine SQL queries would be impossible. The SQL Standard introduced window functions in ANSI SQL:2003 to deal with a range of processing tasks that require aggregating over a partition of a set, and extended them in ANSI SQL:2008. In SQL Server 2005, Microsoft introduced the first of its window functions in two flavors, Ranking and Aggregates, and released further functions in subsequent versions.

When were Window Functions introduced in SQL Server?

Let’s take a quick look at some abridged history of the implementation of window functions in SQL Server.

SQL Server 2005
  • Ranking: ROW_NUMBER, RANK, DENSE_RANK, NTILE. Supports PARTITION BY (optional) and ORDER BY (required) in the OVER clause.
  • Aggregates: SUM, AVG, COUNT, MIN, MAX. Requires PARTITION BY in the OVER clause.

SQL Server 2008
  • Ranking: no relevant changes to supported functionality.
  • Aggregates: no relevant changes to supported functionality.

SQL Server 2012
  • Ranking: no relevant changes to supported functionality.
  • Aggregates: window frame capability (ROWS/RANGE) was added.
  • Analytic (new): CUME_DIST, PERCENT_RANK, PERCENTILE_CONT, PERCENTILE_DISC, LEAD, LAG, FIRST_VALUE, LAST_VALUE.

There are plenty of excellent articles that explain the workings of the SQL window aggregate functions.

They're useful, but do they perform well?

I’ve seen very few discussions of the performance characteristics of these functions, so that is what we will attempt here. We will assume that you have a basic understanding of how both SQL aggregate and window aggregate functions work; if you are unsure, or they are new to you, you should probably read Joe Celko’s or Wayne Sheffield’s articles first.

Before window functions existed, we had to produce quite complex code to do those tasks for which window functions were designed.  Nowadays, we find window functions so useful and flexible that their use has become ubiquitous in SQL.

Although there is no doubt that the window functions add richness to the SQL language, greatly simplifying the syntax and queries they appear in, we’re still left with the nagging doubt as to whether they are as fast as the older methods.  They’re more easily maintained, but are they faster? This is what we want to find out. We haven’t the space to look at every window function in detail, so we’ll limit our testing to just a few, and offer some references where you can look for additional comparisons.

Sample Data and a Simple Example

The simple table and sample data we’ll use for this example is the same as one that I’ve published in a Simple Talk article in the past.  You’ll see the reference when we get around to discussing PERCENTILE_CONT.

CREATE TABLE #SampleTable
(
    ID  INT
    ,N  INT
);
CREATE INDEX i1 ON #SampleTable (ID, N);

-- Basic test data
INSERT INTO #SampleTable
VALUES (1,1),(1,2),(1,3),(1,4),(1,5)
    ,(2,1),(2,2),(2,3),(2,4)
    ,(3,10),(3,2),(3,10),(3,4)
    ,(4,1),(4,5),(4,1),(4,3),(4,3);

-- SQL 2005 window aggregate example
SELECT ID, N, [Rows]=COUNT(*) OVER (PARTITION BY ID)
FROM #SampleTable
--WHERE ID = 1;

If we run this code with the WHERE clause uncommented, it produces this result set, showing how the COUNT of the rows within the partition is appended as a column to each row:

ID     N      Rows

1      1      5

1      2      5

1      3      5

1      4      5

1      5      5

Even though they’re named the same, the window aggregates operate differently from the original aggregate functions, in that the result adds a column across the entire row set, rather than condensing the set to the partitions that result from the GROUP BY.  With just a little extra effort on the coding side, it is possible to use the original SQL aggregate functions to mimic what the window aggregates do.
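For contrast, the traditional aggregate condenses the set to one row per partition, losing the detail rows:

```sql
-- Traditional aggregate: one row per ID; the N detail rows are gone
SELECT ID, [Rows]=COUNT(*)
FROM #SampleTable
GROUP BY ID;
```

Against the full sample data this returns just four rows: IDs 1 through 4 with counts of 5, 4, 4 and 5 respectively.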

In SQL 2000 and earlier, the equivalent result could be obtained using the following code pattern:

-- SQL 2000 equivalent
SELECT a.ID, N, [Rows]
FROM #SampleTable a
JOIN
(
    SELECT ID, [Rows]=COUNT(*)
    FROM #SampleTable b
    GROUP BY ID
) b ON a.ID = b.ID;

And when SQL 2005 came along, this could also be done with a CROSS APPLY:

-- SQL 2005 equivalent
SELECT a.ID, N, [Rows]
FROM #SampleTable a
CROSS APPLY
(
    SELECT ID, [Rows]=COUNT(*)
    FROM #SampleTable b
    WHERE a.ID = b.ID
    GROUP BY ID
) b;

When I compared the SQL 2000 pattern and the CROSS APPLY pattern available in SQL 2005, I noted no discernable performance difference.  Since I like CROSS APPLY and I’m the writer, ( … and I’m the editor: Ed) our initial comparison will be between the window aggregate COUNT vs. the SQL 2005 CROSS APPLY pattern.

The Performance Test Harness

If we TRUNCATE the data we put into our table initially, we can add a large number of rows to it using the following test harness.

DECLARE @N INT = 10000;  -- 10,000 = ~1,000,000 rows

WITH Tally (n) AS
(
    SELECT TOP (@N) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
    FROM sys.all_columns a CROSS JOIN sys.all_columns b
)
INSERT INTO #SampleTable
SELECT a.n, ABS(CHECKSUM(NEWID()))%@N
-- Always create exactly 100 IDs from the Tally table
FROM (SELECT TOP 100 n FROM Tally) a
CROSS APPLY
(
    SELECT TOP (@N) n
    FROM Tally
) b;
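The timings that follow were gathered with STATISTICS TIME switched on and the query results assigned to local variables, so that client display time stays out of the measurement. A sketch of the pattern (the exact harness is in the resource files):

```sql
SET STATISTICS TIME ON;

-- Assign the results to local variables so that client display
-- time does not pollute the measurement
DECLARE @ID INT, @N INT, @Rows INT;
SELECT @ID=ID, @N=N, @Rows=COUNT(*) OVER (PARTITION BY ID)
FROM #SampleTable;

SET STATISTICS TIME OFF;
```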

Our initial test at 1 million rows, shunting the results into a local variable, generated these timing results for COUNT.

SQL 2008 Window Aggregate: COUNT

 SQL Server Execution Times:
   CPU time = 3105 ms,  elapsed time = 2187 ms.

SQL 2005 Equivalent (CROSS APPLY): COUNT

 SQL Server Execution Times:
   CPU time = 390 ms,  elapsed time = 376 ms.

You can already see that this particular window aggregate seemingly cannot be classified as “high-performance” SQL.   But we don’t want to generalize, so let’s test each of the window aggregate functions vs. its corresponding traditional pattern and graphically demonstrate the performance results.

Performance of Traditional Aggregates vs. Window Aggregates

Since it is easy to replace COUNT(*) with XXX(N) (where XXX=SUM, AVG, MIN and MAX) in each of the two queries we’re testing, we can easily generate queries that compare each aggregate function against its corresponding window aggregate.  Resource files (described at the end) are provided if you want to see each of these queries or run these performance tests yourself.
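For example, the SUM pair of test queries simply substitutes SUM(N) for COUNT(*) in both forms:

```sql
-- Window aggregate form
SELECT ID, N, [Sum]=SUM(N) OVER (PARTITION BY ID)
FROM #SampleTable;

-- Traditional (CROSS APPLY) form
SELECT a.ID, N, [Sum]
FROM #SampleTable a
CROSS APPLY
(
    SELECT ID, [Sum]=SUM(N)
    FROM #SampleTable b
    WHERE a.ID = b.ID
    GROUP BY ID
) b;
```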

We’ll also include a test case with all of the aggregates in one query, to determine whether the cost is cumulative or not.  Those two queries look like this.

SELECT ID, N
    ,[Rows]=COUNT(*) OVER (PARTITION BY ID)
    ,[Sum]=SUM(N) OVER (PARTITION BY ID)
    ,[Average]=AVG(N) OVER (PARTITION BY ID)
    ,[Minimum]=MIN(N) OVER (PARTITION BY ID)
    ,[Maximum]=MAX(N) OVER (PARTITION BY ID)
FROM #SampleTable;

SELECT a.ID, N
    ,[Rows], [Sum], [Average], [Minimum], [Maximum]
FROM #SampleTable a
CROSS APPLY
(
    SELECT ID
        ,[Rows]=COUNT(*)
        ,[Sum]=SUM(N)
        ,[Average]=AVG(N)
        ,[Minimum]=MIN(N)
        ,[Maximum]=MAX(N)
    FROM #SampleTable b
    WHERE a.ID = b.ID
    GROUP BY ID
) b;

The histograms show the comparative elapsed times for each of the aggregates, with the last column showing the case of all aggregates applied to the same partition within one query, with and without the suggested INDEX.

Many of these cases perform better without the index than with it.

This table shows the CPU and elapsed time, indicating the “cost” of each window aggregate compared to the traditional aggregation method.

With INDEX

Measure       Query                        COUNT   SUM    AVG    MIN    MAX    ALL
CPU (ms)      Window Aggregate (WA)         3039  3136   3167   2901   3028   3556
              Traditional Aggregate (TA)     390   499    531    453    452    858
              Cost                          679%  528%   496%   540%   570%   314%
Elapsed (ms)  Window Aggregate (WA)         2208  2260   2419   2236   2233   2676
              Traditional Aggregate (TA)     385   491    525    465    457    855
              Cost                          474%  360%   361%   381%   389%   213%

Without INDEX

Measure       Query                        COUNT   SUM    AVG    MIN    MAX    ALL
CPU (ms)      Window Aggregate (WA)         4945  5226   5130   5257   5008   6287
              Traditional Aggregate (TA)    1141  1357   1391   1200   1247   1529
              Cost                          333%  285%   269%   338%   302%   311%
Elapsed (ms)  Window Aggregate (WA)         1933  1967   1921   1663   1823   2050
              Traditional Aggregate (TA)     441   440    453    416    423    648
              Cost                          338%  347%   324%   300%   331%   216%

Cost in all cases is calculated as 100*(WA-TA)/TA. For example, COUNT with the index costs 100*(3039-390)/390 = 679% additional CPU time.

I’m not sure where you, my valued reader, stand, but this seems a pretty steep price to pay for streamlining the query syntax! These timing results were obtained on SQL 2008 but, sadly, there was no perceptible improvement when running the equivalent test on SQL 2012. One would have hoped that Microsoft had noticed, or been informed of, this performance lag and done something to improve it.

Redemption in the Ranking Functions

When I first started using the window ranking functions in SQL 2005, I was very impressed. All of them solve some very sticky querying predicaments, and they do so with excellent speed. In fact, I’d be hard-pressed to come up with any SQL 2000 alternative with even reasonable performance (short of using temporary tables) to compare against, so I won’t even try.

Kudos to Microsoft on this one!

SQL 2012: Enter the Window Frame for the Aggregate Functions

In past SQL versions, a few problems posed very difficult cases for producing a high-performing query, most notably:

  • calculating running totals
  • performing aggregate calculations over a rolling window

Both of these problems are now quite easily solved by the SQL 2012 window frame approach. Rather than go through all of the details here, I’ve provided some useful links above where you can take a look at these problems yourself. I will summarize the results you can expect here.

  1. Both problems can be solved by applying a SQL 2012 window frame to the window aggregate SUM function.
  2. You can expect the window frame approach to be faster than any triangular or semi-triangular JOIN approach (these are explained in the two articles linked above).
  3. Neither problem can be solved faster with a window frame than with what is known as the Quirky Update (QU). The article by SQL MVP Jeff Moden linked above describes in great detail all the rules needed to implement the QU correctly. My article on calculating values within a rolling window shows how it is much faster than the window frame approach for that particular problem, and Microsoft Certified Master Wayne Sheffield showed that the QU is faster than a SQL 2012 window frame for the running totals problem in his blog Running totals in “Denali” CTP3. Even though that study was done on a pre-release of SQL 2012, it is doubtful that the performance characteristics have changed.
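To illustrate the window frame syntax, here is a running total over our earlier sample table; this sketch is mine, not taken from the linked articles:

```sql
-- Running total of N within each ID, accumulating in N order
SELECT ID, N
    ,RunningTotal=SUM(N) OVER (PARTITION BY ID ORDER BY N
                               ROWS UNBOUNDED PRECEDING)
FROM #SampleTable;
```

Specifying ROWS rather than the default RANGE matters here, both for how ties are accumulated and for performance.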

Overall, the SQL 2012 window frame is a welcome addition to the SQL window functions. It performs better than most methods and, since its behavior is fully documented (unlike the QU’s), it is a prime choice for these problems, unless you absolutely must squeeze out the highest possible performance and are willing to live with the undocumented behavior of the QU approach.

SQL 2012: And Then There are the Analytic Functions

The analytic functions in SQL 2012 offer unprecedented new functionality that is extremely useful in statistical analyses.  Just the fact that they exist will save many developers of scientific, statistical and engineering applications lots of headaches trying to replicate the same functionality within complex SQL code.  Of course, I’m sure CLR versions exist, but there will be cases when they can’t be used due to development constraints.  Overall I have to give kudos to Microsoft for these as well.

But in the world of SQL, it always depends.  Sometimes there are alternatives that may be just enough faster that you’d want to consider using them even though they may look more complicated.

PERCENTILE_CONT

A case in point is the PERCENTILE_CONT analytic function.  Its most common usage is to calculate the median of a statistical sample.  With the PARTITION clause, it can do so against partitioned sets.  Let’s compare a couple of queries that calculate median for this case.  We can use the same sample data that we created before.

-- Calculate Median in SQL 2012
SELECT ID, N
    ,Median=PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY N) OVER (PARTITION BY ID)
FROM #SampleTable;

-- One Way to Calculate Median in SQL 2008
WITH Counts AS
(
    SELECT ID, b.c
    FROM
    (
        SELECT ID, c=COUNT(*)
        FROM #SampleTable
        GROUP BY ID
    ) a
    CROSS APPLY (VALUES((c+1)/2),(CASE c%2 WHEN 0 THEN 1+c/2 ELSE 0 END)) b(c)
),
CalculateMedian AS
(
    SELECT a.ID, Median=CAST(AVG(0.+b.N) AS FLOAT)
    FROM
    (
        SELECT ID, N
            ,rn=ROW_NUMBER() OVER (PARTITION BY ID ORDER BY N)
        FROM #SampleTable a
    ) a
    CROSS APPLY
    (
        SELECT N
        FROM Counts b
        WHERE a.ID = b.ID AND a.rn = b.c
    ) b
    GROUP BY a.ID
)
SELECT a.ID, N, Median
FROM #SampleTable a
JOIN CalculateMedian b ON a.ID = b.ID;

While the second query may look extraordinarily complex, the results when we run it are much better than PERCENTILE_CONT.

 

MEDIAN                              With INDEX              Without INDEX
                                 CPU (ms)  Elapsed (ms)  CPU (ms)  Elapsed (ms)
SQL 2012: PERCENTILE_CONT           6520      5842         13385      3786
SQL 2008: Complex Median Query      2340      2327          5477      1897
Cost                                179%      151%          144%      100%

And there are even (much) faster solutions available for calculating the median in cases where the suggested index is present. The test harness provided in this article’s resources section is described at the end.

On the other hand, PERCENTILE_CONT does a lot more than just calculate the median. We’ll let you explore Microsoft’s description, linked in the function name above, to find out more.
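For instance, nothing restricts the percentile argument to 0.5, and the companion PERCENTILE_DISC returns an actual value from the set rather than an interpolated one:

```sql
-- 90th percentile per ID: continuous (interpolated) vs. discrete
SELECT ID, N
    ,P90_Cont=PERCENTILE_CONT(0.9) WITHIN GROUP (ORDER BY N)
              OVER (PARTITION BY ID)
    ,P90_Disc=PERCENTILE_DISC(0.9) WITHIN GROUP (ORDER BY N)
              OVER (PARTITION BY ID)
FROM #SampleTable;
```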

LAG and LEAD Applied to Finding Gaps in Sequence Numbers

Recently, Simple-Talk published an article of mine entitled The SQL of Gaps and Islands in Sequences. In it I compared three “traditional” solutions for the gaps problem with one of my own design. We’ll now take the 1M-row test harness provided there and modify it (see the description of the resource files at the end) to include the SQL 2012 methods, where LAG and LEAD can be applied to identify gaps. Often you’ll see only one or the other of these solutions suggested; here we’ll show both.

SELECT ID, StartGap=SeqNo+1, EndGap=nxt-1
FROM (
    SELECT ID, SeqNo
        ,nxt=LEAD(SeqNo, 1) OVER (PARTITION BY ID ORDER BY SeqNo)
    FROM dbo.GapsIslands
) a
WHERE nxt - SeqNo <> 1;

SELECT ID, StartGap=prv+1, EndGap=SeqNo-1
FROM (
    SELECT ID, SeqNo
        ,prv=LAG(SeqNo, 1) OVER (PARTITION BY ID ORDER BY SeqNo)
    FROM dbo.GapsIslands
) a
WHERE SeqNo - prv <> 1;

Once again, these solutions are nice and concise, but let’s see how they perform in our test harness.

GAPS                                        CPU (ms)  Elapsed (ms)
Gaps Solution #1 from SQL MVP Deep Dives       780        782
Gaps Solution #2 from SQL MVP Deep Dives      4665       1451
Gaps Solution #3 from SQL MVP Deep Dives      1233       1219
CROSS APPLY VALUES - Islands to Gaps           421        462
SQL 2012 LEAD                                 1747       1740
SQL 2012 LAG                                  1357       1363

In this case, all of the traditional solutions outperformed the SQL 2012 LEAD function, and all but one outperformed LAG, in elapsed time.  Only one traditional solution consumed more CPU time.  My conclusion from this is that, while LAG (or LEAD) may improve the clarity of your code, opting for one of the well-documented, traditional approaches to gaps will solve the problem (particularly against very large row sets) more efficiently.

Wayne Sheffield also blogged on some of the existing high performance approaches to gaps in SQL 2012 Performance Test: Gap Detection.  In this blog, he compares the performance of LEAD only to a couple of competing high-performance methods that can be run in earlier versions of SQL.  His test of LEAD also showed it to be much slower (cost about 250% according to my prior calculation method) than the other approaches that he tested.  Using LAG would be expected to have approximately the same performance characteristic (although in my test harness it appears to be slightly better).

Conclusions

Just as new doesn’t always mean better, simpler doesn’t always translate into faster either, particularly where performance is concerned. Ultimately you must judge whether simpler query forms are more desirable from a maintenance perspective than a better-performing solution.

As far as the SQL window ranking functions are concerned, there is probably no better approach available in earlier versions of SQL.  On the other hand, the SQL window aggregate functions can be much improved upon with respect to performance by relying on only a slightly less simple code pattern based on standard aggregate functions.

The SQL 2012 window frame enhancement to the window aggregates is probably a good alternative for problems such as running totals, particularly if you don’t like, or don’t understand, how the Quirky Update works. LEAD and LAG show promise, but a few traditional methods still outperform them for problems like finding gaps in sequence numbers.

All of the new SQL 2012 analytic window functions provide unprecedented flexibility. But for calculating the median, PERCENTILE_CONT can be beaten on performance by more traditional, albeit more complicated, code patterns.

In the end, using the latest query forms may offer a shorter path to getting your code working, because the queries tend to be simpler.  This article is simply a suggestion that you consider looking into and testing other code patterns in cases where performance is really important to you.

The resource files provided with this article are:

  • Window Aggregate Functions Test Harness.sql – This test harness is initially set up to create a 1,000,000 row test table and then display the timing results for window aggregates vs. corresponding traditional aggregations, while suppressing the actual query results by assigning to a local variable.  Comments in the beginning will guide you if you want to just display results for the simple test data shown in the initial example.
  • Median Test Harness.sql – This test harness is initially set up to create an approximately 1,000,000 row test table and then display the timing results for the SQL 2012 median solution vs. the alternative, while suppressing the actual query results by assigning them to a local variable.  Comments in the beginning will guide you if you want to just display results for the simple test data shown in the initial example.
  • Gaps Test Harness.sql – This test harness is initially set up to create a 1,000,000 row (approximate) test table filled with gaps and then test 4 “traditional” gaps solutions vs. the SQL 2012 LEAD and LAG functions, while suppressing the actual query results by assigning to a local variable.  Comments in the beginning will guide you if you want to just display results for the simple test data shown in the initial example.
Dwain Camps

Author profile:

Dwain Camps has been a project manager for many years. Because performance of applications can be a critical success factor for projects, he has been evangelizing on the need to develop highly performing SQL. By mentoring and authoring articles on SQL, he hopes to train a future generation of software engineers on the right and wrong ways to deliver SQL code. He also has a special interest in developing solutions to complex, data intensive problems using high performance SQL because the declarative nature of SQL allows development of algorithmically unique solutions that procedural languages may not be capable of.




 
