A Slice of Time

Window Functions in SQL Server

Kevin Feasel (@feaselkl)
https://csmore.info/on/windows

Who Am I? What Am I Doing Here?

Motivation

My goals in this talk:

  • Explain the concept of window functions.
  • Review the types of window functions available in SQL Server.
  • Walk through use cases for and limitations of window functions.
  • Provide advice on performance tuning of window functions.

Agenda

  1. An Overview of Window Functions
  2. Types of Window Functions
  3. Use Cases
  4. Limitations
  5. Tuning Window Functions

What is a Window Function?

A window function is a function applied to a set of rows.

Window functions operate over windows of data.

Anotamy of a Window Function

There are four key elements to a window function:

  • The OVER() clause: defines that this is a window function and the window we will use
  • Partition (PARTITION BY): filter rows
  • Order (ORDER BY): sorting when we need an ordered set, but also provides meaning to frames when aggregating data
  • Frame (ROWS, RANGE): filter within a partition

An Example

A Visual Depiction

The PARTITION BY Clause

The ORDER BY Clause

The Frame

The Frame: Preceding

The Frame: Following

Demo Time

Agenda

  1. An Overview of Window Functions
  2. Types of Window Functions
  3. Use Cases
  4. Limitations
  5. Tuning Window Functions

Types of Window Function

There are five classes of window function in SQL Server:

  • Aggregate: MIN(), MAX(), AVG(), COUNT(), SUM(), etc.
  • Ranking: ROW_NUMBER(), RANK(), DENSE_RANK(), NTILE()
  • Offset: LAG(), LEAD(), FIRST_VALUE(), LAST_VALUE()
  • Statistical: PERCENT_RANK(), CUME_DIST(), PERCENTILE_CONT(), PERCENTILE_DISC()
  • Ordered set functions: STRING_AGG()

Demo Time

Agenda

  1. An Overview of Window Functions
  2. Types of Window Functions
  3. Use Cases
  4. Limitations
  5. Tuning Window Functions

Key Use Cases

Some of the use cases we will cover:

  1. Finding rows to delete
  2. Turning start date + end date into event tracking
  3. Running totals
  4. Last N values
  5. Distinct number of values over time
  6. Percent of total

Demo Time

Agenda

  1. An Overview of Window Functions
  2. Types of Window Functions
  3. Use Cases
  4. Limitations
  5. Tuning Window Functions

Referencing Window Functions

If we define a window function, we can only reference it in the SELECT and ORDER BY clauses:

Logical Processing Order

SQL Server (typically) processes a SELECT statement in the following order (simplifying for space purposes):

  1. FROM
  2. JOIN
  3. WHERE
  4. GROUP BY
  5. HAVING
  6. SELECT
  7. DISTINCT
  8. ORDER BY

Window functions are processed in the SELECT section, so anything prior to that in a statement can't use the results of the window function.

In other words, WHERE, GROUP BY, HAVING, et al, shape the window, so the window cannot subsequently shape them!

The Easiest Workaround

The easiest workaround, and something we've seen already, is to use a common table expression (CTE):

Limitations to RANGE

The RANGE window operator is a lot more powerful in the SQL standards than what we have in T-SQL.

What's Missing in RANGE?

The RANGE window operator in SQL Server is missing some great support around intervals:

Rows versus RANGE

By contrast, the ROWS window operator has more flexibility.

Additional Limitations

Some additional things to keep in mind:

  1. Windows and APPLY

Demo Time

Agenda

  1. An Overview of Window Functions
  2. Types of Window Functions
  3. Use Cases
  4. Limitations
  5. Tuning Window Functions

Techniques for Tuning Window Functions

  • Appropriate indexing
  • Batch mode processing
  • Limit the number of windows

Indexing: POC

The rule of thumb around making window functions as fast as possible is to provide an index in the following order:

  • Partition By columns
  • Order By columns (following ordering!)
  • Covering columns--other columns used in the query

Indexing: WHERE before POC

Remember that the WHERE clause operates before the window function, so if you have a great filter, put it before the POC.

  • WHERE clause filters
  • Partition By columns
  • Order By columns (following ordering!)
  • Covering columns--other columns used in the query

Batch Mode

SQL Server generally operates in row mode, in which one row is processed by an operator at a time.

Introduced in SQL Server 2012, but improved greatly in SQL Server 2016 and even more in 2019.

Batch mode provides:

  • 64KB blocks of data, up to 900 rows, read at a time
  • Metadata evaluated once per batch versus once per row
  • Functions operating on a batch handle all ~900 rows at once

Batch Mode Requirements

For batch mode, you need:

  • Columnstore index + batch mode cost is less than row mode cost
  • OR rowstore index AND ALL OF:
    • 131,072 (2^17) rows in at least one table in the query
    • At least one operator which benefits from batch mode
    • At least one input of a batch has at least 2^17 rows
    • Batch mode cost is less than row mode cost

Batch Mode in 2016

If you have SQL Server 2016 or 2017 but don't want to work with a columnstore index, you can take advantage of a trick: create a filtered columnstore index, particularly one on an impossible condition.

Demo Time

Wrapping Up

Over the course of this talk, we have looked at the concept of window functions, as well as a variety of use cases for them and how to optimize window function queries.

Wrapping Up

To learn more, go here:
https://csmore.info/on/windows


And for help, contact me:
feasel@catallaxyservices.com | @feaselkl


Catallaxy Services consulting:
https://CSmore.info/on/contact