At its simplest,
APPLY is similar to
JOIN but allows for a dependency between the data sets being merged together.
Microsoft introduced the
APPLY operator in SQL Server 2005, with two variants:
CROSS APPLY and
CROSS APPLY is semantically similar to
INNER JOIN, whereas
OUTER APPLY is akin to
“If you don't understand the APPLY operator, your skills are somewhere around the 50th percentile at best.” -- Adam Machanic
APPLY is not always the best solution, but solves a number of problems in an elegant manner.
We can use
sys.dm_exec_connections to get basic information for each session, but this doesn't give us query text. There is a table-valued function named
sys.dm_exec_sql_text([handle] varbinary) but we need to pass in the handle for each row in our connections DMV.
Unfortunately, you cannot use
INNER JOIN to join to a function which expects a parameter.
APPLYto execute table-valued functions as though they were tables.
OUTER APPLYcan show rows that
CROSS APPLYwould filter out.
We want to figure out how many times somebody has ordered a water bottle from the Adventure Works store. We'd also like to see which product category and subcategory this belongs to.
APPLYto return data from ad hoc (derived) functions.
APPLYworked better because early aggregation allowed us to reduce the number of rows going through a nested loop join.
We want to see each customer's latest order, PO numbers associated with those orders, and the total due on each order.
There are three set-based options that we can use to perform this operation:
There are Row By Agonizing Row (RBAR) solutions available like using a cursor/WHILE loop, but these will not perform well. We might also put something together with temporary tables which could scale, but start with simple.
APPLY-based solution is inefficient for small loads but scales extremely well.
With small data sets, pick the one you prefer most, as they will all perform well enough.
This is a generalization of the "specific child record" problem.
We want to see the average, minimum, maximum, and total prices of each customer's last 5 orders.
There are still three set-based options that we can use to perform this operation:
With that said, however, the nested sub-query quickly becomes nasty to write, and we already know from the prior example that it doesn't scale well.
Similar to the specific case, the
APPLY version looks worse early on but scales much better.
Use either for small problem sets, but shift to
APPLY as the sets get larger.
APPLY runs once for each element in the "parent" (or left-hand) side. Because each iteration is unique, you typically want to see
APPLY generate nested loop joins.
The ideal scenario for
In other words:
This also worked for aggregation because we had one "parent" row.
"Relatively few interesting rows out of a huge total number of rows" is a verbose way of saying "when it is more efficient to seek rather than scan an index."
APPLY does much better when it makes sense to seek.
APPLY operator will consequently perform poorly when:
Use other techniques in these cases or see if you can reduce data requirements.
SQL Server calculations tend to be repetitious. The
APPLY operator can help us with that by feeding the columns from an earlier table or derived function into a later derived function. In other words, named columns you create in one
APPLY statement can be used in the next
APPLY operator to remove repetition from calculations.
APPLY operator excels at a few activities:
This is not a cure-all operator, though. Be sure to try writing your code a few different ways to see what performs best.
To learn more, go here:
And for help, contact me:
email@example.com | @feaselkl
90-minute Course for $40:
Catallaxy Services consulting: