Every professional (who has worked with SQL Server) will have used a Join operator. However, the same cannot be said about the APPLY operator. Although APPLY was introduced back in SQL Server 2005, there are still a number of developers who have barely seen the syntax, let alone utilised it’s capabilities.
This blog focuses solely on INNER JOINS vs CROSS APPLY and when it is more beneficial to use the latter. I will provide additional links at the end of the post, which will go into other features of APPLY, such as performance benefits against PIVOT’s.
This blog has been written with the assumption that the reader already knows about Joins. If you would like to refresh your memory, click here.
What is APPLY?
The original purpose of the APPLY operator was to use them with table-valued functions (TVF’s).
The MSDN definition is:
“The APPLY operator is similar to the JOIN operator, but the difference is that the right-hand side operator of APPLY can reference columns from the left-hand side”.
In simple terms, a join relies on self-sufficient sets of data, i.e. sets should not depend on each other. On the other hand, CROSS APPLY is only based on one predefined set and can be used with another separately created set. A worked example should help with understanding this difference.
Example Problem
Scenario: Find the last 3 orders from all customers that live in the UK.
Database: AdverntureWorksDW2014 – Download the free Microsoft database here.
Tables: dbo.FactInternetSales
dbo.DimCustomer
1. Create a function that accepts CustomerKey and the Top(N) orders (e.g. last 3 orders) to be returned.
USE [AdventureWorksDW2014]
GO
CREATE FUNCTION dbo.GetTopOrders (@CustomerKey INT, @N INT) RETURNS TABLE
AS
RETURN
— find last 3 orders, placed on different days
SELECT DISTINCT TOP(@N)
SalesOrderNumber,
OrderDate,
CustomerKey
FROM [dbo].[FactInternetSales]
WHERE CustomerKey = @CustomerKey
ORDER BY OrderDate DESC, SalesOrderNumber DESC
2. Using traditional CROSS JOIN approach. The C.CustomerKey reference (in the CROSS JOIN & function) is intended to dynamically pass through every customer, with the ‘3’ being the number of orders to show per customer.
SELECT C.CustomerKey,
C.FirstName,
C.LastName,
C.EmailAddress,
O.SalesOrderNumber,
O.OrderDate
FROM dbo.DimCustomer C
CROSS JOIN
dbo.GetTopOrders(C.CustomerKey,3) AS O
The reason why this fails is because C.CustomerKey is referencing the left set (DimCustomer), but does not exist itself in the right set (the Function). Joins require pre-defined sets from both sides, which is not how this query works.
3. Using CROSS APPLY. This operator reference the left set first, before then checking the second set against the first one. The second set runs a row by row basis, passing the CustomerKey in a type of recursive variable.
SELECT C.CustomerKey,
C.FirstName,
C.LastName,
C.EmailAddress,
O.SalesOrderNumber,
O.OrderDate
FROM dbo.DimCustomer C
CROSS APPLY
dbo.GetTopOrders(C.CustomerKey,3) AS O
— 26760 rows affected
The only change to the code is changing CROSS JOIN to CROSS APPLY. Although the sets are analysed separately, you can still return data from both – just like a Join.
4. The above example would also apply to an INNER JOIN. The left set only returns data if it matches with the right side, so if there is not a match, neither set is applied. A way to ensure all records are returned, is to use OUTER APPLY.
For the purpose of this article, I have added a new customer (not shown below) to the dbo.Customers table, who has yet to place an order. We can now run the original CROSS APPLY query, but as an OUTER APPLY instead.
SELECT C.CustomerKey,
C.FirstName,
C.LastName,
C.EmailAddress,
O.SalesOrderNumber,
O.OrderDate
FROM dbo.DimCustomer C
OUTER APPLY
dbo.GetTopOrders(C.CustomerKey,3) AS O
WHERE O.SalesOrderNumber IS NULL
The additional ‘Where’ clause specifically returns the non-matching record, but in total 26761 rows where affected.
Conclusion
As demonstrated, CROSS APPLY is a very useful operator when referencing a Table Function or filtering on a subset of data. The key benefit is the ability to use one defined set with another separately created set. Unlike Joins, you can define a dynamic subset of data to match with the outer query.
APPLY is a very underrated and underused within the SQL Server Community. Look out for future blogs, where I demonstrate the uses of the other APPLY operators and how they can produce more efficient queries.
References
For further reading on the keywords in this article, try the recommended links below:
1. Boost your T-SQL with CROSS APPLY – http://www.microsoftvirtualacademy.com/training-courses/boost-your-t-sql-with-the-apply-operator
2. CROSS APPLY in SQL Server – http://www.microsoftvirtualacademy.com/training-courses/boost-your-t-sql-with-the-apply-operator
3. Using APPLY – https://technet.microsoft.com/en-us/library/ms175156(v=sql.105).aspx
4. INNER JOIN VS CROSS APPLY – http://explainextended.com/2009/07/16/inner-join-vs-cross-apply/
Introduction to Data Wrangler in Microsoft Fabric
What is Data Wrangler? A key selling point of Microsoft Fabric is the Data Science
Jul
Autogen Power BI Model in Tabular Editor
In the realm of business intelligence, Power BI has emerged as a powerful tool for
Jul
Microsoft Healthcare Accelerator for Fabric
Microsoft released the Healthcare Data Solutions in Microsoft Fabric in Q1 2024. It was introduced
Jul
Unlock the Power of Colour: Make Your Power BI Reports Pop
Colour is a powerful visual tool that can enhance the appeal and readability of your
Jul
Python vs. PySpark: Navigating Data Analytics in Databricks – Part 2
Part 2: Exploring Advanced Functionalities in Databricks Welcome back to our Databricks journey! In this
May
GPT-4 with Vision vs Custom Vision in Anomaly Detection
Businesses today are generating data at an unprecedented rate. Automated processing of data is essential
May
Exploring DALL·E Capabilities
What is DALL·E? DALL·E is text-to-image generation system developed by OpenAI using deep learning methodologies.
May
Using Copilot Studio to Develop a HR Policy Bot
The next addition to Microsoft’s generative AI and large language model tools is Microsoft Copilot
Apr