Common table expressions (CTEs) are a great way to temporarily store a set of results and query that result set without the need to persist it to the database. Unlike sub-queries, CTEs can be referred to more than once in the subsequent query, without the need to re-include the entire coding. There is also the opportunity of creating multiple CTEs in the same query, when for example data needs to be extracted from different tables in the database.
Scenario
To set the scenario, imagine that a retail company have daily sales for different platforms saved in one database within different transactional tables. This could be website, mobile phone application and in-store sales. In this scenario the different tables need to be combined into one.
Solution
In this scenario, a possible solution is to model a target table which has all the columns required to accommodate the data from multiple source tables into one table.
The source tables might have different names or maybe need some conversions before they can fit within the standardized columns of the target table.
Let’s create a select statement, using CTEs, to showcase the expected results, adding any filtering conditions to focus on specific records and also doing any transformations or aliases for example:
;WITH ApplicationSales AS( SELECT ST.SalesHeaderId ,ST.SaleDate AS Date ,ST.CustomerKey as CustomerId ,ST.SaleAmount ,ST.createdBy ,ST.createdOn ,’Application’ AS [Source] FROM [Application].[SalesHeaderTransaction] ST WHERE YEAR(SaleDate) = ‘2020’ ), WebsiteSales AS( SELECT AT.HeaderId AS SalesHeaderId ,AT.FiscalDate ,AT.CustomerId , CONVERT(DECIMAL(16,4), AT.SaleAmount) AS SaleAmount ,AT.InputtedBy AS createdBy ,AT.InputtedOn AS createdOn ,’Website’ AS [Source] FROM [Website].[SalesHeaderTransaction] AT WHERE YEAR(FiscalDate) = ‘2020’ ), StoreSales AS( SELECT AT.SalesHeaderId ,AT.Date ,AT.CustomerId ,AT.Amount AS [SalesAmount] ,AT.createdBy ,AT.createdOn ,’Store’ AS [Source] FROM [Store].[SalesHeaderTransaction] AT WHERE YEAR(Date) = ‘2020’ )
Then in order to combine all the different CTEs holding all the data from the distinct transactional tables, another CTE can be used to UNION the tables together (now that we have same number of columns in each). The order of the full script can be seen below:
;WITH ApplicationSales AS( SELECT ST.SalesHeaderId ,ST.SaleDate AS Date ,ST.CustomerKey as CustomerId ,ST.SaleAmount ,ST.createdBy ,ST.createdOn ,’Application’ AS [Source] FROM [Application].[SalesHeaderTransaction] ST WHERE YEAR(SaleDate) = ‘2020’ ), WebsiteSales AS( SELECT AT.HeaderId AS SalesHeaderId ,AT.FiscalDate ,AT.CustomerId , CONVERT(DECIMAL(16,4), AT.SaleAmount) AS SaleAmount ,AT.InputtedBy AS createdBy ,AT.InputtedOn AS createdOn ,’Website’ AS [Source] FROM [Website].[SalesHeaderTransaction] AT WHERE YEAR(FiscalDate) = ‘2020’ ), StoreSales AS( SELECT AT.SalesHeaderId ,AT.Date ,AT.CustomerId ,AT.Amount AS [SalesAmount] ,AT.createdBy ,AT.createdOn ,’Store’ AS [Source] FROM [Store].[SalesHeaderTransaction] AT WHERE YEAR(Date) = ‘2020’ ), CombineCTE AS( SELECT * FROM ApplicationSales UNION ALL SELECT * FROM WebsiteSales UNION ALL SELECT * FROM StoreSales ) INSERT INTO [Sales].[HoldingTable]( Date, CustomerId, SaleAmount, createdBy, createdOn, Source) SELECT Date , CustomerId , SaleAmount , createdBy , createdOn , Source FROM CombineCTE ;
Such a script can be used in a stored procedure which can then be executed in either directly in SQL or even called through services like Azure Data Factory. This is just one-use case for the scenario and there are multiple reasons why CTEs can be beneficial or utilized as an alternative to Temporary Tables or sub-queries in SQL.
Introduction to Data Wrangler in Microsoft Fabric
What is Data Wrangler? A key selling point of Microsoft Fabric is the Data Science
Jul
Autogen Power BI Model in Tabular Editor
In the realm of business intelligence, Power BI has emerged as a powerful tool for
Jul
Microsoft Healthcare Accelerator for Fabric
Microsoft released the Healthcare Data Solutions in Microsoft Fabric in Q1 2024. It was introduced
Jul
Unlock the Power of Colour: Make Your Power BI Reports Pop
Colour is a powerful visual tool that can enhance the appeal and readability of your
Jul
Python vs. PySpark: Navigating Data Analytics in Databricks – Part 2
Part 2: Exploring Advanced Functionalities in Databricks Welcome back to our Databricks journey! In this
May
GPT-4 with Vision vs Custom Vision in Anomaly Detection
Businesses today are generating data at an unprecedented rate. Automated processing of data is essential
May
Exploring DALL·E Capabilities
What is DALL·E? DALL·E is text-to-image generation system developed by OpenAI using deep learning methodologies.
May
Using Copilot Studio to Develop a HR Policy Bot
The next addition to Microsoft’s generative AI and large language model tools is Microsoft Copilot
Apr