Microsoft have now released the highly anticipated and detailed paper on performance tuning tabular models. It is available here
At over 180 pages and a high level of technicality it takes a bit of getting through, this blog aims to summarise a few of the points I feel are new to me and useful. I wont cover the DAX query plan section in detail here but its well worth a read. There is lots that i wont mention as I had already come across it elsewhere.
As such I will mostly be pointing at important sections in the document, all credit goes to the authors: John Sirmon, Greg Galloway, Cindy Gross, Karan Gulati
Query speed
- Avoiding error handling functions made easier with fourth parameter on SEARCH, FIND and the DIVIDE function (p47)
- Prefer ISEMPTY to ISBLANK (p50)
- Only evaluate many to many measures when you need to (p55):
- Train users of Power View and Excel to add measures first to avoid cross products of dimensions (p57)
- Beware of Excel compatibility mode and Pre 2010 Excel, this can send less efficient queries which return unnecessary sub total rows (p60)
- Excel slicers connected to a lot of pivots is a known performance issue (I thought it was just my cube!) (p64)
Processing
- The Process Data followed by process recalc pattern i knew about, however the paper also proposes that if you are overnight processing and users are not querying during your process then doing a process clear and then process full reduces your memory pressure as you don’t need both copies in memory at once. Great for those smaller environments. (p75)
- You can control how many connections can be opened against your data source for parallel processing of tables, default is 10 (p82)
- Data types and encoding, lots to consider here, generally favour currency and avoid stings where you could use int. (p87)
- Compression setting can be adjusted to fine tune query performance vs processing time (p89)
- To process tables in parallel you need to manually adjust the XMLA to include Parallel tags () around all processing commands (p95)
- You can query to check when a process defrag is necessary rather than it just being an overhead task (p116)
Server Settings
- Get all the latest cumulative updates, lots of fixes and performance improvements are not in SP1 (118)
- On large servers with a lot of RAM raise the Low Memory limit as the default allocation is probably too aggressive (p127)
That’s all I will list out for now but there is a lot in there
Thanks
Introduction to Data Wrangler in Microsoft Fabric
What is Data Wrangler? A key selling point of Microsoft Fabric is the Data Science
Jul
Autogen Power BI Model in Tabular Editor
In the realm of business intelligence, Power BI has emerged as a powerful tool for
Jul
Microsoft Healthcare Accelerator for Fabric
Microsoft released the Healthcare Data Solutions in Microsoft Fabric in Q1 2024. It was introduced
Jul
Unlock the Power of Colour: Make Your Power BI Reports Pop
Colour is a powerful visual tool that can enhance the appeal and readability of your
Jul
Python vs. PySpark: Navigating Data Analytics in Databricks – Part 2
Part 2: Exploring Advanced Functionalities in Databricks Welcome back to our Databricks journey! In this
May
GPT-4 with Vision vs Custom Vision in Anomaly Detection
Businesses today are generating data at an unprecedented rate. Automated processing of data is essential
May
Exploring DALL·E Capabilities
What is DALL·E? DALL·E is text-to-image generation system developed by OpenAI using deep learning methodologies.
May
Using Copilot Studio to Develop a HR Policy Bot
The next addition to Microsoft’s generative AI and large language model tools is Microsoft Copilot
Apr