Modern Data Warehouse

A Guide to Azure SQL Data Warehouse

Posted on 27th April 2017 (updated 15th June 2022) by Simon Whiteley

So you’ve heard the hype – Azure SQL DW is going to solve all of your problems in one fell swoop… Right? Well… maybe. The system itself is a mix of technologies designed for low-concurrency analytics across huge amounts of relational data. In short, it’s a cloud-scalable, T-SQL-based MPP (massively parallel processing) platform, with all the benefits and restrictions that performing everything in parallel brings. If your problem can be solved by performing lots of calculations over small subsets of your data before aggregating the results into a whole, this is the technology for you.
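To make that "small subsets, then aggregate" idea a little more concrete, here is a rough Python sketch of the pattern. It is only an illustration, not how the engine is actually implemented: the function names (`distribute`, `partial_sum`, `combine`) and the sample columns are made up for this example, and plain Python lists stand in for the 60 distributions the service spreads each table across.

```python
from collections import defaultdict

# The service spreads every table across 60 distributions; everything else
# here (names, columns, data) is a hypothetical stand-in for illustration.
NUM_DISTRIBUTIONS = 60


def distribute(rows, key, n=NUM_DISTRIBUTIONS):
    """Assign each row to one of n buckets by hashing its key column."""
    buckets = [[] for _ in range(n)]
    for row in rows:
        buckets[hash(row[key]) % n].append(row)
    return buckets


def partial_sum(bucket, group_col, value_col):
    """Aggregate a single bucket on its own, as one parallel worker would."""
    totals = defaultdict(int)
    for row in bucket:
        totals[row[group_col]] += row[value_col]
    return totals


def combine(partials):
    """Merge the per-bucket results into the final answer."""
    final = defaultdict(int)
    for partial in partials:
        for group, value in partial.items():
            final[group] += value
    return dict(final)


# Made-up sample data: 1,000 sales rows across 7 customers and 2 regions.
rows = [
    {"customer_id": i % 7, "region": "north" if i % 2 else "south", "amount": i}
    for i in range(1000)
]

buckets = distribute(rows, key="customer_id")
partials = [partial_sum(b, "region", "amount") for b in buckets]
result = combine(partials)
print(result)
```

The partial sums never need to see each other’s rows, which is why this kind of work parallelises so well. A calculation that needs rows from different buckets side by side is a different story, as the data has to be moved around before the work can start.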

However, before you jump right in, be aware that SQLDW is a very different beast to other SQL tools. There are specific concepts you need to be familiar with before building your system; otherwise, you won’t see the promised performance gains and will likely lose faith very quickly!

I’m in the process of documenting these concepts, plus there is a wealth of information available on the Azure SQLDW site. For the next few months, I’ll be running through the blog topics below and updating the links accordingly. If there are topics you’d like me to add to the list, please get in touch!

Azure SQLDW Core Concepts:

– What is it?

– How Does Scaling Work?

– Distributions

– PolyBase Limitations

– PolyBase Design Patterns

– CTAS

– Resource Classes

– Partitioning

Designing ETL (or ELT) In Azure SQLDW

– Row counts

– Statistics

– Surrogate Keys

Performance Tuning Azure SQLDW

– Plan Exploration in SQLDW

– Data Movement Types

– Minimising Data Movement

Managing Azure SQLDW

– Backup & Restore

– Monitoring Distributions

– System Monitoring

– Job Orchestration

– Scaling and Management

– Performance Tuning Queries

Azure SQLDW Architecture

– Presentation Layers

– Data Lake Integrations
