In this blog I’ll give a light introduction to Azure Event Grid and demonstrate how it is possible to integrate the service in an modern data warehouse architecture.
Azure Event Grid is a fully managed event routing service that went into general availability on the 30th January 2018. With this service, we can subscribe to any event happening across our Azure resources and take advantage of serverless platforms like Azure Functions and Logic Apps to easily create serverless workflows. It has a built-in publish support for events with services like Blob Storage and Resource Groups and supports custom web hooks that can publish events to Azure and third -party services.
Following is a list of key terms from this service.
Events – In this context, an Event is a message that contains data describing what happened in the service. Eg. a new file was uploaded to a container in a Blob Storage. The event will contain information about the file, such as the name of the file.
Event Publishers – It is the source of the events published to the Event Grid. Following is a list of current and planned event publishers.
Available | Planned |
– Azure Subscriptions (management operations)
– Custom Topics – Event Hubs – IoT Hub – Resource Groups (management operations) – Storage Blob – Storage General-purpose v2 (GPv2) |
– Azure Automation
– Azure Active Directory – API Management – Logic Apps – IoT Hub – Service Bus – Azure Data Lake Store – Cosmos DB |
Topics – The endpoint where publishers send events
Event Subscriptions – Receives specific events from the Topic and sends them to the Event Handlers
Event Handlers – It is the receiver of the events subscribed by the event subscriptions. Following is a list of current and planned event handlers.
Available | Planned |
– Azure Automation
– Azure Functions – Event Hubs – Logic Apps – Microsoft Flow – WebHooks |
– Fabric Controller
– Service Bus – Event Hubs – Azure Data Factory – Storage Queues |
Azure Event Grid can be used as any other message queue service, however, the service stands when integrated in an event-based architecture. Let’s consider that we have an application that uploads csv files to a Blob Storage several times a day. As soon as a set of files are available, we want to move them to an Azure Data Lake Store (ADLS) to apply data transformations using Azure Data Lake Analytics (ADLA). Azure Data Factory (ADF) will be used to orchestrate the data movement. Finally, we need to extract complete data sets from the ADLS using the PolyBase features in Azure SQL DW and present them as tables to Azure Analysis Services (AAS). A Tabular model hosted in AAS will populate a set of Power BI reports.
In the diagram below we can see that we can subscribe the Blob Storage events using an Event Grid subscription and trigger a loading process using Logic App as soon as a file is uploaded to the Blob Storage.
Following I’ll detail how we can implement the sections surrounded by the red square.
Blob Storage
Blob storage events are only available in the Blob Storage and StorageV2 (general purpose v2) accounts. In this example, I created a Blob Storage account. Once the storage account was deployed I created the container to where the application was uploading the files.
Logic App
In Logic Apps we subscribed to the events from the Blob Storage and implemented a logic to validate if we had all the required files to start a new data load. If true, we called an Azure Function that triggered an ADF pipeline. An Azure Function was required because as of the time of writing, there wasn’t a Logic App connector to ADF. The ADF pipeline then executed a couple of U-SQL stored procedures that applied data transformations to the ingested data and created our dimension and fact files in the ADLS.
The following screens demonstrate how to create the logic app, the Azure Event Grid trigger and an overview of the workflow with all the requested steps.
When adding the Azure Even Grid trigger we are prompted to sign in. We should use the account with our subscription.
Once we hit save, we can go back to the blob storage account and find a new event grid subscription was created.
Final Considerations
Azure Event Grid uses a pay-per-event pricing model, meaning we only pay what we use. The first 100,000 operations per month are free and beyond that, £0.448 per million operations.
As always, if you have any queries or considerations do let me know.
Pareto Charts in Power BI and the DAX behind them
The Pareto principle, commonly referred to as the 80/20 rule, is a concept of prioritisation.
Apr
Databricks: Cluster Configuration
Databricks, a cloud-based platform for data engineering, offers several tools that can be used to
Apr
AI Assistance in Microsoft Fabric
The exponential growth of Large Language Models (LLMs) couples with Microsoft’s close partnership with OpenAI
Apr
10 reasons why it’s worth the effort to understand the value of your data
“If leaders really want to create a data driven culture, the journey starts with them!
Apr
Content Safety in Azure AI Studio
Azure AI Content Safety is a solution designed to identify harmful content, whether generated by
Apr
Model Benchmarks in Azure AI Studio
In the constantly changing field of artificial intelligence (AI) and machine learning (ML), choosing the
Apr
Celebrating International Women’s Day: from Classroom to Code
As we celebrate International Women’s Day, I want to share my journey of breaking stereotypes
Mar
Pretty Power BI – Adding Pagination to Bar Charts
Good User Experience (UX) design is crucial in enabling stakeholders to maximise the insights that
Feb