What is Azure Data Explorer and Kusto Querying Language (KQL)?

Azure Data Explorer (ADE) put simply is a platform to analyse large amounts of data quickly and to scale up your efforts in minutes to accommodate any discoveries in relevant data insights.

Azure Data Explorer (ADE) has several methods for ingesting data from various sources, like social media platforms, IoT, CRM and has the ability to support high volumes of structured, semi structured and unstructured data for fast querying.

Azure Data Explorer integrates with other major services and can be an important part of the data warehousing workflow by executing the explore step of the workflow focusing on analysing a large amount of diverse raw data.

ADE Data warehouse Workflow

In order to query the data, you use Kusto Querying Language (KQL). KQL is for querying only and unlike SQL, you can not update or delete data using KQL.

You can also visualize your analysis through ADE either through the ‘render’ command in KQL or you can connect to PowerBI and output your findings that way.

KQL – Another query language?

Basically yes….. Kusto Query Language (KQL) is however very straight forward and easy to learn. It is the same language used in Azure Log Analytics and Application Insights so if you are already using it there then you won’t have any issues.

Your query starts easily with a reference to the table. You use this tabular data and run it through a set of statements connected by pipes to shape your data.
Below is an example from the Microsoft KQL documentation.

StormEvents
| where StartTime >= datetime(2007-11-01) and StartTime < datetime(2007-12-01)
| where State == “FLORIDA”
| count

How does it work?

  • ADE is a distributed cluster instance maintained by Microsoft.
  • In your cluster you need to host a database with tables. ADE does allow for dynamic column types meaning you can add just about anything into your tables.
  • In terms of security, you can use Azure Active Directory to assign permissions to clusters, databases and tables.
  • There are two methods to ingest data. Queued ingestion, where data is ingested asynchronously or direct ingestion where data is pushed into ADE immediately.
  • Use KQL to query your data.
  • Visualize it either through the ‘render’ command in KQL or through PowerBI

Why use Azure Data Explorer?

Microsoft already has a number of databases many of which have been around long enough to be good options to use, so why Azure Data Explorer?

Pros:

  • Fully managed service. No need to worry about infrastructure, making development easier and faster.
  • Highly scalable.
  • Very simply to push data to and quickly too with minimal set up.
  • Optimized for time series analysis.
  • Azure Data Explorer has been heavily used with in Microsoft internally for many years now, meaning the technology is relatively mature.
  • Azure Data Explorer is available in many Azure regions already.

Cons:

  • Pricing, if you don’t have a large amount of data, then there may be cheaper and better ways to analyse it.
  • Due to Azure Data Explorer being used only internally with in Microsoft, the community support around it can be a little light for the time being.
  • If you need to be able to edit your data, then other options like Cosmos may be better.

 

As of publishing date of this blog, ADE is in preview.

What next?  I will look to add to this blog with a working example of how to set up an Azure Data Explorer cluster, database and query using KQL.

Thanks,

Alex