Let’s kick this blog off by first making the distinction between Connectivity Types and Data Connectors, as its common to find these two terms being used interchangeably. Connectivity Types are referring to the methods available for establishing a connection and interacting with the underlying source. Whilst Data Connectors are simply the data sources that we can ingest or pull data from.
A single Data Connector can support multiple Connectivity Types. For instance, if we require to analyse data from SQL Server Analysis Services (Data Connector) we have the option to establish a connection through an Import or LiveConnection (Connectivity Types).
To make more sense of this, when we first open Power BI Desktop, we have an empty canvas presented to us, therefore we have no data to work with and no measures or visuals to analyse. To start our analysis, we must first select ‘Get Data’ which will display all the available Data Connectors. Once we select a data source from the list, as the below image shows, we will then have the option to select an available Connectivity Type. Keep in mind, if you select a data source that only supports one connectivity type, such as Excel, you won’t be given the option to select a connectivity type.
Connectivity Type breakdown
There are four connectivity types that can be used for interacting with the underlying data source, with each having its own benefits and limitations. Connectivity types can also alter the functionality available for use in Power BI Desktop, hence it is important to make sure we select the connectivity type most suited for our purpose early on.
Import is the most common connectivity type used in Power BI and the default selection when connecting to most data sources. The most important aspect to know about when using an Import is that all data ingested is stored in the Power BI model, therefore the PBIX file and hosted in memory. Storing and reading data from memory increases the overall performance of accessing and retrieving data, making Import the fastest of all connectivity types. This is all capable through the underlying engine which sits behind Power BI (and Analysis Services), the VertiPaq engine (also known as xVelocity) which is an in-memory columnar dataset that compresses and stores all the data in memory.
As the data is stored in the Power BI model when using Import, this means queries generated by the report consumers whilst interacting with the visuals from a report are all sent to the Power BI model and not back to the underlying data source. The only time Power BI will query the underlying data source whilst using an Import, is during the refresh process. This means the Power BI dataset must be scheduled to refresh for the reports using the dataset to have the most up to date information.
Using Import also offers the full capabilities of Power BI Desktop, therefore transforming and shaping data from multiple sources, modelling and joining data as well as all aspects of reporting and DAX functions. Also, with all the great performance benefits Import brings, we should always keep our memory usage in mind as there are limitations in regard to how large our Power BI dataset can be. With Pro licensing, the Power BI dataset cannot exceed 1GB after compression, however with Power BI Premium this can reach levels of 400GB using the Large Model option. For more information on licensing in Power BI, please read my blog here.
The most important aspect of a DirectQuery is that no data is ingested and stored in the Power BI model. This is the opposite behaviour of the Import connectivity type we explored above, which stores all data in the PBIX file and hosts the data in memory. Power BI will only store the metadata of the underlying data, therefore the table names, relationships, field names, etc., but not the actual data. By not storing any data in the Power BI model means you can store larger volumes of data and there is less chance that you will run into any of the data volume limitations against the Power BI dataset, such as the 1GB dataset that comes with Pro licensing. DirectQuery is mainly available with relational database sources, you can find the list here.
When interacting with Power BI reports that are sourcing data from a model using DirectQuery, queries will be generated and sent directly to the underlying data source to request data before displaying back in the visuals. The positive here is that the data remains on source side, near real time data is available and no scheduled refresh is required, but it can have an impact on the reporting experience for report consumers and on the performance of the underlying data source when too many users are generating queries at the same time. Tip: Power BI does have some Query Reduction methods to ease the number of queries to source.
Using DirectQuery limits the capabilities of Power BI Desktop. For instance, Time-Intelligence DAX functions such as same period last year are not available for the purpose of limiting the complexity of queries sent to the underlying data source and some transformations in Power Query are restricted, such as remove duplicates and changing to some data types. Also, Power BI only allows DirectQuery to return 1M rows from the underlying data source, unless we use Power BI Premium.
It is very common to find LiveConnection being confused with DirectQuery due to both connectivity types not storing any data in the Power BI model. However, they are two very different connectivity types and they cannot be used interchangeably. Using a LiveConnection means no data is stored in the Power BI model, therefore all interaction with a report using a LiveConnection will directly query the existing Analysis Services model.
LiveConnection can be used with SQL Server Analysis Services (Tabular models and Multidimensional Cubes), Azure Analysis Services (Tabular Models), and Power BI Datasets hosted in the Power BI Service. As these sources are analytical engines, the overall performance will be much higher compared to DirectQuery. Furthermore, all these sources can offer a semantic layer that acts as a single version of the truth, offering the business a golden layer of high integrity, well-governed data, therefore LiveConnection is commonly used in enterprise deployments of Power BI.
Using a LiveConnection does have some restrictions from an authoring perspective. However, as all the work has been done in Analysis Services or an existing Power BI Dataset, we are simply taking advantage of an existing enterprise-level model, therefore it makes sense that restrictions exist on data transformation and data modelling capabilities. Also, some limitations exist on the DAX measures that can be written as only Report Level Measures are available in LiveConnection. This means the measures are stored in the Power BI report and not written back to the Analysis Services model. All reporting capabilities are available for use in Power BI when using a LiveConnection.
Previously, we could not establish a connection to multiple data sources using DirectQuery or through the use of both DirectQuery and Import within a single Power BI report. However, with the Mixed connectivity type or Composite models, not only can we combine data from multiple DirectQuery data sources but mix data from a DirectQuery data source and an Import data source. So, we can establish a DirectQuery connection to the sales fact table and multiple dimension tables from a SQL Data Warehouse and ingest data using an Import from an Excel spreadsheet that contains budget values. This will be called a Composite model and in the Power BI Desktop application, you will notice the storage mode is set to Mixed in the bottom right, as the below shows.
A great benefit that comes with the mixed connectivity type is the ability to ingest additional data into your Power BI report that does not exist in your SQL Data Warehouse. It is very common to use a DirectQuery to a SQL Data Warehouse, for the benefits mentioned previously such as near real-time data and handling larger volumes of data, however, it is also very common for a single set of data to not be available in the enterprise data warehouse, such as the budget values. Even though this is required to answer crucial business questions, the BI team may not have the capacity and resources to ingest this data into the Warehouse. Before the mixed connectivity type came into play, the way around this may have been to simply Import all data from the data warehouse, however, this could be counter-intuitive for the reasons for choosing DirectQuery in the first place. The Mixed connectivity type resolves this.
We should be aware that using a Mixed connectivity type does limit some capabilities available in Power BI, the same way a single DirectQuery does. Just because we have the Mixed connectivity type available this does not mean that all limitations that currently come with DirectQuery have been lifted. So, tables in the Power BI report that have a DirectQuery to the source still have limitations around the time-intelligence DAX functions and many of the data transformations in Power Query.
Can we use LiveConnection with a Mixed (Composite) Connectivity Type?
Before answering the questions it’s important to reiterate that it is very common to use a LiveConnection in an enterprise scenario. A LiveConnection establishes a connection to a semantic layer such as Analysis Services or a Power BI dataset. In an enterprise scenario, it is important to have a semantic layer placed between your data source systems and Power BI, as it simplifies the data in the Data Warehouse by making it more usable for the business. Also, it aligns complex data with familiar business terms and removes the need to do any data preparation, therefore offering a complete & consolidated view across the organization.
We cannot yet use a LiveConnection to a centralized version of the truth and extend it with our own data through an Import connectivity type. However, this was highly requested by the Power BI community and it is now on the public Power BI roadmap, which you can read more about here. Once this is available, this will unlock many more self-service scenarios and certainly offer faster answers to business questions.
We should not have a good understanding of all the available Connectivity Types in Power BI, as well as the pros and cons of using one over the other.