PowerBI Optimisation P3– Extracting and Source Controlling PowerBI Data Models

Source Control – once seen as “something proper developers do” – has been an integral part of the way business intelligence developers work for a long time now. The very idea of building a report, data model or database without applying some kind of source control actually pains me slightly.

However, there has been a push for “Self-Serve” reporting tools to strip out anything that looks remotely like a technical barrier for business users – This includes the ability to properly track changes to code.

We find ourselves in a very familiar situation – versions of PowerBI desktop files are controlled by including version numbers in file names. I’ve seen several copies of “Finance Dashboard v1.2.pbix”. This is obviously dangerous – who’s to say that someone didn’t open up the file, edit it and forget to increment the name. Once a file has been shared, there’s no controlling what changes happen at that point. If this happened to an SSIS package, for example, we would still be able to perform a code comparison. This would highlight differences between the two packages so we could accurately see what caused the changes. This is not currently possible with PBIX files in their entirety.

We can, however, compare the data model behind the file. This allows us to check for changes in business logic, amendments to DAX calculations, additions of new fields etc. If the performance of two PBIX files different drastically even if they were meant to be the same “version”, then this is a reasonable starting point!

Extracting the Data Model from a PowerBI PBIX File

Firstly, we need to extract the JSON that describes the Tabular Model embedded model (technically, this is TMSL, the tabular model scripting language, but it’s still JSON…)

We can do this by connecting to the model via SSMS. I’ve talked about the steps required to do this here.

So, assuming you have found your temporary SSAS port and connected via SSMS, you should see something like this:

image

As we would with any other Tabular model, you can right-click and script out the database as so:

image

If we do this to a new query window, you’ll see the various JSON objects that describe your PowerBI model:

image

This script contains the details for all tables, attributes, DAX measures etc required for your data model.

Comparing PowerBI Data Models

What if someone has been using a specific version of my PowerBI desktop file, but they’ve modified it and it has stopped working? For a Tabular model, I’d compare the model definition to source control which will automatically highlight any changes.

Now that we can script out our PowerBI model, we can apply the same principles. Say, for example, I make a couple of changes to my sample PowerBI report and want to figure out how it has changed compared to a baseline script I exported previously.

The easiest option is to use a tool like Textpad – here you can compare two text documents and it will highlight any differences it finds between the two. For example, I changed the name of a table and removed a column, the text comparison highlights this change as below:

image

I can now be confident that if someone sends me a PBIX file, I can check to see if there are any data model changes without having to manually eyeball the two side by side. This alone is a huge leap forward in manageability of models.

The next step would be to add this file to an actual Source Control provider, such as Visual Studio Team Services. This tool is free for the first 5 users and can be used with Visual Studio 2015 Community Edition – which is also free!

Essentially you would add this exported script to your source control directory each time you updated the model. By checking in your new model, you can compare previous versions, much like with the TextPad editor above.

Final Thoughts

In the end, this isn’t real, true Source Control. If you make a mistake, you can only view what the previous configuration was, you cannot roll back code directly into your PowerBI model. It is, however, a step towards managing PowerBI with a bit more discipline and rigour.

I don’t see this as a huge drawback as rumours on the wind are hinting at larger steps in this direction coming with future releases. Let’s hope we’re not having to work around these problems for much longer!