Handling Web Page Errors in Power BI Query Editor

I was recently asked if it was possible to handle Error Rows in the Query Editor and more specifically, to web pages that do not exist from source. The user broached me with this question after following a previous blog on looping through multiple web pages in Query Editor – found here.

In my blog post, there were no errors from the loop output but in the user’s workbook, they were losing data directly below the expanded error row. My first suggestion was to use the Remove Errors UI option, which would delete the problematic row. However, they wanted a way of treating the errors as an Unknown Member and did not want to lose the data entirely

It is assumed consumers of this blog already have knowledge of the Power BI Query Editor and the UI functions used in the examples.  

Scenario

I created a new workbook that connects to local Council Facebook pages. Each page has a unique Object ID, which will be used as the parameter in the loop. The Council “Camberley” deliberately contains an invalid Object ID. I then proceeded to create a Parameter and Function, replicating the exact steps from my previous blog.

When I invoke the function (through the use of a Custom Column), the following is produced:

clip_image001[4]

As you can see, Camberley Council produces an error in the fnInvokeCouncils column. If we expand the contents (highlighted in yellow), the Facebook page data appears. Upon further inspection, the Farnham and Rushmoor council data are available, but Camberley (incorrect Object ID) and Guildford are not.

clip_image003[4]

The error message is a little misleading but let’s save the debugging debate for another day. The key observation is “Guildford” data is not available, simply because it comes after “Camberley” in the list. Whilst we want to see errors in a Query, we do not want them causing data loss.

Resolution

As I mentioned at the beginning of this article, using the Remove Errors function would prevent the loss of Guildford data. However, the user needs to handle errors as Unknown Members and conform to a typical Kimball Data Warehouse. 

I am sure there are many ways to fulfil the requirement, but here is how I approached it:

1.       Duplicate the existing ‘Councils’ query, naming it ‘Councils Error Rows’.

2.       Switch back to the ‘Councils’ query and Remove Errors, leaving only three records:

 

clip_image005[4]

3.       Expand the fnInvokeCouncils column, opening up the underlying fields and data:


clip_image007[4]

4.       In the ‘Council Error Rows’ query, apply the Replace Errors UI function – inserting the string “Validation Failed”.

5.       Add a Custom Column, writing the following M:

if [fnInvokeCouncils] = “Validation Failed” then 1 else 0

This is a simple IF statement that sets the error rows to 1.

6.       Now filter the selection to only display ErrorRows with the value of 1. This is achieved by using the Filter Rows UI function. The ‘Council Error Rows’ query now looks like the following:


clip_image008[4]

7.       The columns must match the ‘Councils’ query, meaning 4 new Custom Columns are needed. We can hardcode the values and remove any unwanted columns.

 

clip_image010[4]

 

8.       Right click on the previously modified ‘Councils’ query select Reference. Name the query ‘All Councils’. This makes it easier to track the transformations and persists any future changes made to the raw data.  

9.       Within the ‘All Council’ query, select Append Query transformation. Choose ‘Council Error Rows’ as the table to append and click OK.
 

10.   We are now left with a Union of both datasets, containing the Unknown Member and data from all other Councils.

clip_image012[4]

11.   The Unknown Member record is visible within the final Query.

 

clip_image014[4]

 

Summary

I have shown you how to get around two different problems with Error Rows in the Power BI Query Editor. The first is how to retain all correct data, whilst the second is keeping error records and inserting them back into the dataset as an Unknown Member. Both methods are dynamic, meaning that if we added a new Council called ‘Basingstoke’, we would see the underlying data regardless of whether the underlying Facebook Object ID exists or not.

Whilst none of the transformations in this blog are overly technical, there are quite a few hoops to jump through to retain all data from a web page/Facebook. Having said that, I am sure there are a few other ways people could approach this problem. I would be really interested to speak to anyone who does have an alternative solution.

Further Reading

Query Editor Basics (Power BI blog) – http://bit.ly/2pwBdo1
Unknown Members in Data Warehousing –
http://bit.ly/2qTefwe
Loop through Multiple Web Pages using Power Query – http://bit.ly/2q3a8Nc

Contact Me

If you would like a copy of the workbook containing the examples or want to know more about the Query Editor within Power BI, please leave a comment below. My Twitter details are also provided.

Twitter:  @DataVizWhizz