Dynamic Top N rows in Power Query

Posted on 25th September 2014 (updated 4th May 2020) by Dan Evans

I recently had a situation where I was importing a CSV file into Power Query. The file had some summary data at the beginning that I wanted to skip. This is easy enough using the Remove Top Rows step. However, I soon found that if the number of summary rows varied between files, the fixed row count in Remove Top Rows produced errors, especially when a later step promotes a row to headers.

To get around this you can search for a particular string that is expected to appear, perhaps one that will become one of your headers or signify the start of a section of the file.

In this example I am using an Azure Usage file. I want to cut out the summary data at the top and start with the Daily Usage data.

Below is an extract from a larger query, starting at one of the file import steps:

1. Source = Csv.Document(File.Contents("NameRemoved.csv"), null, ",", null, 1252),

2. #"FilteredRows" = Table.SelectRows(Source, each Text.Contains([Column1], "Daily Usage")),

3. #"Position of Daily Usage" = Table.PositionOf(Source, #"FilteredRows"{0}),

4. #"TopRemoved" = Table.Skip(Source, #"Position of Daily Usage" + 1),

5. #"First Row as Header" = Table.PromoteHeaders(#"TopRemoved"),

Note that steps 2, 3 and 4 all reference Step 1 (Source). These steps can be added using the Advanced Editor.

Breaking it down, starting with Step 2 (#"FilteredRows"): this filters a particular column, in this case Column1, for the string you are looking for, e.g. "Daily Usage". Table.SelectRows returns a table containing only the matching rows.

Step 3 (#"Position of Daily Usage") then finds the position of the first row of the table from Step 2 within the imported data from Step 1. Table.PositionOf takes the table to search and the row to look for, given as a record. The {0} selects the first row of the Step 2 table as that record.
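To see the row lookup from Step 3 in isolation, here is a minimal sketch that can be pasted into a blank query. The Sample table and its values are made up for illustration and are not taken from the Azure Usage file.

```powerquery
let
    // Hypothetical sample table standing in for the imported CSV
    Sample = #table({"Column1"}, {{"Summary"}, {"Daily Usage"}, {"Date"}}),
    // Keep only the rows whose Column1 contains the marker text
    Matches = Table.SelectRows(Sample, each Text.Contains([Column1], "Daily Usage")),
    // {0} returns the first row of the filtered table as a record
    FirstMatch = Matches{0},
    // Zero-based position of that record within the original table
    Position = Table.PositionOf(Sample, FirstMatch)
in
    Position // evaluates to 1
```

Because positions are counted from zero, the marker in the second row is reported as position 1.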

The row number is then used in Step 4 (#"TopRemoved"), in the Table.Skip function. Table.PositionOf returns a zero-based position, so skipping that many rows would leave the "Daily Usage" row at the top. In this example I wanted to remove that row as well, so I added + 1.
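The steps above could also be wrapped into a reusable function. The following is only a sketch under some assumptions: the function name SkipToMarker, its parameter names, and the sample data are all hypothetical, and the marker is assumed to appear in a column called Column1.

```powerquery
let
    // Hypothetical helper: drops everything up to and including the first row
    // whose Column1 contains the marker text, then promotes the next row to headers
    SkipToMarker = (source as table, marker as text) as table =>
        let
            FilteredRows = Table.SelectRows(source, each Text.Contains([Column1], marker)),
            Position = Table.PositionOf(source, FilteredRows{0}),
            TopRemoved = Table.Skip(source, Position + 1)
        in
            Table.PromoteHeaders(TopRemoved),

    // Made-up sample data standing in for the imported CSV
    Source = #table(
        {"Column1", "Column2"},
        {
            {"Statement", "n/a"},
            {"Total", "12"},
            {"Daily Usage", ""},
            {"Usage Date", "Consumed"},
            {"01/09/2014", "5"},
            {"02/09/2014", "7"}
        }
    ),
    Result = SkipToMarker(Source, "Daily Usage")
in
    Result
```

Adding or removing summary rows above "Daily Usage" in the sample data should leave the result unchanged, which is the point of looking the position up rather than hard-coding it. Note that if no row contains the marker, FilteredRows{0} raises an error, so a file without the expected text will fail at that step rather than silently returning the wrong rows.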