I recently had to import a CSV file into Power Query. The file had some summary data at the beginning that I wanted to skip. This is easy enough using the Remove Top Rows step. However, I soon found that if the number of summary rows varies from file to file, a Remove Top Rows step with a fixed row count produces errors, especially if you then promote a row to be your header in a later step.
To get around this you can search for a particular string that is expected to appear, perhaps one that will become one of your headers or signify the start of a section of the file.
In this example I am using an Azure Usage file. I want to cut out the summary data at the top and start with the Daily Usage data.
Below is an extract from a larger query, starting at one of the file import steps:
1. Source = Csv.Document(File.Contents("NameRemoved.csv"),null,",",null,1252),
2. #"FilteredRows" = Table.SelectRows(Source, each Text.Contains([Column1], "Daily Usage")),
3. #"Position of Daily Usage" = Table.PositionOf(Source, #"FilteredRows"{0}),
4. #"TopRemoved" = Table.Skip(Source, (#"Position of Daily Usage" + 1)),
5. #"First Row as Header" = Table.PromoteHeaders(#"TopRemoved"),
Note that steps 2, 3 and 4 all reference Step 1 (Source) rather than only the step immediately before them. These steps can be added using the Advanced Editor.
Breaking it down, starting with Step 2 (#"FilteredRows"): this filters a particular column, in this case Column1, for the string you are looking for, e.g. "Daily Usage". Table.SelectRows returns a new table containing only the rows where that string is found.
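Step 3 (#"Position of Daily Usage") then finds the position of the first row of the Step 2 table within the imported data from Step 1. Table.PositionOf takes the table to search as its first argument and the row to look for, as a record, as its second. The {0} selects the first row of the filtered table, and the position returned is zero-based.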
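To see what Table.PositionOf returns on its own, here is a minimal sketch using a small inline table. The table and its values are made up purely for illustration and are not part of the Azure Usage file.

```powerquery
let
    // Hypothetical three-row table, for illustration only
    Sample = #table({"Column1"}, {{"a"}, {"b"}, {"c"}}),
    // The row to look for is passed as a record; positions are zero-based
    Found = Table.PositionOf(Sample, [Column1 = "b"]),
    // A row that is not in the table gives -1
    Missing = Table.PositionOf(Sample, [Column1 = "z"])
in
    [Found = Found, Missing = Missing]
```

Here Found is 1, because "b" is the second row, and Missing is -1.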
The row number is then used in Step 4 (#"TopRemoved") as the count for Table.Skip. Because the position is zero-based, skipping that many rows would leave the "Daily Usage" row as the first row. In this example I wanted to remove that row as well, so I added + 1.
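If you want to try the technique without a file to hand, the sketch below runs the same steps against an inline table. The sample rows, the column names and the step name MarkerPosition are assumptions made for this example; only the sequence of functions comes from the query above.

```powerquery
let
    // Hypothetical data standing in for the imported CSV:
    // two summary rows, the "Daily Usage" marker row, a header row, then data
    Source = #table(
        {"Column1", "Column2"},
        {
            {"Summary", "Value"},
            {"Total", "123"},
            {"Daily Usage", ""},
            {"Date", "Units"},
            {"2015-01-01", "4"},
            {"2015-01-02", "7"}
        }
    ),
    // Keep only the rows whose Column1 contains the marker text
    FilteredRows = Table.SelectRows(Source, each Text.Contains([Column1], "Daily Usage")),
    // Zero-based position of the first matching row within Source
    MarkerPosition = Table.PositionOf(Source, FilteredRows{0}),
    // Skip everything up to and including the marker row
    TopRemoved = Table.Skip(Source, MarkerPosition + 1),
    // The row after the marker becomes the header
    Result = Table.PromoteHeaders(TopRemoved)
in
    Result
```

With this sample data the marker sits at position 2, so three rows are skipped and the result has the columns Date and Units with two data rows. Note that FilteredRows{0} raises an error if no row contains the marker text, so the approach relies on the string always being present in the file.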