I had a situation recently where I was importing a CSV file into Power Query. This file had some summary data at the beginning of the file that I wanted to skip. This is perfectly easy enough using the Remove Top Rows step. However a problem I soon encountered was that if the files summary data varied in the number of rows then the Remove Top Rows step produced errors, especially if you then want to promote a row to be your header in a later step.
To get around this you can search for a particular string that is expected to appear, perhaps one that will become one of your headers or signify the start of a section of the file.
In this example I am using an Azure Usage file. I want to cut out the summary data at the top and start with the Daily Usage data.
Below is an abstract of a large query, starting at one of the file import steps:
1. Source = Csv.Document(File.Contents(“NameRemoved.csv”),null,”,”,null,1252)
2. #”FilteredRows” = Table.SelectRows(Source, each Text.Contains([Column1], “Daily Usage”)),
3. #”Position of Daily Usage” = Table.PositionOf(Source, FilteredRows {0}),
4. #”TopRemoved” = Table.Skip(Source, (#”Position of Daily Usage” + 1)),
5. #”First Row as Header” = Table.PromoteHeaders(#”TopRemoved”),
Pay attention to where steps 2, 3 and 4 all reference the step 1. These steps can be added using the advanced query editor.
Breaking it down by steps starting with Step 2 (#”FilteredRows”); this filtered a particular column, in this case column 1, by the string you are looking for, e.g. “Daily Usage”. The result of this is inserted into a table using Table.SelectRows.
Step 3 (#”Position of Daily Usage”) then finds the position of the 1st row of the table from Step 2, within the imported data in Step 1. Table.PositionOf requires a table as the input. The {0} denotes the first row in the table.
The row number is then used in Step 4 (#”TopRemoved”), in the Table.Skip function. In this example I wanted to remove the row with the Column 1 String “Daily Usage” so I added a + 1.
Pareto Charts in Power BI and the DAX behind them
The Pareto principle, commonly referred to as the 80/20 rule, is a concept of prioritisation.
Apr
Databricks: Cluster Configuration
Databricks, a cloud-based platform for data engineering, offers several tools that can be used to
Apr
AI Assistance in Microsoft Fabric
The exponential growth of Large Language Models (LLMs) couples with Microsoft’s close partnership with OpenAI
Apr
10 reasons why it’s worth the effort to understand the value of your data
“If leaders really want to create a data driven culture, the journey starts with them!
Apr
Content Safety in Azure AI Studio
Azure AI Content Safety is a solution designed to identify harmful content, whether generated by
Apr
Model Benchmarks in Azure AI Studio
In the constantly changing field of artificial intelligence (AI) and machine learning (ML), choosing the
Apr
Celebrating International Women’s Day: from Classroom to Code
As we celebrate International Women’s Day, I want to share my journey of breaking stereotypes
Mar
Pretty Power BI – Adding Pagination to Bar Charts
Good User Experience (UX) design is crucial in enabling stakeholders to maximise the insights that
Feb