On a current client project we are taking files from an on-prem file server and uploading them to Azure Blob Storage using ADF V2. The files are GZip-compressed on-prem and need to be decompressed before they are placed in blob storage, where other processes will pick them up.
ADF V2 natively supports decompression of files, as documented at https://docs.microsoft.com/en-us/azure/data-factory/supported-file-formats-and-compression-codecs#compression-support. With this functionality, ADF should change the extension of the file when it is decompressed, so 1234_567.csv.gz would become 1234_567.csv. However, I’ve noticed that this doesn’t happen in all cases.
In our particular case, the file names and extensions of the source files are all uppercase, and when ADF uploads them it doesn’t alter the file extension. For example, if I upload 1234_567.CSV.GZ I get 1234_567.CSV.GZ in blob storage rather than 1234_567.CSV.
If I upload 1234_567.csv.gz, the functionality works correctly and I get 1234_567.csv in blob storage. This means that the file extension replacement is case sensitive when it should be case insensitive.
This bug isn’t a major issue for us, as the file is still decompressed and we can change the extension when we process the file further. However, it’s something that stumped me for a while.
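As a rough illustration of that downstream fix, here is a minimal Python sketch that strips a trailing .gz extension regardless of case. The function name `strip_gz_extension` is hypothetical and this is not part of ADF. It only works out the corrected name, and the actual rename or copy in blob storage would be handled by whichever process picks the file up.

```python
import re

# Hypothetical helper, not an ADF feature: matches a trailing ".gz"
# in any letter case (.gz, .GZ, .Gz, ...).
_GZ_SUFFIX = re.compile(r"\.gz\Z", re.IGNORECASE)


def strip_gz_extension(blob_name: str) -> str:
    """Return blob_name without a trailing .gz suffix, ignoring case.

    Names that do not end in .gz are returned unchanged.
    """
    return _GZ_SUFFIX.sub("", blob_name)


if __name__ == "__main__":
    for name in ("1234_567.CSV.GZ", "1234_567.csv.gz", "1234_567.csv"):
        print(name, "->", strip_gz_extension(name))
```

Using a case-insensitive match here mirrors the behaviour I would expect from ADF itself: the rest of the file name, including the case of .CSV, is left untouched.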
I’ve raised a bug at https://feedback.azure.com/forums/270578-data-factory/suggestions/34012312–bug-file-name-isn-t-changed-when-decompressing-f to get this fixed, so please vote for it. I’ll update this post once the issue has been resolved.