The Problem
Our Lake folder structure means that there is a folder per year/month/day for files. A Data Lake USQL Procedure is placing files in an incorrect directory – the directory for the day before the day that the job actually executes on. I am in the UK so my current time zone is GMT +1 British Summer Time (BST).
The Why
The DateTime.UtcNow() function is without daylight savings. The server that runs the job runs at half midnight and during daylight savings this means that the USQL runs as if it was 11:30 the previous day. Also, the DateTime.Now() function for our Data Lake running in North Europe, returns it’s time zone as UTC-7:00 !
The Proof
The following script…
DECLARE @outputLocation string = “RAW/DT.csv”;
DECLARE @now string = DateTimeOffset.Now.ToString();
DECLARE @nowUTC string = DateTimeOffset.UtcNow.ToString();
@output = SELECT * FROM (VALUES(@now, @nowUTC)) AS DT(Now, Utc);
OUTPUT @output
TO @outputLocation
USING Outputters.Text(outputHeader : true, delimiter: ‘¬’, quoting: true );
…when run at 10:45:53, returns the following result…
“Now”¬”Utc”
“4/17/2018 2:45:53 AM -07:00″¬”4/17/2018 9:45:53 AM +00:00”
Using datatimeoffset instead of just datetime allows you to display the time offset. You can see the –07:00 as the output for the DateTime.Now function.
The Solution
We need to force the time zone returned to be GMT that is day light savings time (DST) sensitive. I also split it into Year, Month and Day parts.
The code is also below for convenience:
DECLARE @outputLocation string = “RAW/DT.csv”;
DECLARE @adjustment string = TimeZoneInfo.ConvertTime(DateTime.Now, TimeZoneInfo.FindSystemTimeZoneById(“GMT Standard Time”)).ToString();
DECLARE @month string = TimeZoneInfo.ConvertTime(DateTime.Now, TimeZoneInfo.FindSystemTimeZoneById(“GMT Standard Time”)).Month.ToString(“00”);
DECLARE @day string = TimeZoneInfo.ConvertTime(DateTime.Now, TimeZoneInfo.FindSystemTimeZoneById(“GMT Standard Time”)).Day.ToString(“00”);
DECLARE @year string = TimeZoneInfo.ConvertTime(DateTime.Now, TimeZoneInfo.FindSystemTimeZoneById(“GMT Standard Time”)).Year.ToString();
@output = SELECT * FROM (VALUES(@adjustment, @month, @day, @year)) AS Z(adjustment, [month], [day], [year]);
OUTPUT @output
TO @outputLocation
USING Outputters.Text(outputHeader : true, delimiter: ‘¬’, quoting: true );
This should now be correct, as you can see from the output below from when I ran this job at 14:55 BST:
BUT WAIT
Turns out this isn’t the end of it. I jumped the gun slightly when publishing this as there are a few issues that mean this solution does not work…
Once you try and use a variable like those above to set an output path, the job fails:
It decides that it cannot apply constant folding to the variables and falls over. Unfortunately I could not find a way to get around this issue whilst still using the TimeZoneInfo functions. It seems that wrapping the DateTime.Now function in this prevented constant folding from happening. If you need to test whether something can constant fold, set your variables to be declared as DECLARE CONST. This way, when you test you can determine whether it does or not before you try and use the variable somewhere like the output statement.
Constant folding is the process of recognizing and evaluating constant expressions at compile time rather than computing them at runtime. The wikipedia article on this is good at explaining it: https://en.wikipedia.org/wiki/Constant_folding
Basically now I had to find how to get around this and it seemed after many attempts that I had to abandon anything involving timezoneinfo as this stopped constant folding. The end result is something not ideal:
DECLARE @outputLocation string = “RAW/ZS.csv”;
DECLARE CONST @nowtime string = DateTime.Now.ToString();
DECLARE CONST @isDaylight int = Convert.ToInt32(DateTime.Now.IsDaylightSavingTime()) + 7;
DECLARE CONST @day string = DateTime.Now.AddHours(@isDaylight).Day.ToString(“00”);
DECLARE CONST @month string = DateTime.Now.AddHours(@isDaylight).Month.ToString(“00”);
DECLARE CONST @year string = DateTime.Now.AddHours(@isDaylight).Year.ToString();
DECLARE CONST @truedate string = DateTime.Now.AddHours(@isDaylight).ToString();
@output = SELECT * FROM (VALUES(@nowtime,@isDaylight, @day,@month, @year,@truedate)) AS Z(nowtime, isDaylight, day, month, year, truedate);
OUTPUT @output
TO @outputLocation
USING Outputters.Text(outputHeader : true, delimiter: ‘¬’, quoting: true );
Here you can see I merely detect daylight savings using IsDaylightSavingTime() and use the AddHours() function to force it to the equivalent of GMT. It’s dirty but it gets the time I want. This relies on me knowing that the time difference is 7 hours to UTC/GMT and manually adding 7 hours plus an extra 1 if daylight savings. After much frustration, this is best I can do until the –7 hours problem is fixed. This solution works with constant folding as well.
Meet the Team – Jason Bonello, Senior Consultant
Meet Jason Bonello! Jason has been with us for just over two years and works
Apr
Meet the Team – Matt How, Principal Consultant
Next up in our series of meet the team blogs is Matt How. Matt has
Apr
MLFlow: Introduction to MLFlow Tracking
MLFlow is an open-source MLOps platform designed by Databricks to enable organisations to easily manage
Apr
Adatis are pleased to announce expansion plans into India
Adatis has offices in London, Surrey and Bulgaria – and has some big expansion plans
Mar
Querying and Importing Data from Excel to SSMS
Introduction There are couple of ways to import data from Excel to SSMS – via
Mar
Data Engineering for Graduates and New Starters
This blog aims to give potential graduates and other new starters in the industry some
Mar
Passing DP-900 Azure Data Fundamentals
Back in December, I took the DP-900 Azure Data Fundamentals exam which is one of
Feb
Real-time Dashboards Provide Transparency to Everyone
Real-time dashboards enable data analytics firm Adatis to be agile and transparent with its team
Feb