Microsoft introduced native predictive model scoring with the release of SQL Server 2017.
The PREDICT function (Documentation) is now a native T-SQL function that eliminates having to score using R or Python through the sp_execute_external_script procedure. It’s an alternative to sp_rxPredict. In both cases you do not need to install R but with PREDICT you do not need to enable SQLCLR either – it’s truly native.
PREDICT should make predictions much faster as the process avoids having to marshal the data between SQL Server and Machine Learning Services (Previously R Services).
Migrating from the original sp_execute_external_script approach to the new native approach tripped me up so I thought I’d share a quick summary of what I have learned.
Stumble One:
Error occurred during execution of the builtin function 'PREDICT' with HRESULT 0x80004001.
Model type is unsupported.
Reason:
Not all models are supported. At the time of writing, only the following models are supported:
- rxLinMod
- rxLogit
- rxBTrees
- rxDtree
- rxdForest
sp_rxPredict supports additional models including those available in the MicrosoftML package for R (I was using attempting to use rxFastTrees). I presume this limitation will reduce over time. The list of supported models is referenced in the PREDICT function (Documentation).
Stumble Two:
Error occurred during execution of the builtin function 'PREDICT' with HRESULT 0x80070057.
Model is corrupt or invalid.
Reason:
The serialisation of the model needs to be modified for use by PREDICT. Typically you might serialise your model in R like this:
model <- data.frame(model=as.raw(serialize(model, NULL)))
Instead you need to use the rxSerializeModel method:
model <- data.frame(rxSerializeModel(model, realtimeScoringOnly = TRUE))
There’s a corresponding rxUnserializeModel method, so it’s worth updating the serialisation across the board so models can be used interchangeably in the event all models are eventually supported. I have been a bit legacy.
That’s it. Oh, apart from the fact PREDICT is supported in Azure SQL DB, despite the documentation saying the contrary.
Meet the Team – Jason Bonello, Senior Consultant
Meet Jason Bonello! Jason has been with us for just over two years and works
Apr
Meet the Team – Matt How, Principal Consultant
Next up in our series of meet the team blogs is Matt How. Matt has
Apr
MLFlow: Introduction to MLFlow Tracking
MLFlow is an open-source MLOps platform designed by Databricks to enable organisations to easily manage
Apr
Adatis are pleased to announce expansion plans into India
Adatis has offices in London, Surrey and Bulgaria – and has some big expansion plans
Mar
Querying and Importing Data from Excel to SSMS
Introduction There are couple of ways to import data from Excel to SSMS – via
Mar
Data Engineering for Graduates and New Starters
This blog aims to give potential graduates and other new starters in the industry some
Mar
Passing DP-900 Azure Data Fundamentals
Back in December, I took the DP-900 Azure Data Fundamentals exam which is one of
Feb
Real-time Dashboards Provide Transparency to Everyone
Real-time dashboards enable data analytics firm Adatis to be agile and transparent with its team
Feb