Migrating to Native Scoring with SQL Server 2017 PREDICT

Microsoft introduced native predictive model scoring with the release of SQL Server 2017.

The PREDICT function (Documentation) is now a native T-SQL function that eliminates having to score using R or Python through the sp_execute_external_script procedure. It’s an alternative to sp_rxPredict. In both cases you do not need to install R but with PREDICT you do not need to enable SQLCLR either – it’s truly native.

PREDICT should make predictions much faster as the process avoids having to marshal the data between SQL Server and Machine Learning Services (Previously R Services).

Migrating from the original sp_execute_external_script approach to the new native approach tripped me up so I thought I’d share a quick summary of what I have learned.

Stumble One:

Error occurred during execution of the builtin function 'PREDICT' with HRESULT 0x80004001. 
Model type is unsupported.

Reason:

Not all models are supported. At the time of writing, only the following models are supported:

  • rxLinMod
  • rxLogit
  • rxBTrees
  • rxDtree
  • rxdForest

sp_rxPredict supports additional models including those available in the MicrosoftML package for R (I was using attempting to use rxFastTrees). I presume this limitation will reduce over time. The list of supported models is referenced in the PREDICT function (Documentation).

Stumble Two:

Error occurred during execution of the builtin function 'PREDICT' with HRESULT 0x80070057. 
Model is corrupt or invalid.

Reason:

The serialisation of the model needs to be modified for use by PREDICT. Typically you might serialise your model in R like this:

model <- data.frame(model=as.raw(serialize(model, NULL)))

Instead you need to use the rxSerializeModel method:

model <- data.frame(rxSerializeModel(model, realtimeScoringOnly = TRUE))

There’s a corresponding rxUnserializeModel method, so it’s worth updating the serialisation across the board so models can be used interchangeably in the event all models are eventually supported.  I have been a bit legacy.

That’s it.  Oh, apart from the fact PREDICT is supported in Azure SQL DB, despite the documentation saying the contrary.