Number 10 is hiring data scientists – but will they get the data they need?

A new, post-Brexit government is about to ramp up investment in data science. First it will need a way to get the right data ready for analysis.

The UK government is about to embark on a data science spending spree, according to a recent and much-discussed blog by Number 10 special adviser Dominic Cummings.

“There are some profound problems at the core of how the British state makes decisions,” Cummings wrote, noting that Brexit in particular “requires many large changes in policy and in the structure of decision-making”.

To accelerate policymaking in a post-EU world, he envisages hiring “an unusual set of people with different skills and backgrounds”, including data scientists and software developers, as well as a category of maverick thinkers and doers he terms “weirdos and misfits”.

Their brief will be to apply “data science, AI and cognitive technologies to […] ‘clusters’ of issues to improve policy and project management.” That will involve understanding what’s happening at the cutting edge of predictive science in other disciplines, and bringing new modelling and forecasting capabilities to “decision-making institutions at the apex of government.”

A seismic shift in Government data science strategy

If Number 10 succeeds in getting those skills and capabilities in place, it will represent a seismic shift in the way the government uses the data at its disposal.

Despite initiatives like the creation in 2011 of the Government Digital Service, and a drive to recruit data scientists in 2017-2018, Government currently lags far behind other sectors in using predictive modelling to inform decision-making.

Partly that’s due to the comparative lack of people with data science skills in central government departments. Research by the Parliament Street think tank found that the Department for Transport, for example, had ‘five or fewer’ staff with data management responsibilities in 2018.

But there’s also the fact that data science skills are hard to find in general. A 2018 study by Kyvos Insights found that 52% of organisations “lacked the skills to staff BI and big data projects,” according to Information Age.

Before you can apply data science, you need to have data

Assuming Government is successful in hiring the army of data scientists it needs to accelerate post-Brexit policymaking, another huge obstacle remains. Before the new hires can apply the science, they need the right data to apply it to.

Much of that data is fragmented across departments, often in systems that are decades old. Swathes of useful data are tied up in legacy systems that have no reporting capabilities and were never designed to provide data for analysis.

Within those systems, data exists in different formats using different taxonomies, and with huge amounts of duplication. A 2019 report by the National Audit Office found “more than 20 ways of identifying individuals and businesses across 10 departments and agencies.” It also noted that “a lack of common data models and standards within and between departments makes it difficult and costly to combine different sources of data.”

The NAO’s conclusion was that the fragmentation of data “makes it difficult for Government to maximise its data assets, for example by allowing thematic analysis across different sectors to help understand economic challenges or systemic problems.”

Collating and standardising data from multiple systems is a critical challenge

The first challenge, then, is to identify the systems where the useful data resides; find a way to extract it; clean it and standardise it; and then bring it into a central place for data scientists to work on it.

But that’s not all. The data science envisaged by Dominic Cummings doesn’t just involve applying new predictive models to historical data. It involves combining data from different departments and external agencies, and applying new models to these combined datasets to unearth hidden correlations.

It may also mean capturing and analysing fresh data as it’s created. If Number 10 wants to tackle the huge problems besetting the country’s rail network, for example, it may want the Department for Transport’s data scientists to apply their skills to real-time data about delays and cancellations.

A new approach to data science requires a new kind of data platform

All of this requires a new kind of data platform: one that’s able to standardise, deduplicate and pool data from many sources very quickly, possibly in near-real time.
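To make "standardise, deduplicate and pool" concrete, here is a minimal Python sketch of the idea. Everything in it is an assumption for illustration only: the two departments, their field names (`CompanyNo`, `crn` and so on), the eight-character company number format and the sample rows are all hypothetical, and a real platform would do this at scale with dedicated tooling rather than in-memory lists.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Business:
    """Hypothetical common record shape that every source is mapped onto."""
    company_number: str
    name: str
    source: str


def normalise_company_number(raw: str) -> str:
    """Remove spaces, upper-case, and left-pad with zeros to eight characters."""
    return raw.replace(" ", "").upper().zfill(8)


def normalise_name(raw: str) -> str:
    """Collapse repeated whitespace and apply one casing convention."""
    return " ".join(raw.split()).title()


def from_dept_a(row: dict) -> Business:
    # Hypothetical source A calls its fields "CompanyNo" and "Name".
    return Business(normalise_company_number(row["CompanyNo"]),
                    normalise_name(row["Name"]), "dept_a")


def from_dept_b(row: dict) -> Business:
    # Hypothetical source B calls the same fields "crn" and "business_name".
    return Business(normalise_company_number(row["crn"]),
                    normalise_name(row["business_name"]), "dept_b")


def pool(records):
    """Deduplicate: keep the first record seen for each standardised number."""
    pooled = {}
    for record in records:
        pooled.setdefault(record.company_number, record)
    return list(pooled.values())


# Made-up sample rows: the first row in each list is the same business,
# recorded differently by each department.
dept_a_rows = [{"CompanyNo": "1234567", "Name": "acme  widgets ltd"}]
dept_b_rows = [
    {"crn": "01234567", "business_name": "ACME WIDGETS LTD"},
    {"crn": "sc 123456", "business_name": "Highland Rail Services"},
]

standardised = ([from_dept_a(r) for r in dept_a_rows]
                + [from_dept_b(r) for r in dept_b_rows])
pooled = pool(standardised)

for business in pooled:
    print(business)
```

The design choice worth noting is that deduplication only works after standardisation: the two "Acme" rows match because both are first mapped to the same identifier format. With more than 20 ways of identifying individuals and businesses in play, agreeing that common format is likely to be the harder part of the job.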

If the government’s data science push is to be successful, a modern data platform – with a data lake allowing rapid access to fresh datasets – is an essential requirement before any predictive models can be applied.

Building such platforms is exactly the kind of work that Adatis undertakes, for both government and private sector organisations. For the Cabinet Office, for example, we built a platform that pools data from multiple source systems to create the Contracts and Spend Insight Engine (CaSIE), an analytical tool that provides unprecedented levels of insight into billions of pounds’ worth of central government contract spending.

Find out more about building a modern data warehouse with Adatis

The good news is that the technologies that support large-scale data collation, data analytics and visualisation have never been more accessible or affordable. To learn more about how Adatis works with Government to build modern data management and analytics platforms, download our Government white paper or get in touch.