
Opening up the frontiers of public sector data

08/12/22

Mark Say, Managing Editor


Alison Pritchard
Image source: GOV.UK, Open Government Licence 3.0

Interview: Deputy national statistician Alison Pritchard talks about new possibilities and the work of the Office for National Statistics

D is for data, but Alison Pritchard has been emphasising the importance of three Ls to go with it – linked, local and longitudinal.

The deputy national statistician and director general for data capability at the Office for National Statistics (ONS) flagged up the importance of the concepts in September and, in an interview with UKAuthority, explains that they should underpin a lot of the work done with data by ONS and across the public sector.

“The more I think about them the more they do reflect the biggest leap forward in what we are going to do with data,” she says.

She explains that the focus on linked data comes from the public sector having moved on from bilateral sharing between organisations to a position in which an array of data assets can be linked to each other. This can provide insights and show exactly how resources should be targeted, as was demonstrated to great effect during the Covid-19 pandemic, and it has terrific potential for a wide range of services.

The emphasis on local reflects a rising intent to provide sub-national, geographically defined data down to a granular level.

Policy dimensions and impacts

“The levelling up agenda really brought that to the fore,” Pritchard says. “We need to be able to show how policy dimensions in one postcode area can have policy impacts above those in another, because it turns out areas of deprivation sit cheek by jowl with those that are better off, and that level of local focus is really important.”

Another element is that people have their own experiences of a social or economic trend – she cites the example of inflation having a very different effect on people with high or low disposable incomes – and ‘local’ should involve attributes that make it possible to reflect this.

Thirdly, longitudinal – data collected over a period of time – comes from beginning to build data assets before there is an immediate need for them. If the right choices are made this can prove highly valuable in fields such as policy evaluation, data modelling and the development of digital twins.

In this case there will have to be choices over what types of data to collect, and those choices will have to change over time.

“It takes a lot of methodological skill to be able to manage that,” Pritchard says. “First you have a core component of the data asset. When you are doing something like GDP or prices, they can adjust over time but you don’t want to be doing it on a repeated basis. And you can gather additional information or collect other pieces.

“As long as you have a consistent core you can give yourself flexibility around the edges.”
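As an illustration of the “consistent core plus flexible edges” idea – a sketch only, not an ONS design, and the field names are hypothetical – a longitudinal asset can fix a small set of core variables that stay stable across every collection wave, while optional attributes are added or retired over time:

```python
from dataclasses import dataclass, field
from typing import Any

# Core variables are fixed across every collection wave so long-run
# comparisons stay valid; the "extras" edge can change between waves.
CORE_FIELDS = ("record_id", "wave", "region_code", "collected_on")

@dataclass
class LongitudinalRecord:
    record_id: str
    wave: int
    region_code: str          # sub-national geography, e.g. a postcode district
    collected_on: str         # ISO date of collection
    extras: dict[str, Any] = field(default_factory=dict)  # flexible edge

def consistent_core(record: LongitudinalRecord) -> dict[str, Any]:
    """Return only the stable core, regardless of what extras a wave added."""
    return {name: getattr(record, name) for name in CORE_FIELDS}

# Example: wave 3 starts collecting a new attribute without breaking the core.
r = LongitudinalRecord("p-001", 3, "SW1A", "2022-12-01",
                       extras={"home_heating_fuel": "gas"})
print(consistent_core(r))
```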

Climate change indicators

She says that choosing datasets for the core should not be as difficult as it may seem at first sight, citing the example of climate change. ONS has been working with a team at Cambridge University which has identified four key indicators that can be related to a lot of other factors: a reduction in flying; a reduction in eating ruminants; the decarbonisation of home heating; and the decarbonisation of personal vehicles.

“That’s the kind of critical thinking we can do, make sure those are prominent as we go forward,” she says.

There are different views on how much of the total a consistent core should account for, but she believes it should be more than half.

It leads into a question of whether government’s existing data sources will be sufficient to deal with emerging challenges around issues such as climate change, economic supply chains and the pressures on the care system. Pritchard says most of the data needed for policy development and evaluation is already there, but there are significant gaps in what is made available, notably in data collected for regulatory purposes. There can be a lot of value in this but so far it has not been widely shared and integrated.

“It’s in a specific environment,” she says. “The department may have sufficient visibility of it, but we don’t make broader use of the data. It’s all out there but not yet sufficiently linked and joined.”

Back burner

This is acknowledged in many organisations, but it is pushed onto the back burner by the pressures of meeting routine demands and by the fact that they often focus on collecting data for statutory purposes and do not go much further.

“I have to give colleagues credit that we do know it has broader value, but have often set it up in ways where we have not optimised it,” she says. “For example, there are often regulatory rules around what data licensees must provide, but it will be very much framed around the management and regulation of the licences, rather than around things like whether food data could support climate change evaluation and formulation of policy.”

She also relates the issue to risk management, saying that a “hard edged” model is often applied to the handling of data, that there is always a risk when it goes beyond an organisation’s boundaries, and that this could require new approaches to information governance.

“We have become very risk averse to sharing data for several good reasons, but as a consequence I think we are starting to overlook risks of not using data.

“One of the things I think we will have to do more is look at how we govern data across boundaries rather than just in individual stovepipes. As you start to link data and create integrated assets you don’t really have the means for doing governance around it at the moment.”

Untapped sources

Asked about specific untapped data sources, she cites the example of travel and tourism, pointing out that ONS has been compiling the International Passenger Survey since 1961 but there could well be more information on the details of holidays and how much people spend on them that could have a social value.

That leads to a question about how far ONS can go in collecting the data.

“By law we have rights to ask for data for use in the public good, but you use those powers appropriately and carefully, and we already have some useful agreements and relationships with external providers,” she says.

“We have just enhanced our data acquisition capacity under the new chief data officer, whose job it is to do more acquisition of data assets. Think of ONS as having shifted from being fundamentally a data production house to a data laden organisation. It means our mindset had to change around where we find data assets.”

This comes with the issues of data governance and public perceptions. Pritchard says that managing data appropriately is “part of our DNA”, characterised by de-identification and aggregation, and that ONS makes a continued effort to inform the public of why it collects their data. She says it has paid a dividend in high levels of public confidence in how it uses data, and agreement with the idea of an independent body to oversee the production and use of statistics. But there is more work to be done.
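A minimal sketch of the kind of aggregation and de-identification Pritchard describes is shown below. It is illustrative only: the column names, the suppression threshold and the pandas approach are assumptions, not ONS practice. The idea is to drop direct identifiers, group records by area, and suppress small counts before anything is released.

```python
import pandas as pd

# Toy microdata: a direct identifier plus an attribute of interest.
microdata = pd.DataFrame({
    "name":      ["A", "B", "C", "D", "E"],           # direct identifier
    "area_code": ["E01", "E01", "E01", "E02", "E02"],
    "in_work":   [True, True, False, True, False],
})

SUPPRESSION_THRESHOLD = 3  # hypothetical minimum cell size before publication

def aggregate_and_deidentify(df: pd.DataFrame) -> pd.DataFrame:
    deidentified = df.drop(columns=["name"])           # remove identifiers
    table = (deidentified
             .groupby("area_code")
             .agg(people=("in_work", "size"), employed=("in_work", "sum"))
             .reset_index())
    # Suppress cells too small to publish safely.
    return table[table["people"] >= SUPPRESSION_THRESHOLD]

print(aggregate_and_deidentify(microdata))
```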

Perspectives on value

“When you start to ask people about their understanding of how data is used they recognise there is value to community and the nation but less that there is value to themselves. That’s a challenge we still have to overcome.”

There is also work to be done on a broader scale – beyond the immediate responsibilities of ONS – to help public sector bodies in freeing up more data.

“I think there are two things that are behind the curve and we will have to bash: common technical architecture and common data architecture,” Pritchard says. “We’ve let this stuff develop organically over time, done great digital transformation, especially in the health sector, but look at health and government and they are two massive domains.

“We can join them up but have to work hard at it. Leave aside legal gateways and stuff like that, and look at the pure technology of making those data assets work.

“I would really like to see an enterprise data model for government that goes beyond the usual standards of what data should look like to a data model where we can work out where the joins are across government.

“It’s like the longitudinal argument; you have to start somewhere even if you don’t get the benefits on day two. We’ve got the means and skill of doing it, businesses are pretty good at doing it because they have a strong federated model, but we don’t have that model in government.”
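To make the idea of “working out where the joins are” concrete, here is a very small sketch of what a shared data model might look like. The datasets, owners and join keys are invented for illustration and do not describe any real government schema: each dataset simply declares which shared entities it carries, so possible joins can be discovered rather than negotiated dataset by dataset.

```python
# Hypothetical fragment of a cross-government data model.
SHARED_ENTITIES = {"property": "uprn", "business": "company_number"}

DATASETS = {
    "energy_performance": {"owner": "central department", "entities": ["property"]},
    "council_tax_bands":  {"owner": "local authority",    "entities": ["property"]},
    "business_rates":     {"owner": "local authority",    "entities": ["property", "business"]},
}

def possible_joins(datasets: dict) -> list[tuple[str, str, str]]:
    """List dataset pairs that share an entity, with the key they join on."""
    joins = []
    names = list(datasets)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            for entity in set(datasets[a]["entities"]) & set(datasets[b]["entities"]):
                joins.append((a, b, SHARED_ENTITIES[entity]))
    return joins

for a, b, key in possible_joins(DATASETS):
    print(f"{a} <-> {b} join on {key}")
```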

Breaking down debt

This is closer to the remit of the Central Digital and Data Office in the Cabinet Office, building on the digital and data strategy for government, and aiming to break down what Pritchard describes as a “legacy data debt”. This would require the development of “a convergence path” to provide a data architecture for government as a whole.

She also sees potential in a catalogue of government data and the possible emergence of data brokers.

“There is definitely a role for a brokerage, whether the broker is inside the government or a third party. Part of the reason I think we need a better data model is that brokers could populate the bits we are missing, the pockets where something is available and could really make a difference. Then we could negotiate a fair price.

“There is no need for a broker to handle the data rather than playing a part in managing the missing gaps; but we would want to make sure we are dealing with appropriate processes.”

While there is plenty of horizon scanning at ONS, it is also devoting attention to improving its core activities of collecting and publishing statistics on the economy, population and society in the UK.

Balanced tech basket

Pritchard says it now has a sophisticated technology stack, with a strong emphasis on cloud systems and using a “balanced basket” of AWS, Microsoft Azure and Google Cloud Platform, and plans to use products such as Google Omni and increase the deployment of APIs to connect data assets.

It is also increasingly using the cross-government Integrated Data Service (IDS) – for which it is the lead partner – to help government make better use of its data for analytical purposes.

“We’re seeking to do three things differently,” she says. “One is to adopt a distributed data approach, not a data lake.

“Second is to focus on linking data, so it will all be indexed and connected regardless of where it came from. Thirdly, it will reside in a trusted research environment where the legal gateway of the Digital Economy Act applies and we can handle transparency, propriety and ethics and the way we engage the public.

“This is approaching full public beta in March. One of the early data assets to be available will be the 2021 census, totally aggregated and de-identified for selected users.”

There is a further ambition relevant to the census: that in future its overview of population data could be put together from administrative statistics rather than through another of the 10-yearly exercises. ONS is planning to submit a formal recommendation on this late in 2023.

IDS interactions

Pritchard also expresses a desire that the IDS will support as much public interaction as possible in the future, enabling people to access and re-use datasets. It has already been used in the development of a prototype climate change portal, which was opened up to feedback last month.

“I think in due course the IDS will be more granular over what data the public can access, or accredited researchers like PhD students, and government users having the widest access, all through role based access controls,” she says.
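A rough sketch of the tiered, role based access Pritchard describes follows; the role names and data tiers below are assumptions for illustration, not the IDS design. The public would see only fully aggregated outputs, accredited researchers de-identified record-level data, and government analysts the widest set, including linked assets.

```python
from enum import Enum

class Role(Enum):
    PUBLIC = "public"
    ACCREDITED_RESEARCHER = "accredited_researcher"
    GOVERNMENT_ANALYST = "government_analyst"

# Hypothetical mapping of roles to the data tiers each may see.
ACCESS_LEVELS = {
    Role.PUBLIC: {"aggregated"},
    Role.ACCREDITED_RESEARCHER: {"aggregated", "deidentified_microdata"},
    Role.GOVERNMENT_ANALYST: {"aggregated", "deidentified_microdata", "linked_assets"},
}

def can_access(role: Role, data_tier: str) -> bool:
    """Role based check: is this tier of data visible to the given role?"""
    return data_tier in ACCESS_LEVELS[role]

print(can_access(Role.PUBLIC, "deidentified_microdata"))                 # False
print(can_access(Role.ACCREDITED_RESEARCHER, "deidentified_microdata"))  # True
```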

Other work includes the continuing development of the ONS Data Science Campus and exploring the possibilities of synthetic data – which is artificially manufactured rather than generated by real world events – for filling gaps in data models and supporting privacy.
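As a very small illustration of the synthetic data idea – a sketch only; real synthetic data work at the Data Science Campus would be far more sophisticated – one simple approach is to sample new records from the marginal distributions of an existing dataset, so the synthetic rows mimic overall patterns without reproducing any real individual:

```python
import random

# Toy source data: each row describes a (fictional) person.
real_rows = [
    {"age_band": "16-29", "region": "North"},
    {"age_band": "30-49", "region": "North"},
    {"age_band": "30-49", "region": "South"},
    {"age_band": "50+",   "region": "South"},
]

def synthesise(rows: list[dict], n: int, seed: int = 0) -> list[dict]:
    """Draw each column independently from its observed values, so the
    output matches the marginal distributions but copies no real record."""
    rng = random.Random(seed)
    columns = rows[0].keys()
    pools = {c: [r[c] for r in rows] for c in columns}
    return [{c: rng.choice(pools[c]) for c in columns} for _ in range(n)]

print(synthesise(real_rows, n=3))
```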

This is topped off by evangelism: encouraging the public and private sectors to become more sophisticated in developing and managing use cases for data.

“We recognise the role of the Cabinet Office in data strategy ownership and policy, and we like to think of ourselves as being at the edge of data implementation and usage,” Pritchard concludes. “It causes us to push the boundaries of what is possible.”
