The Government’s Data Standards Authority (DSA) has published a collection of metadata standards for the sharing and publication of data.
It emphasises the use of open standards, including the Dublin Core schema for sharing data across government and the schema.org Dataset schema for publication, along with CSV on the Web (CSVW) for CSV files.
The publication is the first to emerge from the DSA, which was set up within the Government Digital Service (GDS) in April with the aim of improving the public sector’s management of data.
Alison Pritchard (pictured), director general of GDS, said: “I’m delighted by today’s launch of the metadata standards. They’re the first step in assuring how data is shared across government.
“Standards are critical in allowing us to make sure our information is better managed. They will improve the quality of government data and help us deliver the best possible services to citizens.”
The guidance on open standards is split into three sections. The first says Dublin Core is the most appropriate metadata schema for sharing tabular data around government, as it meets the need for consistency and context. It says this makes it easier to catalogue, validate and re-use data, and that Dublin Core is the foundation for more complicated standards such as Data Catalog Vocabulary.
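As a rough illustration of what a Dublin Core description of a shared dataset might look like, the Python sketch below builds a record keyed by Dublin Core term URIs. The helper name, the dataset and every value are hypothetical and are not taken from the DSA guidance; only the term names (title, description, creator, issued, license, identifier) come from the Dublin Core terms vocabulary.

```python
import json

# Namespace for Dublin Core metadata terms.
DCTERMS = "http://purl.org/dc/terms/"


def dublin_core_record(title, description, creator, issued, licence_url, identifier):
    """Return a dict keyed by full Dublin Core term URIs (hypothetical helper)."""
    fields = {
        "title": title,
        "description": description,
        "creator": creator,
        "issued": issued,          # ISO 8601 date the dataset was issued
        "license": licence_url,
        "identifier": identifier,
    }
    return {DCTERMS + name: value for name, value in fields.items()}


# All values below are invented for illustration.
record = dublin_core_record(
    title="Library visits by branch",
    description="Monthly visit counts for each library branch.",
    creator="Example Department",
    issued="2020-08-01",
    licence_url="https://example.org/licence",
    identifier="https://example.org/id/dataset/library-visits",
)

print(json.dumps(record, indent=2))
```

Keeping the full term URIs as keys is one possible design choice; it means a receiving department can tell which vocabulary each field belongs to without a separate lookup.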
For describing tabular data for publication, the guide points to the schema.org Dataset schema. This is already used for publication on GOV.UK and data.gov.uk, and enables users to add metadata for search engines to find the data.
The DSA says this should be used in combination with its guidance on persistent resolvable identifiers so users can find the most recent version of the data.
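A minimal sketch of schema.org Dataset markup for a published dataset, written in Python, might look like the following. The function names, the dataset details and the example.org identifier are assumptions made for illustration; the property names (name, description, identifier, url, license, dateModified) are standard schema.org properties, and the identifier stands in for the kind of persistent resolvable identifier the guidance recommends.

```python
import json


def dataset_jsonld(name, description, identifier, url, licence, date_modified):
    """Build a schema.org Dataset description as a JSON-LD dict (hypothetical helper)."""
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "identifier": identifier,      # assumed to be a persistent, resolvable URI
        "url": url,
        "license": licence,
        "dateModified": date_modified,
    }


def as_script_tag(doc):
    """Wrap the JSON-LD in the script element that search engines read from a web page."""
    return '<script type="application/ld+json">\n' + json.dumps(doc, indent=2) + "\n</script>"


# All values below are invented for illustration.
dataset = dataset_jsonld(
    name="Library visits by branch",
    description="Monthly visit counts for each library branch.",
    identifier="https://example.org/id/dataset/library-visits",
    url="https://example.org/datasets/library-visits",
    licence="https://example.org/licence",
    date_modified="2020-08-01",
)

tag = as_script_tag(dataset)
print(tag)
```

The printed script element is what would be embedded in the dataset's landing page so that search engines can pick up the metadata.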
Thirdly comes the advice to use the CSVW standard in adding metadata to describe the contents and structure of CSV files. The DSA says this makes it easier to process the files into an annotated data model, consolidate different CSVs into one file and load data into a datastore for queries and analysis.
It can also be used to convert a file’s existing schema to formats such as RDF and JSON, making it easier to integrate them into a datastore or transfer data across systems.
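To give a sense of how CSVW metadata can describe a CSV file, the Python sketch below pairs a small metadata document with a hand-written reader that uses the declared column datatypes to turn text into typed values. It is not a conforming CSVW processor: the file name, column names, sample data and the read_annotated helper are all hypothetical, and only a few CSVW datatypes (string, integer, date) are handled.

```python
import csv
import io
import json
from datetime import date

# A minimal CSVW-style metadata document; file and column names are invented.
metadata = {
    "@context": "http://www.w3.org/ns/csvw",
    "url": "library-visits.csv",
    "tableSchema": {
        "columns": [
            {"name": "library", "titles": "Library", "datatype": "string"},
            {"name": "visit_date", "titles": "Visit date", "datatype": "date"},
            {"name": "visits", "titles": "Visits", "datatype": "integer"},
        ]
    },
}

# Map the few CSVW datatypes used here onto Python conversions.
CASTS = {"string": str, "integer": int, "date": date.fromisoformat}


def read_annotated(csv_text, meta):
    """Parse CSV text into typed rows using the column definitions in the metadata."""
    columns = meta["tableSchema"]["columns"]
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    expected = [column["titles"] for column in columns]
    if header != expected:
        raise ValueError(f"header {header} does not match metadata titles {expected}")
    rows = []
    for raw in reader:
        rows.append(
            {column["name"]: CASTS[column["datatype"]](value)
             for column, value in zip(columns, raw)}
        )
    return rows


# Sample data invented for illustration.
sample = "Library,Visit date,Visits\nCentral,2020-08-01,412\nNorth,2020-08-01,97\n"
rows = read_annotated(sample, metadata)

print(json.dumps(metadata, indent=2))
print(rows)
```

Because the metadata carries the column names and types, the same reader could be pointed at several CSV files that share a schema before loading the typed rows into a datastore.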
The more general guidance on recording information about datasets to be shared with others takes in where to record and store the metadata, making it machine readable and accessible, recording the provenance of the data, helping others find, identify and validate it, and making sure it is used appropriately.
A further section on publishing tabular data includes advice on working with a CSV file and recording the metadata.
The DSA is advised by a steering board made up of representatives from several Whitehall departments, including the Department for Digital, Culture, Media and Sport, the Home Office, the Office for National Statistics (ONS) and the Department for Work and Pensions. It meets monthly to discuss strategic priorities and the roadmap for the DSA.
Frankie Kay, ONS director general for data capability, commented: “The DSA’s work allows the government to further capitalise on the benefits of data to improve our services. Through participation on the steering board, ONS is able to ensure alignment with wider data initiatives and implement data standards in line with our priorities.”
Image from GOV.UK, Open Government Licence v3.0