GDS creates govcookiecutter for data science

30/07/21

Mark Say Managing Editor

The data science team in the Government Digital Service (GDS) has created a tool to generate a consistent project structure for data science work.

[Image: govcookiecutter logo]

Named govcookiecutter, it is freely available to public sector data scientists, and is aimed at making it quicker to bring colleagues into a project with a consistent structure.

Eric Young, a data scientist at GDS, said in a blogpost that the tool assumes Git version control (a system that records changes to a file or set of files over time so that specific versions can be recalled later) is being used with either GitHub or GitLab. Users also need access to the Python programming language and a Unix-based machine, although most features will work on Windows.

The tool provides a series of prompts that help to generate the structure with a range of AQA (analytical quality assurance) features.

Hooks

Other features include a hook that checks whether files larger than 5MB are being committed, another that cleans up Jupyter notebook outputs, and another that identifies secrets such as credentials and API tokens and prevents them from being version controlled.
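
To illustrate the kind of check the large-file hook performs, the sketch below flags files above a size limit. It is not govcookiecutter's own code: the function name `find_large_files`, the constant `MAX_BYTES`, and the reading of "5MB" as 5 × 1024 × 1024 bytes are all assumptions made for this example.

```python
import os

# Assumption: "5MB" is read here as 5 * 1024 * 1024 bytes.
MAX_BYTES = 5 * 1024 * 1024


def find_large_files(paths, max_bytes=MAX_BYTES):
    """Return the paths that are regular files larger than max_bytes.

    Hypothetical helper for illustration only; paths that do not
    point to an existing file are ignored.
    """
    return [
        path
        for path in paths
        if os.path.isfile(path) and os.path.getsize(path) > max_bytes
    ]
```

A hook built along these lines would typically be handed the list of files staged for commit and would stop the commit if the returned list is not empty.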

Young said the GDS team is open to contributions on the project through the GitHub repository and would like to incorporate frameworks from other government departments.

Image from GOV.UK, Open Government Licence v3.0
