Guidance

Ministry of Justice: Data First

Data First is an ambitious data-linking programme led by the Ministry of Justice and funded by ADR UK.

Data First aims to unlock the potential of the wealth of data already created by the Ministry of Justice (MOJ), by linking administrative datasets from across the justice system and enabling accredited researchers, from within government and academia, to access the data in an ethical and responsible way. The project will also enhance the linking of justice data with data held by other government departments.

By working in partnership with academic experts to facilitate and promote research in the justice space, Data First will create a sustainable body of knowledge on justice system users, their interactions across the criminal, family and civil courts and their needs, pathways and outcomes across a range of public services.

This will provide greater insight to inform the development of MOJ policies and drive real progress in improving social and justice outcomes.

The programme is led by MOJ and funded by ADR UK (Administrative Data Research UK), an investment by the Economic and Social Research Council (ESRC).

Through Data First, the MOJ has developed Splink, a free and open-source software library that enables data linkage at scale. Splink has been used to link some of the largest datasets held by MOJ as part of Data First.

Splink is now in its third version. It is a freely available, open-source Python package that is:

  • faster and more accurate than other free tools
  • able to link huge datasets, of tens of millions of records or more
  • developed with advice from academic experts in data linkage
  • able to produce a wide range of interactive data visualisations that help to build effective models, explain linkage predictions, diagnose problems and quality assure models
  • compatible with multiple databases and big data processing engines, meaning it can run on a wider range of computer systems

You can find out more on the Splink website, where you can download and start using Splink. You can also ask us a question or raise an issue on the public GitHub repository. We’d be very happy to hear from researchers interested in using Splink for their work.

General project information

Datasets

Analytical outputs

Application form

MOJ: Data First, application form for secure access to data

Contact

Contact the Data First team at datafirst@justice.gov.uk if you have any queries.

Published 30 June 2020
Last updated 14 October 2022
  1. Splink information added.

  2. Data First Family Court data catalogue updated.

  3. Data First prisoner custodial journey data catalogue updated.

  4. Analytical outputs section added.

  5. User guide updated and Data First probation data catalogue, Data First criminal courts, prisons and probation linking data catalogue published.

  6. User guide updated and Data First Family Court data catalogue published.

  7. User guide, privacy statement, Data First magistrates' court defendant data catalogue, Data First Crown Court defendant data catalogue and Data First criminal courts and prisons linking data catalogue updated.

  8. User guide updated and Data First prisoner custodial journey data catalogue published.

  9. User guide updated and Data First linked magistrates’ and Crown Court data catalogue published.

  10. Documents updated and Data First Crown Court defendant data catalogue published.

  11. First published.