Press release

Ground-breaking use of AI saves taxpayers’ money and delivers greater government efficiency

The government’s AI tool Consult analysed 50,000+ responses to the Independent Water Commission review in 2 hours, matching human accuracy and potentially saving 75,000 days of manual work each year.

  • AI tool built by the UK government sped up analysis of over 50,000 responses to a government-commissioned review of the water sector. 
  • Named ‘Consult’, the tool was found to be at least as accurate and reliable as humans.
  • The tool could ultimately help save the 75,000 days of manual analysis that currently slow down policy action across government every year.

Specialist AI tech built by the UK government helped to speed up the government’s decision to abolish Ofwat.

Automating the simple task of sorting over 50,000 responses into key themes made the Independent Water Commission (IWC) analysis more efficient and effective. The AI tool categorised responses into themes in around 2 hours at a cost of £240, and experts needed only 22 hours to check the results.

It meant policy experts could focus on using themes and categorised responses to inform recommendations for their independent report, rather than sorting tens of thousands of individual responses.

Alongside the AI-assisted thematic analysis, the team also completed detailed manual reviews of responses from stakeholders to ensure their perspectives were thoroughly considered.

The work of ‘Consult’ was compared to 2 groups of experts. It agreed with one or both of the groups almost 83% of the time, while the 2 well-practised human groups only agreed with each other 55% of the time.

Earlier in the year the tool successfully supported the analysis of the Scottish government’s consultation on non-surgical cosmetics. It has been confirmed that it was also used to sort responses to the Digital Inclusion Action Plan. With 800 people responding to the ‘call for evidence’, the technology was accurate and sped up the government’s ability to find initial results.

The technology, part of ‘Humphrey’, will analyse other consultation responses in a bid to save officials from 75,000 days of manual analysis every year, which costs £20 million in staffing. This will help to create a more agile, effective state refocused on delivering the Plan for Change.

Digital Government Minister Ian Murray said:

This shows the huge potential for technology and AI to deliver better and more efficient public services for the public and provide better value for the taxpayer.

By taking on the basic admin, Consult is giving staff time to focus on what matters – taking action to fix public services. In the process, it could save the taxpayer hundreds of thousands of pounds.

Another tool in the ‘Humphrey’ suite, called ‘Redbox’, helped 5,330 officials at its peak work more efficiently – with the technology helping them to summarise long documents, draft briefing notes and more.

Since Redbox was introduced, major tech companies have started to provide tools that give officials a secure way to use large language models integrated into the IT systems they already use, such as Microsoft Copilot. Often, these come as part of existing software deals between the government and technology companies.

For example, a recent trial of Microsoft Copilot found that the technology could save officials 2 weeks every year. As a result, engineers in the team are developing new tools, such as those identified by the Prime Minister as ‘AI Exemplars’, which aim to speed up planning decisions to help build homes, help probation officers have more impactful engagements with offenders, and more.

As a result, development on Redbox will not continue, though it has now been open-sourced. The engineers who built the tool have gone on to use their knowledge to build other technology in the ‘Humphrey’ suite, and have also shared information that was used to build GOV.UK Chat, the generative AI chatbot that will soon be trialled in the GOV.UK App.

Notes to editors

The evaluation of Consult on the Independent Water Commission call for views shows that it achieved F1 scores (a common measure of alignment for AI tools) of 0.79 and 0.82. There are 2 figures because Consult was compared with 2 groups of reviewers, and the score against each group is reported separately. Both are higher than the F1 score between human reviewers (0.74), and higher than the 0.76 achieved when the technology was used on the Scottish government consultation, which received fewer responses (2,000).
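For readers unfamiliar with the measure, an F1 score combines precision (how many of the themes one rater assigned were also assigned by the other) and recall (how many of the other rater's themes were matched). The sketch below is illustrative only. It assumes each response can carry several themes and that scores are micro-averaged over all response and theme pairs, which the evaluation summarised here does not specify. The function name, response IDs and theme labels are hypothetical.

```python
from typing import Dict, Set


def micro_f1(predicted: Dict[str, Set[str]], reference: Dict[str, Set[str]]) -> float:
    """Micro-averaged F1 over (response, theme) pairs.

    Assumption: this averaging method is chosen for illustration and is
    not confirmed as the one used in the Consult evaluation.
    """
    tp = fp = fn = 0
    for response_id in predicted.keys() | reference.keys():
        pred = predicted.get(response_id, set())
        ref = reference.get(response_id, set())
        tp += len(pred & ref)   # themes both raters assigned
        fp += len(pred - ref)   # themes only the first rater assigned
        fn += len(ref - pred)   # themes only the second rater assigned
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical theme assignments for three responses.
tool = {"r1": {"bills", "pollution"}, "r2": {"regulation"}, "r3": {"bills"}}
reviewers = {"r1": {"bills"}, "r2": {"regulation", "investment"}, "r3": {"bills"}}

print(round(micro_f1(tool, reviewers), 2))
```

A score of 1.0 would mean the two raters assigned identical themes to every response, and 0.0 would mean they never agreed on any theme.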

Visit:

DSIT media enquiries

Email press@dsit.gov.uk

Monday to Friday, 8:30am to 6pm: 020 7215 3000

Updates to this page

Published 16 October 2025