DWP: Conversational Platform

A telephone-based virtual assistant that asks callers the reason for their call, uses speech recognition to understand the response, and then either signposts them to relevant self-serve options such as GOV.UK or routes them through to DWP advisors.

Tier 1 Information

1 - Name

Conversational Platform

2 - Description

Vision: To modernise the telephony service offered across DWP in a user-centric way, enabling customer self-service within telephony, reducing effort, and increasing the accuracy with which we understand what our customers want to do. We take advantage of the enhanced technology to adapt our customer offer with enhanced customer insight, linking telephony transformation to DWP’s wider service modernisation while better protecting the public purse.

Conversational Platform provides a human-like human-to-machine communication experience in order to help improve the customer experience across multiple channels.

Conversational Platform allows our citizens to speak naturally and provides better insight into why customers are calling, which enables the following, depending on the nature of the enquiry:

Reason For Call – Understands why the citizen is calling, enabling call steering and basic deflection.

Identification & Verification (ID&V) – Orchestrates the voice to machine identity and verification process with the trust hub.

Personalised Call Deflection – a technique of re-routing a customer’s call to an alternative customer service channel. Call deflection is the most effective way to reduce costs by moving customers to lower-cost ‘digital’ channels.

Self-Service – Answers the caller’s query in the Interactive Voice Response (IVR) without the need for agent intervention (for example, advising the citizen of their next payment amount and payment date, based on capturing the customer’s initial utterance at the beginning of the call).

Customer Differentiation – By knowing what the citizen is calling about, we can apply differentiated treatments to improve routing: the ‘right call to the right agent at the right time’.

3 - Website URL

N/A

4 - Contact email

N/A

Tier 2 - Owner and Responsibility

1.1 - Organisation or department

Department for Work and Pensions, Digital Modernisation and Efficiency (part of DWP Digital)

1.2 - Team

Conversational Platform Team, Next Generation Contact Centre (NGCC)

1.3 - Senior responsible owner

Director for Digital Modernisation & Efficiency

1.4 - External supplier involvement

Yes

1.4.1 - External supplier

Omilia Natural Language Solutions (UK) Limited

1.4.2 - Companies House Number

12750619

1.4.3 - External supplier role

DWP procured the product from Omilia to host on the DWP estate and deliver the service internally. The supplier has no further involvement other than providing release updates, patches, fixes, tuning, etc.

1.4.4 - Procurement procedure type

The tender was run using a CCS Framework, “Automation Market Place” which is a Dynamic Purchasing System (DPS).

1.4.5 - Data access terms

No access has been granted to data

Tier 2 - Description and Rationale

2.1 - Detailed description

The Conversational Platform uses speech recognition software to ask citizens who call DWP the question ‘what are you calling about today?’. Once the answer has been captured, a natural language understanding model designed by the team acts according to what it has been taught to do. This can include signposting the caller to GOV.UK, helping them log in to their online account to resolve the query, or routing them to a live DWP advisor. In all cases the tool takes the action the DWP Digital team has taught it to take; it does not make any decisions itself.
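As an illustration of the signpost-or-route behaviour described above, the lookup below sketches how a recognised intent might map to an action. This is a hypothetical sketch: the intent names, messages, and advisor queue names are invented for illustration and are not taken from the production model.

```python
# Illustrative sketch of intent-based call handling. Intent names,
# signposting messages, and queue names are hypothetical examples,
# not the production model's taxonomy.

SIGNPOST = "signpost"   # play self-serve guidance, e.g. point to GOV.UK
ROUTE = "route"         # transfer the call to a DWP advisor

# Hypothetical mapping from a recognised intent to the next action.
INTENT_ACTIONS = {
    "change_address": (SIGNPOST, "You can update your address on GOV.UK."),
    "claim_status": (SIGNPOST, "You can check progress in your online account."),
    "payment_query": (ROUTE, "payments_team"),
}

def handle_intent(intent: str) -> tuple:
    """Return the action for a recognised intent; unknown intents
    fall back to routing the caller to a human advisor."""
    return INTENT_ACTIONS.get(intent, (ROUTE, "general_enquiries"))
```

The fallback reflects the record's design principle that the tool never decides on the caller's behalf: anything it cannot confidently handle goes to a person.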

2.2 - Scope

The tool was designed to improve the experience and efficiency of the telephony contact channel across DWP. The primary function is to quickly and accurately ascertain what the person wants to achieve on that phone call and help them do it themselves, or route them to the right advisor to talk to about it.

2.3 - Benefit

The tool improves the experience and efficiency of DWP’s telephony contact channel: callers can resolve queries themselves where possible, reducing the need for agent intervention, and calls that do need an advisor are more often routed to the right one first time.

2.4 - Previous process

DWP has historically used DTMF IVRs, but these have limitations compared with the enhanced capability of virtual assistants. The decision-making stages prior to the deployment of the new tool were:

  1. Market research
  2. Supplier demonstrations
  3. Request for Information (RFI)
  4. Request for Proposal (RFP)
  5. Scoring
  6. Moderation
  7. Selection

This was alongside key internal governance gateways, which were:

  1. Digital Design Authority (DDA)
  2. Digital Planning Forum (DPF)
  3. Change Approval Board (CAB)

2.5 - Alternatives considered

The two alternatives in a telephony context are:

  1. DTMF (dual tone multi frequency) IVRs (aka touch tone IVRs) or
  2. Speech based virtual assistants.

DWP has historically used DTMF IVRs but these have limitations compared to the enhanced capability of virtual assistants.

The other option, if virtual assistants were not used, would be to continue with, or add more, DTMF IVRs on DWP phone lines.

Tier 2 - Decision-Making Process

3.1 - Process integration

The tool does not make any decisions. Either the customer decides to end the call after we have provided signposting information to alternative channels, or the call is routed through to a human agent; the caller then discusses their query with the agent, who follows their own processes regarding decision making where required.

The tool is integrated with DWP’s ‘Dynamic Trust Hub’ (DTH) and communicates with it via an Application Programming Interface (API). This is in the context of identifying and verifying a customer who is being routed through to an agent. Conversational Platform asks citizens a question from the ID&V list and relays the answer to DTH, which replies with a ‘yes or no’ to indicate whether the answer is correct or incorrect. The Conversational Platform does not make any decisions in this process either.
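The yes/no ID&V exchange described above can be sketched as follows. All function and field names here are hypothetical stand-ins; the actual DTH API contract is not described in this record.

```python
# Illustrative sketch of the ID&V orchestration described above: the
# platform relays the caller's answer to the Dynamic Trust Hub (DTH)
# and receives a yes/no verdict. Every name here is a hypothetical
# stand-in; the real DTH API is not documented in this record.

def verify_answer(check_with_dth, question_id: str, answer: str) -> bool:
    """Relay one ID&V answer to DTH and return its yes/no verdict.

    `check_with_dth` stands in for the API call to the trust hub.
    The platform itself makes no decision; it only relays the result.
    """
    return check_with_dth(question_id, answer)

# A stub DTH for demonstration purposes only: accepts one known answer.
def stub_dth(question_id: str, answer: str) -> bool:
    known = {"date_of_birth": "1 January 1980"}
    return known.get(question_id) == answer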

3.2 - Provided information

Conversational Platform allows our citizens to speak naturally and provides better insight into why customers are calling, which enables the following, depending on the nature of the enquiry:

Reason For Call – Understands why the citizen is calling, enabling call steering and basic deflection.

Identification & Verification (ID&V) – Orchestrates the voice to machine identity and verification process with the trust hub.

Personalised Call Deflection – a technique of re-routing a customer’s call to an alternative customer service channel. Call deflection is the most effective way to reduce costs by moving customers to lower-cost ‘digital’ channels.

Self-Service – Answers the caller’s query in the Interactive Voice Response (IVR) without the need for agent intervention (for example, advising the citizen of their next payment amount and payment date, based on capturing the customer’s initial utterance at the beginning of the call).

Customer Differentiation – By knowing what the citizen is calling about, we can apply differentiated treatments to improve routing: the ‘right call to the right agent at the right time’.

This tool does not make any decisions, and the caller has the option to be routed to a human agent if they wish.

3.3 - Frequency and scale of usage

At present approximately 1 million calls per month pass through the Conversational Platform.

3.4 - Human decisions and review

This tool does not make any decisions, and the caller has the option to be routed to a human agent if they wish.

3.5 - Required training

Training has included:

Dashboard tools for building effective and usable applications.

How to define applications in terms of targets, system actions, fields and preconditions.

Familiarisation with key functionalities such as planning, event tracking, ambiguity resolution, reaction definition and prompt generation among others.

Hands-on experience in creating their own conversational application.

Handling simple and complex customer requests as well as complete self-service flows, thereby familiarising themselves with the notions of dialogue targets, fields, actions, and intents, among others.

3.6 - Appeals and review

This tool does not make any decisions, and the caller has the option to be routed to a human agent if they wish or if they have any additional needs.

Tier 2 - Tool Specification

4.1.1 - System architecture

The Conversational Platform consists of several discrete components and microservices that are developed individually but integrate to make the complete solution. These are deployed in Azure Kubernetes Service (AKS) clusters using DWP’s standard deployment and path-to-live process.

Description of dataflow –

• Calls arrive at Omilia by reaching the Omilia Session Initiation Protocol (SIP) Connectivity Service. This service provides a SIP interface (based on Kamailio) that allows calls and audio to be forwarded to Open Platform Communications (OPC). Kamailio receives a SIP INVITE and forwards it to OmIVR.

• OmIVR then receives Real-time Transport Protocol (RTP) traffic on the assigned port; the range 10000-40000 is used to support the different connections. The audio is forwarded to OmIVR and a connection is opened to initiate audio recognition for the duration of the call. The data in transit here is the customer’s voice audio.

• OmIVR triggers agi-connector, which initiates an application as per the dialplan or number configuration. The application is started in DiaManT, and at each step DiaManT provides the prompt and information that should be played back to the customer. DiaManT is exposed to agi-connector through an HAProxy installation for load balancing.

• When a user is prompted to speak, the audio is directed to deep Automatic Speech Recognition (ASR) on the respective RTP channel, and a transcription of that audio is stored in Redis for agi-connector to fetch and send to DiaManT.

• DiaManT then takes this utterance and sends it over HTTP/HTTPS to be annotated by the Natural Language Understanding (NLU) service, which returns a response with the annotations and allows the conversation to move forward.

• After understanding the customer’s intent and interaction, DiaManT selects which prompt to play and informs agi-connector and OmIVR sequentially, which play an audio file stored in the shared file system.

• Each step then iterates the process described above until the call is either closed or transferred to the contact centre. To perform a transfer, DiaManT informs agi-connector, which sends a SIP REFER message back to Kamailio and then to the Session Border Controller (SBC). If the call closes, a SIP BYE message is sent.

• At each step, Call Detail Records are stored in the DiaManT database and audio files are placed in the shared file system. These can only be accessed through DRTViewer by users with the specified access.
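At a high level, the per-turn loop in the dataflow above (transcribe, annotate, select the next step, then play a prompt or transfer) can be sketched as below. The component interfaces are hypothetical stand-ins for deep ASR, the NLU service, and DiaManT; they are not the real APIs.

```python
# Minimal sketch of the per-turn loop described above: each caller
# turn is transcribed, annotated by the NLU service, and handed to
# the dialogue manager, which either plays another prompt or ends
# the call (SIP BYE) / transfers it (SIP REFER) in the real flow.
# The asr/nlu/dialogue callables are hypothetical stand-ins.

def run_call(turns, asr, nlu, dialogue) -> dict:
    """Iterate over caller turns until the dialogue manager decides
    to transfer the call or hang up; otherwise keep prompting."""
    for audio in turns:
        transcript = asr(audio)        # speech-to-text for this turn
        annotations = nlu(transcript)  # intent annotation
        step = dialogue(annotations)   # dialogue manager picks next step
        if step["action"] in ("transfer", "hangup"):
            return step
    return {"action": "hangup"}        # caller stopped speaking
```

The loop structure mirrors the bullets: every iteration produces a Call Detail Record and a prompt in the real system, and only a transfer or hang-up breaks the cycle.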

4.1.2 - Phase

Production

4.1.3 - Maintenance

The speech recognition software is tuned every month and the natural language understanding model is updated every quarter.

4.1.4 - Models

Speech recognition

Natural language understanding

Utterance transcription

Data attached to phone call (customer intent)

Encryption to mask any Personally Identifiable Information (PII) from transcription

Tier 2 - Model Specification

4.2.1 - Model name

Omilia software

4.2.2 - Model version

N/A

4.2.3 - Model task

To understand verbal customer responses to the virtual assistant’s question (‘what is it you’re calling about today?’) and, depending on that response, signpost to other channels or route the call to an agent.

4.2.4 - Model input

Verbal customer utterances captured on the phone lines, which are then classified and grouped to form ‘intents’ that in turn dictate what happens next (signposting or routing).

4.2.5 - Model output

Information on self serve options for that reason for call or calls being routed to a set of live DWP advisors.

4.2.6 - Model architecture

See 4.1.1

4.2.7 - Model performance

97% speech recognition success

9% increase in deflection

Right-first-time routing increased by 21%

4.2.8 - Datasets

80,000 customer utterances were captured and analysed to inform the build of intent classification, intent groupings and associated customer journeys (signpost or route out).

4.2.9 - Dataset purposes

See above.

Tier 2 - Data Specification

4.3.1 - Source data name

Customer utterance / IDV - DTH

4.3.2 - Data modality

Audio

4.3.3 - Data description

Customer utterances describing their reason for contact, such as ‘I want to change my address’ or ‘I’m calling to ask where my claim is up to’.

4.3.4 - Data quantities

80,000 customer utterances were captured and analysed to inform the build of intent classification, intent groupings and associated customer journeys (signpost or route out).

4.3.5 - Sensitive attributes

There are no sensitive attributes contained within the datasets.

4.3.6 - Data completeness and representativeness

These customer utterances were captured on the phone lines for all benefit lines in scope of the project over the period of 1 week. No specific targeting was in place, all calls had utterances captured and as such were fully representative of citizens calling DWP during that period.

4.3.7 - Source data URL

There is no openly accessible data.

4.3.8 - Data collection

Intent capture, as per 4.3.6.

4.3.9 - Data cleaning

Encryption ensures that no Personally Identifiable Information (PII) appears in any call transcript.
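As a generic illustration of keeping PII out of stored transcripts: the record describes the approach only at this level, so the pattern-based redaction below is an invented example, not DWP’s implementation, and the two patterns (National Insurance number, UK phone number) are illustrative only.

```python
import re

# Generic illustration of masking PII tokens in a transcript before
# storage. The patterns and placeholders are invented examples; they
# do not describe DWP's actual PII-masking implementation.

PII_PATTERNS = [
    # National Insurance number, e.g. "QQ 12 34 56 C"
    (re.compile(r"\b[A-Z]{2}\s?\d{2}\s?\d{2}\s?\d{2}\s?[A-Z]\b"), "[NINO]"),
    # UK phone number written as 11 digits starting with 0
    (re.compile(r"\b0\d{10}\b"), "[PHONE]"),
]

def mask_pii(transcript: str) -> str:
    """Replace recognisable PII tokens with neutral placeholders."""
    for pattern, placeholder in PII_PATTERNS:
        transcript = pattern.sub(placeholder, transcript)
    return transcript
```

Real deployments typically combine pattern rules like these with model-based entity detection; this sketch shows only the simplest rule-based layer.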

4.3.10 - Data sharing agreements

No data is shared.

4.3.11 - Data access and storage

Call Detail Records are stored in the DiaManT database and audio files are placed in the shared file system. These can only be accessed through DRTViewer by users with the specified access.

Tier 2 - Risks, Mitigations and Impact Assessments

5.1 - Impact assessment

Both a Data Protection Impact Assessment (DPIA) and an Equality Analysis (EA) have been completed.

5.2 - Risks and mitigations

Possible risks associated with the Conversational Platform are that the speech recognition does not understand what a caller says and, as a result, the caller is given incorrect signposting information or routed to the wrong set of DWP agents. These risks have been mitigated by building detailed reporting that shows what the customer said and what action was taken as a result. This enables regular speech recognition tuning to take place, as well as updates to the language model when necessary.

Updates to this page

Published 16 December 2025