Guidance

Universal Credit statistics: quality assurance of administrative data report

Published 10 November 2020

Applies to England, Scotland and Wales

The latest release of Universal Credit statistics can be found in the Universal Credit statistics collection.

1. Introduction

1.1 Background

This report contains information on the Universal Credit administrative data sources used by the Department for Work and Pensions, as well as quality assessments on each of them.

The UK Statistics Authority have published a regulatory standard including a Quality Assurance of Administrative Data (QAAD) toolkit. The standard was developed in response to concerns about the quality of administrative data and in recognition of the increasing role that such data is playing in the production of official statistics.

1.2 List of Administrative Datasets

Three administrative datasets are used in the production of Universal Credit official statistics. A brief summary of these datasets and their uses is described below.

Universal Credit Full Service (UCFS)

Universal Credit Full Service is the main data source for the administration of Universal Credit claims. It is used as the base data source for Universal Credit official statistics to determine who is on Universal Credit, and the details of the various elements of Universal Credit.

Customer Information System (CIS)

The Customer Information System contains a record for all individuals who have registered and been issued with a National Insurance number.

Central Payment System (CPS)

The Central Payment System is an integrated payment and accounting system for the Department. It is used for the claim end dates and payments.

2. Quality assurance of administrative data (QAAD) assessment

2.1 UK Statistics Authority QAAD toolkit

The assessment of the Universal Credit administrative data sources has been carried out in accordance with the QAAD toolkit.

The QAAD toolkit sets out four levels of quality assurance that may be required of a dataset:

  • A0 – no assurance

  • A1 – basic assurance

  • A2 – enhanced assurance

  • A3 – comprehensive assurance

The UK Statistics Authority states that the A0 level is not compliant with the Code of Practice for Statistics. The assessment of the assurance level is in turn based on a combination of assessments of data quality risk and public interest. The toolkit sets out the level of assurances required as follows:

Level A1 – basic assurance

The statistical producer has reviewed and published a summary of the administrative data quality assurance (QA) arrangements.

Level A2 – enhanced assurance

The statistical producer has evaluated the administrative data QA arrangements and published a fuller description of the assurance.

Level A3 – comprehensive assurance

The statistical producer has investigated the administrative data QA arrangements, identified the results of independent audit and published detailed documentation about the assurance and audit.

To determine which assurance level is appropriate for a statistics publication it is necessary to take a view of the level of risk of quality concerns and the public interest profile of the statistics.

Each administrative data source has been evaluated according to the toolkit's risk and profile matrix (Table 1), reflecting the level of risk to data quality and the public interest profile of the statistics.

Table 1: UK Statistics Authority quality assurance of administrative data (QAAD) risk and profile matrix

| Level of risk of quality concerns | Lower public interest profile | Medium public interest profile | Higher public interest profile |
|---|---|---|---|
| Low level of risk of quality concerns | Statistics of lower quality concern and lower public interest [A1] | Statistics of low quality concern and medium public interest [A1 or A2] | Statistics of low quality concern and higher public interest [A1 or A2] |
| Medium level of risk of quality concerns | Statistics of medium quality concern and lower public interest [A1 or A2] | Statistics of medium quality concern and medium public interest [A2] | Statistics of medium quality concern and higher public interest [A2 or A3] |
| High level of risk of quality concerns | Statistics of higher quality concern and lower public interest [A1 or A2 or A3] | Statistics of higher quality concern and medium public interest [A3] | Statistics of higher quality concern and higher public interest [A3] |

Source: Office for Statistics Regulation

2.2 Assessment and justification against the QAAD risk and profile matrix

The data risk of quality concern and public interest profile in Universal Credit statistics are rated by assessing: (a) the possibility of quality concerns arising in the administrative data that may affect the statistics’ quality; and (b) the nature of the public interest served by the statistics.

(a) The Universal Credit data is regarded as being a medium risk of data quality concern. While every effort is made to collect data to the highest quality, as with all administrative data it is dependent on the accuracy of information entered into the system. Checks are made throughout the process from collection of the data to producing the statistics but some data entry or processing errors may filter through.

(b) The Universal Credit official statistics are regarded as higher public interest due to regular coverage of Universal Credit policies and statistics in the media and their impact on the labour market.

Therefore, as defined by the risk and profile matrix (Table 1), the combination of a medium level of risk of data quality concerns and a higher public interest profile indicates that enhanced assurance [A2] is the minimum level required for Universal Credit statistics.
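The reading of Table 1 can be illustrated with a short lookup. This is a minimal sketch only: the dictionary transcribes the matrix cells, while the name `QAAD_MATRIX`, the function `minimum_assurance` and the lower-case labels are assumptions made for illustration and are not part of the QAAD toolkit.

```python
# Table 1 as a lookup: (risk of quality concerns, public interest profile)
# -> assurance levels the matrix permits. Labels are illustrative only.
QAAD_MATRIX = {
    ("low", "lower"): ("A1",),
    ("low", "medium"): ("A1", "A2"),
    ("low", "higher"): ("A1", "A2"),
    ("medium", "lower"): ("A1", "A2"),
    ("medium", "medium"): ("A2",),
    ("medium", "higher"): ("A2", "A3"),
    ("high", "lower"): ("A1", "A2", "A3"),
    ("high", "medium"): ("A3",),
    ("high", "higher"): ("A3",),
}


def minimum_assurance(risk: str, profile: str) -> str:
    """Return the lowest assurance level permitted for a matrix cell."""
    # "A1" < "A2" < "A3" as strings, so min() picks the lowest level.
    return min(QAAD_MATRIX[(risk, profile)])


# Universal Credit statistics: medium risk, higher public interest.
print(minimum_assurance("medium", "higher"))  # A2
```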

The QAAD toolkit outlines four specific areas for assurance, and the rest of this report will focus on these areas in turn. These are:

  • operational context and administrative data collection

  • communication with data supply partners

  • quality assurance principles, standards and checks applied by data suppliers

  • producer’s quality assurance investigations and documentation

Each of the four practice areas is evaluated separately, and the respective level of assurance is stated. This approach enables an in-depth investigation of the areas of particular risk or interest to users.

The overall level of assurance for Universal Credit statistics is outlined in the summary section.

3. Areas of quality assurance of administrative data (QAAD)

3.1 Operational context and administrative data collection (QAAD matrix score A2)

This relates to the need for statistical producers to gain an understanding of the environment and processes in which the administrative data are being compiled and the factors that might increase the risks to the quality of the administrative data.

Universal Credit Full Service (UCFS) Data Collection

The information needed to produce Universal Credit statistics is mainly obtained from the administrative systems used to operate Universal Credit. The accuracy and completeness of data at this stage is crucial to the quality of the statistics.

The data on the UCFS administrative system is built up from information entered by claimants, Universal Credit agents, work coaches and through automated procedures.

Universal Credit is largely a digital service, with around 98% of claimants applying for and managing their claim through online accounts. Alternative telephone-based services are available for the 2% who are unable to access digital services.

The following processes detail the information that is collected during a claimant’s journey through Universal Credit and the verification procedures in place.

Submitting a claim

Claimants first create a Universal Credit account. This is done online or, where they are unable to use digital services, by telephone. Claimants enter information about their circumstances and submit their claim.

After submitting a claim, claimants are required to verify their identity online using the DWP Confirm Your Identity service or GOV.UK Verify. If this is not possible, a work coach will call to arrange an appointment at a Jobcentre to verify their identity. Universal Credit agents validate claimant identities against the information on the Customer Information System (CIS). Where there is a disparity, CIS is updated once the evidence has been provided.

Work coaches establish the claimant work regime, and set up a claimant commitment to be agreed by the claimant.

Universal Credit agents investigate claimant information regarding their personal circumstances and the corresponding evidence is independently verified (for example, with landlords). The system is then updated to record that the information has been accepted by the agent.

Payment

At the end of the first assessment period (1 month from the claim submitted date), the Universal Credit entitlement is calculated either automatically or by agents. Entitlement will be based on the elements that have been verified with evidence and accepted.

The first payment is made 7 days after the end of the first assessment period (payment date) and recorded on the Central Payment System. Any outstanding entitlement that has not been paid will only be paid once evidence has been verified and accepted.

At 7 days after the end of the latest assessment period, the next payment is made and the monthly assessment period cycle continues.
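The timing described above can be sketched as simple date arithmetic. This is an illustration under stated assumptions, not the operational calculation: it treats "1 month" as the same day in the following calendar month, clamped to the last day of shorter months, and the function names are hypothetical.

```python
import calendar
from datetime import date, timedelta


def add_one_month(d: date) -> date:
    """Same day next month, clamped to month end (an assumed convention)."""
    year = d.year + 1 if d.month == 12 else d.year
    month = 1 if d.month == 12 else d.month + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def first_payment_schedule(claim_submitted: date) -> tuple[date, date]:
    """Return (end of first assessment period, first payment date)."""
    period_end = add_one_month(claim_submitted)
    # Payment follows 7 days after the end of the assessment period.
    return period_end, period_end + timedelta(days=7)


period_end, payment_date = first_payment_schedule(date(2020, 1, 15))
print(period_end, payment_date)
```

Later payments would follow the same pattern, applying `add_one_month` to each assessment period end in turn.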

End of claim

Claimants no longer receive payments if their claim is closed. When a claim is closed, the claimant’s account remains active for 6 months. If they reapply for Universal Credit within that time, the information from their previous claim is used, with only changes in circumstances having to be verified and accepted.

Other systems

Some information cannot be obtained from the UCFS administrative system. Other systems are therefore used to improve the quality of postcode, gender and date of birth data where it is not available on the UCFS system.

Strengths

  • Information is updated in a timely manner

  • Information is verified either independently or by documentary evidence provided by the claimant

  • Coverage is expected to be close to 100% given the nature of UCFS data collection

  • Data quality is improved by using data from other systems

Weaknesses

  • Constant developments in the UCFS system present risks to the consistency of data collection

  • As with any administrative system, there is potential for fraud and administrative error, as the process relies on information submitted by claimants and verified by operational teams

  • Data is subject to revision that may take several months to filter through

3.2 Communication with data supply partners (QAAD matrix score A2)

This relates to the need to maintain effective relationships with suppliers (through written agreements such as service level agreements or memoranda of understanding), which include change management processes and the consideration of statistical needs when changes are being made to relevant administrative systems.

UCFS data is owned by DWP and provided to analysts as a business requirement for management information, policy analysis and statistics.

Data Works, an internal team within DWP, are the principal data supply partners for Universal Credit official statistics. They are responsible for providing data to the analytical community within DWP to agreed specifications. They liaise with data owners on behalf of analysts in respect of data issues and proposed system changes.

UCFS system data is exported to Data Works, where it goes through an extract, transform and load (ETL) process to form a structured physical data model (PDM). The Data Delivery team then create encrypted UCFS SAS datasets from the PDM, which are made available to statistics producers.

Data Works and the Data Delivery team:

  • collect, store and transfer data lawfully

  • ensure the data is accurate and updated in a timely manner

  • store special category data (for example, ethnicity, health and religion information) more securely

  • comply with internal policy to mask personal identifiers to ensure that individuals cannot be identified

Communication with Data Works

The statistics team have established communication channels with Data Works. Data Works act as the channel of communication to the UCFS system team who oversee the running and design of the administrative system.

If there is an issue with the UCFS system data export to the PDM:

  • Data Works liaise with the UCFS system team to investigate

  • stakeholders are informed if there is the possibility of a delay to the agreed schedule of updating the PDM

Once Data Works have built the PDM, stakeholders are notified and sent a quality report. Analysts use the quality report to ‘sense-check’ observation counts and feed back potential issues.

Data Works receive ongoing feedback from analysts and statistics producers which they use to review their quality assurance processes. If there are data issues, Data Works inform stakeholders and resolve them by meeting with stakeholders to agree a proposed resolution.

Data Works sign-off arrangements for the PDM are to:

  • create a quality report to compare data to previous months

  • use feedback from analysts to identify and investigate potential issues

  • ensure any identified issues are resolved

Communication with the Data Delivery team

The Data Delivery team (DDT) continuously inform Data Works of issues with the PDM and provide relevant feedback where required.

Data processing to create encrypted UCFS SAS datasets starts when the DDT receive notification from Data Works that the PDM has been updated. Prior to starting, the DDT instruct all analysts not to access the existing SAS datasets until the data refresh is complete to minimise the risk of data issues. On completion, statistics producers are notified to confirm that the SAS datasets are available.

A report is made to the DWP security advice centre if DDT later identify an issue with the data.

The DDT sign-off arrangements for the UCFS SAS datasets are to:

  • ensure SAS datasets match the specifications on the quality control sheet

  • report issues to Data Works for further investigation

  • ensure any identified issues are resolved

Communication with operational teams

Communication with operational and other analytical teams ensures an understanding of UCFS system changes that impact on the official statistics. This is particularly important for Universal Credit because, as a relatively new benefit, the administrative systems are constantly developing.

Strengths

  • Stakeholders receive regular communication from data supply partners

  • Data supply partners receive feedback during the data transformation process from analysts (experts in the data) which therefore increases the likelihood of issues being identified and resolved

  • Awareness of UCFS system changes contributes to increasing the accuracy of the data

Weaknesses

  • The statistics team are not involved in UCFS system design and have no direct engagement with the system designers

  • There is a small production window to create the PDM and UCFS SAS datasets and issues are likely to cause delays

3.3 Quality assurance principles, standards and checks by data supplier (QAAD matrix score A2)

This relates to the validation checks and procedures undertaken by the data supplier, any process of audit of the operational system and any steps taken to determine the accuracy of the administrative data.

Data supply partners carry out quality assurance checks throughout the data transformation process.

Data Works QA checks

The data export from the UCFS system to the PDM is monitored by Data Works, with quality assurance checks in place to ensure flags are raised if the data flow stops or other issues are identified. Data Works have the option to extend the retention period for UCFS system data to allow more time to investigate and resolve any issues.

Data Works create a quality report to highlight potential data issues which is then provided to analysts for additional scrutiny. Data Works are developing the scope of the report as currently it only details total observations compared to previous months. The enhanced quality report will:

  • include counts of the observations removed and added due to revisions

  • check all variables for missing data

  • highlight changes to the list of Jobcentres
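The three planned additions to the quality report can be sketched as below. This is a hypothetical illustration: the record layout (a dictionary keyed by record identifier), the `jobcentre` field name and the function `quality_report` are assumptions, and the actual Data Works report is not described in this detail.

```python
def quality_report(previous, current, variables):
    """Compare two monthly extracts.

    previous, current: dict mapping record id -> dict of variable values.
    variables: names of the variables to check for missing data.
    """
    # Observations added and removed between extracts (for example, revisions).
    added = current.keys() - previous.keys()
    removed = previous.keys() - current.keys()

    # Count missing values (None or empty string) for each variable.
    missing = {
        var: sum(1 for rec in current.values() if rec.get(var) in (None, ""))
        for var in variables
    }

    # Changes to the list of Jobcentres seen in the data.
    prev_offices = {rec.get("jobcentre") for rec in previous.values()} - {None}
    curr_offices = {rec.get("jobcentre") for rec in current.values()} - {None}

    return {
        "total_previous": len(previous),
        "total_current": len(current),
        "added": len(added),
        "removed": len(removed),
        "missing_by_variable": missing,
        "jobcentres_added": sorted(curr_offices - prev_offices),
        "jobcentres_removed": sorted(prev_offices - curr_offices),
    }
```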

Data Delivery team QA checks

The Data Delivery team use the UCFS quality control sheet to check the specification of all variables in each of the UCFS SAS datasets. Any inconsistencies are investigated to identify whether the issue is with the SAS datasets or the PDM.
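A check of this kind might look like the following sketch. The representation of the quality control sheet as a mapping from variable name to a (type, length) pair is an assumption for illustration, as are the function and variable names; the real control sheet may record different attributes.

```python
def check_specification(control_sheet, dataset_spec):
    """Compare a dataset's variable specification with the control sheet.

    Both arguments: dict mapping variable name -> (type, length).
    Returns a list of human-readable inconsistencies (empty if none).
    """
    issues = []
    for name, expected in sorted(control_sheet.items()):
        actual = dataset_spec.get(name)
        if actual is None:
            issues.append(f"{name}: missing from dataset")
        elif actual != expected:
            issues.append(f"{name}: expected {expected}, found {actual}")
    # Variables present in the dataset but absent from the control sheet.
    for name in sorted(dataset_spec.keys() - control_sheet.keys()):
        issues.append(f"{name}: not on control sheet")
    return issues
```

An empty list means the dataset matches the sheet. Any entry would then be investigated to establish whether the issue lies with the SAS dataset or the PDM.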

Strengths

  • Consistent checks every month allow understanding of observed differences

  • Quality assurance checks are continuously improved, using stakeholder feedback to shape them

  • In-depth understanding of data and access to more detailed data ensure issues are identified and resolved efficiently

Weaknesses

  • It is not possible to identify missing historic data from the Data Works quality report

3.4 Producer’s quality assurance investigations and documentation (QAAD matrix score A2)

This relates to the quality assurance conducted by the statistical producer, including corroboration against other data sources.

The UCFS SAS datasets are refreshed on a monthly basis by the Data Delivery team. Once they are available, the statistics team carries out a series of checks prior to data processing.

Pre-processing checks

The statistics team generates observation counts for each dataset to ensure missing data is identified at the earliest opportunity and rectified before data processing starts. The back series counts for each UCFS SAS dataset are expected to match those recorded in previous months, or to differ only by relatively small amounts due to revisions. Significant changes may indicate missing data and are reported to Data Works for further investigation.
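The back series comparison can be sketched as follows. The 1% tolerance is a placeholder assumption, not a threshold stated in this report, and the function name and month keys are hypothetical.

```python
def flag_back_series_changes(previous_counts, current_counts, tolerance=0.01):
    """Flag months whose observation count has changed by more than tolerance.

    previous_counts, current_counts: dict mapping month label -> count.
    Returns dict of month -> (previous count, current count) for flagged months.
    """
    flagged = {}
    for month, old in sorted(previous_counts.items()):
        # A month missing from the refreshed data counts as zero observations.
        new = current_counts.get(month, 0)
        if old == 0:
            changed = new != 0
        else:
            changed = abs(new - old) / old > tolerance
        if changed:
            flagged[month] = (old, new)
    return flagged
```

Months that appear only in the refreshed data (the newest month) are not flagged, because there is no earlier count to compare against.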

The variable formats in each dataset are compared to previous months to identify changes and therefore reduce the risk of errors during data processing. If there is a change, it is resolved by modifying the code in relevant data processing projects.

Data processing checks

The statistics team check that each stage of data processing has completed as intended. Generally, the checks ensure that all expected variables are populated and that total observation counts are comparable to previous months.

An important stage of data processing is checking that all claimants have a Jobcentre office code. In the case of missing office codes, if the claimant was previously present in the dataset during the same claim (for example, in the preceding month), and at that time they were associated with an office, then this office is used.
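The rule for missing office codes amounts to carrying forward the most recent known office within the same claim. A minimal sketch, assuming each record is a dictionary with hypothetical `claim_id`, `month` and `office_code` fields, and that months sort chronologically (for example "2020-08"):

```python
def fill_missing_office_codes(records):
    """Fill missing office codes from earlier months of the same claim.

    records: list of dicts with keys claim_id, month and office_code.
    Returns new records; the input list is left unchanged.
    """
    last_known = {}  # claim_id -> most recent non-missing office code
    filled = []
    for rec in sorted(records, key=lambda r: (r["claim_id"], r["month"])):
        rec = dict(rec)  # copy so the caller's data is not modified
        if rec["office_code"]:
            last_known[rec["claim_id"]] = rec["office_code"]
        elif rec["claim_id"] in last_known:
            rec["office_code"] = last_known[rec["claim_id"]]
        filled.append(rec)
    return filled
```

Records with no earlier office for the same claim are left with a missing code in this sketch, so they can still be counted and investigated.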

UCFS SAS datasets contain claimant data that covers the UK. However, during data processing, all claims registered at a Northern Ireland office are removed to create a Great Britain (GB) only dataset. There are quality checks in place to ensure that claims with a Northern Ireland postcode or Jobcentre office code are excluded from the GB dataset.
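The removal step and the follow-up check can be sketched as below. The office codes in `NI_OFFICE_CODES` are invented placeholders, and the field names are assumptions. Northern Ireland postcodes are identified here by the "BT" postcode area.

```python
# Hypothetical placeholder codes; the real list of Northern Ireland
# office codes is not given in this report.
NI_OFFICE_CODES = {"NI001", "NI002"}


def has_ni_postcode(record):
    """True if the record's postcode is in the BT (Northern Ireland) area."""
    postcode = (record.get("postcode") or "").strip().upper()
    return postcode.startswith("BT")


def make_gb_dataset(records):
    """Remove claims registered at a Northern Ireland office."""
    return [r for r in records if r.get("office_code") not in NI_OFFICE_CODES]


def residual_ni_records(gb_records):
    """Quality check: records that still look like Northern Ireland claims."""
    return [
        r for r in gb_records
        if r.get("office_code") in NI_OFFICE_CODES or has_ni_postcode(r)
    ]
```

Any record returned by `residual_ni_records` would need to be investigated before the GB dataset is used.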

Output validation

Once the statistical outputs have been created, a series of robust quality assurance checks are carried out by the statistics team. This includes:

  • checking observation counts compared to the back series for several breakdowns

  • checking that revisions to provisional and historic data are reasonable

  • checking monthly percentage changes and proportions compared to recent trends

  • investigating unexpected trends and sharp changes

In addition, the statistical outputs are compared to Universal Credit management information (MI) data. Corroboration with MI data allows comparison of expected trends and observation counts.
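The check of monthly percentage changes against recent trends, listed above, could be automated along the following lines. This is a sketch under assumptions: the three-month window and the 5 percentage point gap are illustrative values rather than thresholds used by the statistics team, and all counts are assumed to be positive.

```python
from statistics import mean


def monthly_pct_changes(counts):
    """Percentage change between each pair of consecutive monthly counts."""
    return [100.0 * (b - a) / a for a, b in zip(counts, counts[1:])]


def latest_change_is_unusual(counts, window=3, max_gap=5.0):
    """Compare the latest monthly change with the mean of recent changes.

    Returns True if the latest percentage change differs from the mean of
    the preceding `window` changes by more than `max_gap` percentage points.
    """
    changes = monthly_pct_changes(counts)
    if len(changes) < window + 1:
        raise ValueError("not enough months to compare against recent trends")
    recent = changes[-(window + 1):-1]
    return abs(changes[-1] - mean(recent)) > max_gap
```

A flagged month is a prompt for investigation rather than evidence of an error, since real changes in caseload can also produce sharp movements.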

Strengths

  • A pre-emptive approach ensures that issues are addressed prior to data processing. Resources within the statistics team are therefore optimised to complete other time sensitive data processing tasks

  • Pre-processing checks are in place to identify missing historic data

  • Extensive quality assurance throughout the statistical production process

Weaknesses

  • Quality assurance is mostly manual so human error is a risk factor

  • The identification of a data issue during pre-processing checks may delay data processing (and possibly timely publication), especially if data supply partners need to re-create the UCFS data

4. Summary

The Department for Work and Pensions (DWP) considers the main strengths of the Universal Credit Full Service data to be that:

  • UCFS system information is updated in a timely manner

  • information is verified either independently or by documentary evidence provided by the claimant

  • the feedback that data supply partners receive during the data transformation process from analysts increases the likelihood of issues being identified and resolved

  • data supply partners continuously improve their quality assurance checks and use stakeholder feedback to shape them

  • the statistics team carry out extensive quality assurance throughout the statistical production process

The current limitations are that:

  • there are potential risks to the consistency of data collection due to constant developments in the UCFS system

  • there is the potential for fraud and administrative error as the system relies on the information submitted by claimants and verification by operational teams

  • the statistics team are not involved in the UCFS system design and have no direct engagement with the system designers

In constantly seeking to improve Universal Credit official statistics, steps will be taken to mitigate the limitations identified in this report, and progress will be communicated to users in the next Quality Assurance of Administrative Data assessment.

However, Universal Credit official statistics are assessed as being assured to level A2 (enhanced assurance) as outlined by the UK Statistics Authority QAAD toolkit.

If you are of the view that this report does not adequately provide this level of assurance, or you have any other feedback, please contact us via email at team.ucos@dwp.gov.uk with your concerns.