Official Statistics

Method for assigning ethnic group in the COVID-19 Health Inequalities Monitoring for England (CHIME) tool

Updated 24 May 2023

Applies to England

Introduction

There is great demand for data on coronavirus (COVID-19) cases, hospitalisations and deaths by ethnic group. However, ethnicity data is not collected at death registration and there are gaps in the ethnicity data available for confirmed COVID-19 cases. A process was therefore needed to assign ethnicity from NHS Digital Hospital Episode Statistics (HES).

This document details how CHIME assigns ethnicity to records of deaths, hospital admissions and confirmed COVID-19 cases, and the methods used.

As different ethnicities may be recorded in different treatment episodes, the method selected a single ethnic group from a patient’s HES records and linked it with the COVID-19 case or death record. This method was used to provide data by ethnic group in the report on disparities in the risks and outcomes of COVID-19.

During the pandemic, it became evident that this original method of assigning ethnicity had some limitations - in particular, that it overestimated the number of people in the ‘other’ ethnic group.

Alternative methods of assigning ethnicity from HES were therefore investigated and discussed with stakeholders in the Office for Health Improvement and Disparities (OHID), which was Public Health England (PHE) at the point of consultation, as well as external stakeholders from the Office for National Statistics (ONS), the Race Disparity Unit, NHS Digital, the King’s Fund and the Institute of Health Equity.

A different method was agreed and used for the data provided by ethnic group in the COVID-19 Health Inequalities Monitoring for England (CHIME) tool. This document details the method for assigning ethnicity as used by CHIME. It also notes the source of the population estimates used for calculating rates for hospital admissions, deaths and confirmed COVID-19 cases by ethnic group.

The method for assigning ethnicity from Hospital Episode Statistics

The original method used by OHID (formerly PHE) assigned the most recent usable ethnic code for an individual available in HES data.

The new method is based on the NHS Digital HES ethnicity index, with a few modifications, and applies to the deaths and hospital admissions indicators that use HES. With this method:

  • the most frequent ethnicity recorded across the 3 HES data sets, excluding any unknown values, is used. The data sets are Admitted Patient Care (APC) from 2003 to 2004 onwards, Accident and Emergency (A&E) from 2007 to 2008 onwards and outpatient (OP) from 2003 to 2004 onwards. OP data for the years from 2006 to 2007 through to 2009 to 2010 was not used because, due to a technical issue, no ethnic code entries were recorded in those years. APC data is restricted to 2003 to 2004 onwards because the quality and completeness of admitted patient care data was lower before then
  • if there are multiple ethnicities in the data sets with the same frequency, the most recent is chosen
  • if there are multiple ethnicities with the same frequency and latest date, precedence is given to the most recent value from the APC data set as it is considered more robust, followed by the A&E data set, followed by the OP data set. Checks completed by NHS Digital indicate that completeness in the A&E data set is better than in the OP data set
  • if there are multiple ethnicities with the same frequency, latest date and source of data, the ethnicity that occurs more frequently in the general population of England and Wales, according to the 2011 Census (see Appendix A), is selected. Such cases are very rare, and this step was introduced to automate the process and to produce exactly the same result each time the analysis is completed
  • a value of ‘ethnicity unknown’ will only be present if there are no known ethnicities in any of the HES data sets
  • to take into account the overrepresentation of the ‘other’ ethnic group, if the most common ethnic group assigned by the method above is ‘other’ then the second-most common usable ethnic group is assigned instead. A person will only be assigned to the ‘other’ ethnic group if there are no other usable ethnic groups
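
The selection rules above can be read as a ranking over a patient’s usable ethnic codes. The following is a minimal Python sketch under stated assumptions, not the OHID or NHS Digital implementation. The `Episode` record, the source labels (`APC`, `AE`, `OP`), the use of `Z` for ‘not stated’ and the function name `assign_ethnicity` are all hypothetical, and any code not listed in Appendix A is treated as unknown.

```python
from collections import Counter
from datetime import date
from typing import NamedTuple


class Episode(NamedTuple):
    code: str    # ethnic code as listed in Appendix A, e.g. "A" for White British
    when: date   # date of the HES episode
    source: str  # hypothetical labels: "APC", "AE" or "OP"


# Order column from Appendix A (1 = most common in the 2011 Census).
CENSUS_ORDER = {"A": 1, "C": 2, "H": 3, "J": 4, "N": 5, "L": 6, "M": 7, "B": 8,
                "K": 9, "D": 10, "R": 11, "F": 12, "G": 13, "P": 14, "E": 15, "S": 16}
SOURCE_RANK = {"APC": 0, "AE": 1, "OP": 2}  # APC preferred, then A&E, then OP
OTHER = "S"
NOT_STATED = "Z"     # assumption: code used when a patient declines to state
UNKNOWN = "unknown"  # assumption: label for 'ethnicity unknown'


def assign_ethnicity(episodes):
    """Pick one ethnic code for a patient from their HES episodes."""
    usable = [e for e in episodes if e.code in CENSUS_ORDER]
    if not usable:
        # 'Not stated' is only kept when no usable code exists at all.
        if any(e.code == NOT_STATED for e in episodes):
            return NOT_STATED
        return UNKNOWN

    counts = Counter(e.code for e in usable)

    def sort_key(code):
        latest = max(e.when for e in usable if e.code == code)
        best_source = min(SOURCE_RANK[e.source] for e in usable
                          if e.code == code and e.when == latest)
        # Smaller tuples win: frequency, then recency, then source, then Census order.
        return (-counts[code], -latest.toordinal(), best_source, CENSUS_ORDER[code])

    # 'Other' is only eligible when no other usable code is present.
    candidates = [c for c in counts if c != OTHER] or [OTHER]
    return min(candidates, key=sort_key)
```

Under this sketch, a patient with three ‘other’ (S) records and one ‘Indian’ (H) record would be assigned H, because ‘other’ is only chosen when it is the sole usable code.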

To note, it is perfectly valid for patients to decide to not state their ethnicity when this information is collected in hospital data. People may also decide to state their ethnicity on some occasions but not others. The original and new methods used for assigning ethnicity do not select ‘not stated’ records if there are alternative ethnic codes available. Only those who do not have a usable ethnic code and have repeatedly not stated their ethnicity will have the ethnicity ‘not stated’ recorded.

Impact of the change in method

The biggest impact of the change in method (for indicators of death and hospitalisations) has been on the ‘other’ ethnic group. In the report on disparities in the risks and outcomes of COVID-19, the highest mortality rates for deaths involving COVID-19 in the first wave of the pandemic were, by some margin, in the ‘other’ group (see Appendix B). That is not the case for the mortality rates presented in the CHIME tool. Across the pandemic period to date, the cumulative mortality rates (and hospital admission rates) using the new method of ethnicity assignment were highest for the black and Asian groups.

Method for assigning ethnicity - confirmed COVID-19 cases

A new method of assigning ethnicity to COVID-19 cases was implemented when the case definition for COVID-19 cases was updated.

Indicators looking at confirmed COVID-19 cases within the CHIME tool primarily use the ethnicity recorded during pillar 2 testing (swab testing for the wider population as part of the UK government testing programme). If the most recent ethnicity collected is a usable ethnic code, this will be the ethnicity used for the CHIME indicators. The most recent ethnicity collected is applied to all COVID-19 episodes for the same individual.

However, for all pillar 1 cases (swab testing in UK Health Security Agency labs and NHS hospitals for those with a clinical need or health and care workers), the ethnicity used will be determined through linkage to HES as detailed above.

In addition, any pillar 2 cases that did not have a usable ethnic code (for example, where the value recorded was ‘null’) were also assigned an ethnicity through linkage with HES.

‘Prefer not to say’ was introduced as a code for ethnicity in the confirmed cases data in 2021. ‘Prefer not to say’ is regarded as a usable ethnic code in this updated method of ethnicity assignment.
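
The rules for confirmed cases amount to a simple fallback: use the most recent pillar 2 ethnicity where it is usable, otherwise use the ethnicity assigned from HES. The sketch below illustrates that logic in Python. The function name, its parameters and the set of values treated as unusable are assumptions made for illustration and are not taken from the CHIME code.

```python
from typing import Optional

# Assumption: values treated as 'not usable' in the testing data.
UNUSABLE = {None, "", "null"}


def assign_case_ethnicity(pillar: int,
                          latest_test_ethnicity: Optional[str],
                          hes_ethnicity: str) -> str:
    """Choose the ethnicity applied to all of one person's COVID-19 episodes."""
    # Pillar 2: keep the most recent recorded ethnicity if it is usable.
    # 'Prefer not to say' counts as usable, so it is returned unchanged.
    if pillar == 2 and latest_test_ethnicity not in UNUSABLE:
        return latest_test_ethnicity
    # Pillar 1 cases, and pillar 2 cases without a usable code, fall back to HES.
    return hes_ethnicity
```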

The changes in case definition for the ethnicity breakdowns and ethnicity assignment were implemented in the CHIME tool in the update on 15 September 2022.

As the change in case definition and change in method of ethnicity assignment were implemented together, it is not possible to directly assess the impact of the latter using the data available in the CHIME tool. However, analysis of data using a comparable case definition showed that the change in method for ethnicity assignment resulted in an increase in cases reported as ‘ethnicity unknown’. The biggest relative change in specific ethnic groups was a decrease in the number of cases assigned to the ‘black other’ ethnic group.

Broad ethnic groups

Because of small populations for some ethnic groups in some regions, rates for hospital admissions, deaths and confirmed cases are only presented for detailed ethnic groups within England as a whole in the CHIME tool.

Rates are also presented for broad ethnic groups within England and within regions (but the rates for regions are cumulative and are not presented by month).

The broad ‘black or black British’ ethnic group is made up of the detailed groups: ‘black African’, ‘black Caribbean’ and any other ‘black’ background.

The broad ‘Asian or Asian British’ ethnic group is made up of the detailed groups: ‘Bangladeshi’, ‘Chinese’, ‘Indian’, ‘Pakistani’ and any other ‘Asian’ background.
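
As an illustration, the two broad groups described above can be expressed as a lookup from the detailed ethnic codes in Appendix A. This is a hedged sketch only: the dictionary and function names are hypothetical, and broad groups whose composition is not spelled out in this document are deliberately left unmapped.

```python
from typing import Optional

# Detailed ethnic codes (from Appendix A) that make up each broad group.
BROAD_GROUPS = {
    "black or black British": {"N", "M", "P"},            # African, Caribbean, other black
    "Asian or Asian British": {"K", "R", "H", "J", "L"},  # Bangladeshi, Chinese, Indian, Pakistani, other Asian
}


def broad_group(code: str) -> Optional[str]:
    """Return the broad group for a detailed code, or None if not listed above."""
    for name, codes in BROAD_GROUPS.items():
        if code in codes:
            return name
    return None
```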

Population estimates for ethnic groups

In the report on disparities in the risks and outcomes of COVID-19, the populations for ethnic groups were taken from population estimates from the ONS.

These estimates were based on population data from the 2011 Census that were ‘aged-on’ to create annual estimates for years up to 2018.

These estimates, however, had some limitations, including the fact that they did not take into account the effect of international migration since 2011 on the ethnic distribution of the population. ONS noted that this was likely to lead to an underestimate of the population in some ethnic groups, particularly the ‘Asian’ and ‘other’ groups, and that this underestimate would increase over time.

In the CHIME tool, an alternative set of populations has therefore been used for the rates of hospital admissions, deaths and confirmed cases by ethnic group.

These are estimates from the ETHPOP projections. These subnational population estimates by ethnic group were produced by Philip Rees and Paul Norman in the School of Geography, University of Leeds.

The estimates used are for 2019. These populations are modelled rather than being counts of individuals. They are thus presented in the CHIME tool at one decimal place, rather than as whole numbers.

It is anticipated that rates in the CHIME tool will be updated once population estimates for ethnic groups become available from the 2021 Census in England.

Appendix A

Table 1: proportions of population in England and Wales, split by ethnic group, Census 2011

Ethnicity | Ethnic code | Percentage | Order
White British | A | 80.5% | 1
White other (including Gypsy or Traveller) | C | 4.5% | 2
Indian | H | 2.5% | 3
Pakistani | J | 2.0% | 4
Black African | N | 1.8% | 5
Asian Other | L | 1.5% | 6
Black Caribbean | M | 1.1% | 7
White Irish | B | 0.9% | 8
Bangladeshi | K | 0.8% | 9
Mixed white or black Caribbean | D | 0.8% | 10
Chinese | R | 0.7% | 11
Mixed white or Asian | F | 0.6% | 12
Mixed other | G | 0.5% | 13
Black other | P | 0.5% | 14
Mixed white or black African | E | 0.3% | 15
Other | S | 1.0% | 16

Appendix B

Age-standardised mortality rates for all cause deaths and deaths mentioning COVID-19, 21 March 2020 to 1 May 2020, compared with baseline mortality rates (2014 to 2018), by ethnicity and sex, England

Figure 1: male age-standardised mortality rates for all cause deaths and deaths mentioning COVID-19, 21 March 2020 to 1 May 2020, compared with baseline mortality rates (2014 to 2018), by ethnicity and sex, England

Figure 2: female age-standardised mortality rates for all cause deaths and deaths mentioning COVID-19, 21 March 2020 to 1 May 2020, compared with baseline mortality rates (2014 to 2018), by ethnicity and sex, England

Source: disparities in the risks and outcomes of COVID-19

Change note: This page was updated on 15 September 2022 to reflect a change in the method of ethnicity assignment for confirmed cases of COVID-19, and the availability of published population estimates by ethnic group.