Consultation outcome

Standards for ethnicity data

Updated 17 April 2023

This is the draft consultation version of the standards.

Please use the revised standards for ethnicity data, published in April 2023, for guidance on best practice when collecting, analysing and reporting ethnicity data.

1. Introduction

The government’s Equality Hub has produced these standards for ethnicity data (from now on, “the standards”). They describe best practice when collecting, analysing and reporting ethnicity data.

They will be helpful to anyone interested in using ethnicity data, including people in:

  • the public sector
  • the private sector
  • the media
  • academia

1.1 The importance of ethnicity data

These standards contribute to action 6 of Inclusive Britain, the government’s action plan for racial equality:

To ensure more responsible and accurate reporting on race and ethnicity, the RDU will, by the end of 2022, consult on new standards for government departments and other public bodies on how to record, understand and communicate ethnicity data.

Ethnicity data has become important in recent years. A person’s ethnicity is often collected in government datasets. The 2017 Race Disparity Audit identified many of these datasets and showed disparities that impacted different aspects of people’s lives. This led to the creation of Ethnicity facts and figures. This website contains 180 government datasets about people from different ethnic groups.

As well as the amount of ethnicity data, the quality of ethnicity data is important.

Several publications have reiterated the importance of data quality in recent years.

Ethnicity data needs to be fit for purpose. This is so the government and other organisations can use the data to develop policies that reduce disparities between ethnic groups.

1.2 The standards and the Code of Practice for Statistics

Producers of official statistics should commit to the standards in the Code of Practice for Statistics. This gives confidence that government statistics:

  • have public value
  • are high quality
  • are produced by people and organisations that are trustworthy

The code formally applies to official statistics. It also sets good practice for everyone who works with statistics.

These ethnicity standards are based around the 3 pillars of the code:

  • value
  • quality
  • trustworthiness

The ethnicity standards focus on data quality. They give guidance on how to improve the quality of collection, analysis and reporting of ethnicity data. They also give more general guidance on trustworthiness and value.

1.3 Who the standards are for

The standards apply to people in government departments or public bodies who are:

  • collecting data about people’s ethnicity – for example, in surveys
  • analysing differences between ethnic groups
  • publishing ethnicity data – for example, in statistical releases

But they might also be useful to other people outside of the public sector who collect or use ethnicity data.

1.4 Using the standards

It can be difficult to collect ethnicity data. Asking people about their ethnicity is a sensitive topic. It can be more complicated than asking them about their age or country of birth, for example. This is often because ethnicity is based on a combination of factors. These can include someone’s country of birth, nationality, language, skin colour and religion. It is a self-defined and subjective concept – everyone has their own view about their own ethnicity.

Drawing conclusions from ethnicity data might also be difficult. The data might be unreliable because it is based on a small number of people. It might not be comparable with other datasets you are interested in. You might also have different uses for ethnicity data.

So, there are good reasons to use different ways to collect and analyse ethnicity data. These can depend on:

  • why the data is being collected
  • the questions you want to answer using the statistics
  • the budget or time you have available

However, using these standards as often as possible can help you collect, analyse and report ethnicity data more responsibly. It will help increase the comparability of data across government and wider society.

1.5 Monitoring the use and impact of the standards

The Equality Hub will monitor the use and impact of the standards, along with the Office for Statistics Regulation (OSR).

OSR can assess how data producers and users of ethnicity data are following the standards. OSR can also provide guidance on areas where collection, analysis and reporting of ethnicity data might be improved.

2. Key considerations: Quality

This section lists some of the important things you should consider when collecting, analysing, and reporting on ethnicity data.

It is not always possible to be definitive in some parts of the standards, for example, about the size of a survey sample needed to produce reliable results. This is because data quality is dependent on what you are analysing and the data you have available.

A dataset that is good quality for one analysis might not be good quality for another.

2.1 Data collection

Collecting ethnicity data should be a priority

The introduction to the standards noted the importance of ethnicity data. You should see the collection of ethnicity data as a priority, whether in a survey or administrative process. In some cases, collecting (and reporting) is an obligation.[footnote 1]

Feedback from users can help you decide:

  • the level of detail you need to collect (described in more detail below)
  • how often you need to collect the data

At the start, think about how you will use the ethnicity data you collect

Think about the following questions when designing a new ethnicity data collection or adding ethnicity to an existing one:

  • How robust do you need the results to be?
  • What survey sample sizes do you need for the analysis, for example, to be able to detect any significant differences between ethnic groups?
  • Do you need to collect data on specific groups, for example, ethnic groups and geographies? (This will impact the classifications used)
  • How often will you collect the data?
  • How much change do you think is likely over time?
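As a rough illustration of the sample size question above, the following sketch estimates how many respondents you might need in each of 2 groups to detect a difference between 2 percentages. It assumes a simple random sample, a two-sided test and the standard normal approximation. The function name and the default values for `alpha` and `power` are illustrative assumptions, not part of the standards.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(p1: float, p2: float,
                          alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate respondents needed in EACH group to detect p1 versus p2.

    Normal-approximation formula for comparing two independent proportions.
    Assumes simple random sampling; complex survey designs usually need more.
    """
    if p1 == p2:
        raise ValueError("p1 and p2 must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = NormalDist().inv_cdf(power)           # desired statistical power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)


# Invented example: detecting a difference between 20% and 25%
needed = sample_size_per_group(0.20, 0.25)
```

Smaller differences between groups need much larger samples. This is one reason why sample boosts are sometimes used for smaller ethnic groups.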

Some of this is likely to be easier with a sample survey than with administrative data collection.

If you are collecting administrative data and there is a strong user need, you should work with your stakeholders to add ethnicity to the data collection.

Collect ethnicity data using the GSS harmonised standards, or more detailed groups that you can align with the harmonised standards

The Government Statistical Service (GSS) develops and maintains the harmonised standards for ethnicity. You should use these standards to collect ethnicity data.

You can collect data using different or more detailed categories, as long as you can align them to the harmonised groups.

You should give an option for respondents to write in their ethnicity. This gives them a chance to respond if they don’t identify with any of the categories in the harmonised list.

If you are collecting data for the whole of the UK, you should use the UK harmonised standard. If this is not possible, then you should follow guidance from the GSS on which standard to use.

A Government Digital Service (GDS) set of design patterns exists for collecting equality information, including ethnicity. Using these patterns to collect equality information in a consistent way across the public sector makes data more coherent.

If you are commissioning data collection from other organisations, ensure that they also use harmonised standards during data collection.

You should provide reasons for not using the harmonised standards and explain any implications for use. This is required by the Code of Practice for Statistics.

Collect data on religion and national identity

Consider collecting data on national identity and religion. This improves the acceptability of the ethnicity question to respondents.

You should use the harmonised standards and GDS design patterns for these questions, and in the recommended order:

  • national identity
  • ethnic group
  • religion

Including the national identity and religion questions helps people to give details about their full cultural identity.

Ask people to report their own ethnicity, where possible

The best way for you to collect ethnicity data is to ask the person for it – for them to “self-report” their ethnicity.

This does not rule out someone else reporting a person’s ethnicity where they are not able to do it themselves. This is called “third-party” or “proxy” reporting and can include:

  • collecting ethnicity for a young child
  • using imputation to provide someone’s ethnicity
  • using visual appearance
  • using an algorithm based on a name and location

Ethnicity data collected by someone else will generally be of lower quality than when someone reports their own ethnicity. It might not reflect the ethnicity the person themselves would give.

Design data collections to increase response rates for different ethnic groups

You should use best practice when collecting data to increase response rates and reduce the amount of missing ethnicity data. For example, you might use translated materials and multilingual phone lines.

You should include wording to help respondents be clear on the benefits of giving their ethnicity and other personal information.

You should include national identity and religion questions to improve the acceptability of the ethnicity question.

Design data collections to increase the representativeness of ethnic groups

People from ethnic minority groups are underrepresented compared to the population in many data collections. This can reduce the quality of your data and what conclusions you can draw from it.

For administrative datasets, you might use best practice to ensure the distribution in your data collection reflects that of the latest census data, for example.

For surveys it might be better to have different (non-proportional) distributions for analysing differences between ethnic groups.

You can increase the number and proportion of people from different ethnic groups in surveys by using:

  • sample boosts
  • bespoke or local surveys
  • different survey techniques such as snowball sampling

Be mindful that clustered sample designs might lead to higher variance if the populations in sampled areas are homogeneous.

Use data linkage to improve ethnicity data quality

You can use data linkage to fill in incomplete records, or improve the quality of ethnicity classification in a dataset. This is particularly useful if the ethnicity records in the linked dataset are known to be more accurate or complete.
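A minimal sketch of filling in missing ethnicity from a linked dataset might look like the following. The field names `person_id`, `ethnicity` and `ethnicity_source` are hypothetical. Real linkage usually involves matching on several fields and checking match quality, which this sketch leaves out.

```python
def fill_missing_ethnicity(records, linked_lookup):
    """Return copies of records with missing ethnicity filled from linked data.

    Each returned record gets an 'ethnicity_source' field so that values
    taken from the linked dataset can be counted and reported separately.
    """
    filled = []
    for record in records:
        updated = dict(record)  # copy, so the original data is not changed
        if updated.get("ethnicity"):
            updated["ethnicity_source"] = "original"
        else:
            linked_value = linked_lookup.get(updated["person_id"])
            updated["ethnicity"] = linked_value
            updated["ethnicity_source"] = "linked" if linked_value else "missing"
        filled.append(updated)
    return filled


# Invented example data
records = [
    {"person_id": 1, "ethnicity": "white British"},
    {"person_id": 2, "ethnicity": None},
    {"person_id": 3, "ethnicity": None},
]
linked_lookup = {2: "black African"}  # person_id -> ethnicity in linked dataset
filled_records = fill_missing_ethnicity(records, linked_lookup)
```

Keeping a record of where each value came from makes it easier to report linkage rates later.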

2.2 Data analysis

Analysing ethnicity data should be a priority

The introduction to the standards noted the importance of ethnicity data, and analysis of it as a priority. In some cases, it might also be an obligation.

Weight survey data to correct for bias. You might include ethnicity as one of the weighting factors

You should weight your data to correct for bias in the collection or analysis of data. Bias can be due to different rates of non-response between different groups in the population.

These weights often include age, sex and geography but you might also include ethnicity as one of the weighting factors.
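The idea behind weighting can be sketched with a simple post-stratification example that uses a single weighting factor. The group labels and figures are invented. In practice, weights often combine several factors (for example, through raking), which this sketch does not attempt.

```python
def post_stratification_weights(sample_counts, population_shares):
    """Weight for each group = population share / sample share.

    Groups that are underrepresented in the sample get a weight above 1.
    Groups that are overrepresented get a weight below 1.
    """
    total = sum(sample_counts.values())
    weights = {}
    for group, count in sample_counts.items():
        if count == 0:
            raise ValueError(f"no respondents in {group}: cannot weight")
        weights[group] = population_shares[group] / (count / total)
    return weights


# Invented example: group_b is 30% of the population but 20% of the sample
sample_counts = {"group_a": 800, "group_b": 200}
population_shares = {"group_a": 0.7, "group_b": 0.3}
weights = post_stratification_weights(sample_counts, population_shares)
```

Here respondents in `group_b` would each count for about 1.5 people, and respondents in `group_a` for slightly less than 1.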

Use harmonised categories for analysing ethnicity data

You should use the GSS harmonised categories when analysing ethnicity data.

When reliable data for the full harmonised set of classifications is not available, then you should use the 5 aggregated groups:

  • white
  • black
  • Asian
  • mixed
  • other

If you combine ethnic groups in this way, you should note the limitations. For example, one limitation is that data for an aggregated group (the black group) can hide differences between the detailed ethnic groups (the black Caribbean and black African groups).
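The limitation described above can be shown with a short sketch. The mapping covers only the 2 detailed groups named in the example, not the full harmonised list, and the counts are invented purely to show the arithmetic.

```python
from collections import defaultdict

# Partial, illustrative mapping from detailed groups to aggregated groups
DETAILED_TO_AGGREGATED = {
    "black African": "black",
    "black Caribbean": "black",
}

# Invented counts: (people with the outcome, people in the group)
detailed_counts = {
    "black African": (30, 100),
    "black Caribbean": (10, 100),
}


def aggregate(counts, mapping):
    """Add up counts for detailed groups into their aggregated group."""
    totals = defaultdict(lambda: [0, 0])
    for detailed_group, (events, population) in counts.items():
        aggregated_group = mapping[detailed_group]
        totals[aggregated_group][0] += events
        totals[aggregated_group][1] += population
    return {group: tuple(values) for group, values in totals.items()}


def rates(counts):
    """Convert (events, population) pairs into rates."""
    return {group: events / population
            for group, (events, population) in counts.items()}


detailed_rates = rates(detailed_counts)
aggregated_rates = rates(aggregate(detailed_counts, DETAILED_TO_AGGREGATED))
```

In this invented example the aggregated rate is 20%, which hides detailed rates of 30% and 10%.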

You should avoid using binary categories in your analysis. An example of this is using white and other than white. Binary classifications have little analytical value.

Avoid aggregating data for ethnic groups together in a non-harmonised way. This is because it reduces the comparability of your analysis with other datasets.

If you are analysing data for the whole of the UK, you should use the UK harmonised standard. If this is not possible, then you should follow guidance from the GSS on which standard to use.

If you are commissioning data analysis from other organisations, ensure that they also use harmonised standards.

Use appropriate comparators in your analysis

You should use a range of comparators in your analysis. This is to ensure that your selected comparator does not risk being misleading.

You can use any ethnic group as the comparator. A larger comparator group makes some comparisons more reliable. In practice, the availability of data is often a main consideration.

RDU has used the white British group as a comparator. This is preferable to comparing to the white group as a whole because it can show any disparities associated with white minority groups, such as Gypsy, Roma and Irish Travellers.

Comparing with the white British group does require you to disaggregate your data below the white group. This might not always be possible.

Using the total population as your comparator is the most ‘neutral’ approach. This avoids the perception that the white or white British groups are some sort of ‘ideal’. This approach does include an element of comparing an ethnic group against itself, as the group will be in the comparator.

If you are analysing an ethnic group with a small population, such as the Gypsy and Irish Traveller group, this might be an acceptable compromise as the impact on the total will be small.
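The effect of including a group in its own comparator can be sketched as follows. The group labels and counts are invented, and the function names are illustrative.

```python
def relative_likelihood_vs_total(counts, group):
    """Group rate divided by the rate for the total population.

    The total population includes the group itself.
    """
    events, population = counts[group]
    total_events = sum(e for e, _ in counts.values())
    total_population = sum(n for _, n in counts.values())
    return (events / population) / (total_events / total_population)


def relative_likelihood_vs_rest(counts, group):
    """Group rate divided by the rate for everyone outside the group."""
    events, population = counts[group]
    rest_events = sum(e for g, (e, _) in counts.items() if g != group)
    rest_population = sum(n for g, (_, n) in counts.items() if g != group)
    return (events / population) / (rest_events / rest_population)


# Invented counts: (people with the outcome, people in the group)
example_counts = {"group_a": (50, 1000), "group_b": (300, 9000)}
vs_total = relative_likelihood_vs_total(example_counts, "group_a")
vs_rest = relative_likelihood_vs_rest(example_counts, "group_a")
```

Comparing with the total pulls the ratio towards 1, because the group contributes to both sides of the comparison. The smaller the group is relative to the total, the smaller this effect.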

Find out whether the geographic clustering of some ethnic groups has produced counterintuitive results

Consider whether the disproportionate concentration of some ethnic groups in urban areas has led to counterintuitive results. Examples include the ecological fallacy, Simpson’s paradox and the modifiable areal unit problem.

You should document issues like these in metadata associated with the analysis.
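Simpson’s paradox can be illustrated with a small invented dataset. In the sketch below, `group_a` has the higher rate within each type of area, but the lower rate overall, because most of its population lives in the area type with lower rates. All labels and counts are made up.

```python
# Invented counts: area -> group -> (people with the outcome, people in group)
counts_by_area = {
    "urban": {"group_a": (180, 900), "group_b": (15, 100)},
    "rural": {"group_a": (50, 100), "group_b": (405, 900)},
}


def rates_within_areas(data):
    """Rate for each group within each area."""
    return {
        area: {group: events / population
               for group, (events, population) in groups.items()}
        for area, groups in data.items()
    }


def overall_rates(data):
    """Rate for each group after adding all areas together."""
    totals = {}
    for groups in data.values():
        for group, (events, population) in groups.items():
            previous_events, previous_population = totals.get(group, (0, 0))
            totals[group] = (previous_events + events,
                             previous_population + population)
    return {group: events / population
            for group, (events, population) in totals.items()}


within = rates_within_areas(counts_by_area)
overall = overall_rates(counts_by_area)
```

Checking results both within areas and overall is one way to spot this kind of reversal before reporting.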

Consider whether you measure differences between ethnic groups by analysis of raw data or after adjustment to take into account other socio-economic and demographic factors, or both

You should consider whether you measure differences between ethnic groups by:

  • analysis of raw data
  • the remaining differences after taking other factors into account, for example through regression
  • both of these methods

For example, people in ethnic minority groups tend to be younger than white British people and are more likely to live in large urban areas. This can impact on your comparisons if the data is not adjusted for age and geography.
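One simple way to see the effect of age structure is direct age standardisation. This is a different and simpler technique than regression, and is shown here only as an illustration. The groups, age bands, counts and standard population are all invented.

```python
# Invented counts: age band -> (people with the outcome, people in the band)
younger_group = {"under_40": (40, 800), "40_and_over": (40, 200)}
older_group = {"under_40": (15, 300), "40_and_over": (140, 700)}

# Assumed standard population: share of people in each age band
standard_shares = {"under_40": 0.5, "40_and_over": 0.5}


def crude_rate(bands):
    """Unadjusted rate: all events divided by all people."""
    events = sum(e for e, _ in bands.values())
    population = sum(n for _, n in bands.values())
    return events / population


def age_standardised_rate(bands, shares):
    """Rate the group would have if it had the standard age structure."""
    return sum(shares[band] * events / population
               for band, (events, population) in bands.items())


crude = (crude_rate(younger_group), crude_rate(older_group))
adjusted = (age_standardised_rate(younger_group, standard_shares),
            age_standardised_rate(older_group, standard_shares))
```

In this invented example the crude rates differ (8% and 15.5%), but the age-standardised rates are the same. The whole raw difference is explained by age.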

You should investigate other statistical issues when using regression analysis, such as collinearity.

You should use correct techniques to address analytical questions. Different types of analytical adjustment can answer different questions.

You can use other methods to improve the reliability of ethnicity data, such as adding together more than one time period, or using rolling averages. Note any limitations of using these methods.
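Adding together time periods can be sketched as a rolling pooled rate. The years and counts below are invented, and the 3-year window is an assumption chosen for illustration.

```python
def rolling_pooled_rates(yearly, window=3):
    """Pool consecutive years and return one rate for each window.

    yearly: list of (year, events, respondents), ordered by year.
    Returns a list of (first_year, last_year, rate, respondents).
    """
    results = []
    for end in range(window - 1, len(yearly)):
        chunk = yearly[end - window + 1:end + 1]
        events = sum(e for _, e, _ in chunk)
        respondents = sum(n for _, _, n in chunk)
        results.append((chunk[0][0], chunk[-1][0],
                        events / respondents, respondents))
    return results


# Invented yearly data for one small ethnic group
yearly_data = [(2019, 12, 80), (2020, 9, 75), (2021, 15, 90), (2022, 11, 85)]
pooled = rolling_pooled_rates(yearly_data)
```

Each pooled estimate is based on about 3 times as many respondents as a single year. The trade-off is that consecutive estimates share data, so they are less timely and changes between them are smoothed out.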

2.3 Data reporting

Reporting of ethnicity data should be a priority

The introduction to these standards noted the importance of ethnicity data. Reporting ethnicity data should be seen as a priority. Sometimes it is an obligation.

Use GSS harmonised categories for reporting on ethnicity data

The same considerations apply here as in the data analysis section around:

  • using the correct harmonised standards
  • aggregating ethnic groups correctly
  • noting the limitations of the way you have aggregated ethnic groups

Report potential biases to allow users to understand limitations in the ethnicity data, and how this impacts on the interpretation of the analysis

You should report any biases in the metadata. These might be due to data collection, analysis or reporting. These should be included in any commentary. This allows users to understand any limitations of the data and the impact on the interpretation of your analysis.

In particular, you should report any risks and biases that may arise from the way administrative systems collect and categorise data. You should do this even if it is not always straightforward to change these systems.

You might report some of the following issues:

  • the proportion of ethnicity records that have been proxy reported and by whom
  • the proportion of imputed ethnicity records
  • the proportion of records with missing ethnicity
  • the response rates
  • the impact on ethnic groups of weighting data
  • the impact of any sample boosts on the reliability of estimates for different ethnic groups
  • the impact of data linkage, including reporting data linkage rates and accuracy for different ethnic groups
  • the consistency of ethnicity data over time
  • design factors for complex surveys
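Some of the proportions in this list can be produced with a short summary like the one below. It assumes each record has a hypothetical `reporting_method` field with one of 4 invented values. Your own systems will record this differently, if at all.

```python
from collections import Counter

# Hypothetical values for how each ethnicity record was obtained
REPORTING_METHODS = ("self", "proxy", "imputed", "missing")


def ethnicity_quality_summary(records):
    """Proportion of records obtained by each reporting method."""
    total = len(records)
    if total == 0:
        raise ValueError("no records to summarise")
    counts = Counter(record["reporting_method"] for record in records)
    return {method: counts[method] / total for method in REPORTING_METHODS}


# Invented example: 100 records
example_records = (
    [{"reporting_method": "self"}] * 70
    + [{"reporting_method": "proxy"}] * 15
    + [{"reporting_method": "imputed"}] * 5
    + [{"reporting_method": "missing"}] * 10
)
summary = ethnicity_quality_summary(example_records)
```

Producing the same summary for each ethnic group, and for each time period, can show whether data quality differs between groups or has changed over time.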

You should keep metadata up to date.

Report measures of reliability so users can correctly understand and interpret the data

You should report appropriate measures of reliability. This allows your users to make informed decisions about how to use your ethnicity analysis.

You might report the following:

  • a measure of the variability and bias of the ethnicity coding, for example from a special repeat study
  • confidence intervals
  • standard errors
  • coefficients of variation
  • sample sizes – both weighted and unweighted numerators and denominators
  • measures of relative likelihood
  • the use of overlapping confidence intervals or appropriate statistical tests to detect significant differences in data
  • how aggregating time periods has impacted on the reliability and timeliness of estimates
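Three of these measures can be sketched for a single percentage as follows. The sketch uses the simple normal-approximation (Wald) interval and assumes an unweighted simple random sample by default. The `design_effect` parameter is an assumption added for illustration. Other interval methods may be more suitable for small samples or percentages close to 0% or 100%.

```python
from math import sqrt
from statistics import NormalDist


def proportion_reliability(successes, n, confidence=0.95, design_effect=1.0):
    """Standard error, confidence interval and coefficient of variation."""
    if n <= 0:
        raise ValueError("n must be positive")
    estimate = successes / n
    # A design effect above 1 inflates the variance for complex survey designs
    standard_error = sqrt(design_effect * estimate * (1 - estimate) / n)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return {
        "estimate": estimate,
        "standard_error": standard_error,
        "ci_lower": max(0.0, estimate - z * standard_error),
        "ci_upper": min(1.0, estimate + z * standard_error),
        "coefficient_of_variation": (standard_error / estimate
                                     if estimate > 0 else float("inf")),
    }


# Invented example: 30 of 120 respondents in a group report the outcome
result = proportion_reliability(30, 120)
```

For this invented example, the estimate of 25% has a 95% confidence interval of roughly 17% to 33%. Reporting the interval alongside the estimate shows users how much uncertainty comes from the small sample.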

Consider whether you report differences using raw data, or report them after adjustment to take into account other socio-economic and demographic factors, or both

You should consider whether you report differences between ethnic groups by:

  • analysis of raw data
  • the remaining differences after taking other factors into account, for example through regression
  • both of these methods

You should understand any issues with reporting either of these ethnicity analyses, or with reporting both.

Be transparent in your reasons for using specific comparators

Whichever comparators you use (for example, another ethnic group or a time period) you should report the reasons for using them.

Report your reasons for comparison with specific distributions, such as the census of population.

Follow best practice when writing about ethnic groups - for example, the writing principles developed by RDU

RDU has developed guidance for writing about ethnicity that you can use in your reporting. The guidance shows:

  • words and phrases RDU uses
  • words and phrases RDU avoids, such as ‘BAME’
  • how RDU describes different ethnic groups
  • capitalisation of the names of ethnic groups

3. Key considerations: Trustworthiness

3.1 Data collection, reporting and analysis

Collect ethnicity data in a respectful way - it should support public interest

Your collection, analysis and reporting of ethnicity data should support a legitimate public interest. You should do this in the least intrusive way.

You should collect data in a respectful way. Understand the risks to data quality or survey response when asking for sensitive information. These might include the burden on survey respondents, or emotional impact, for example, in the case of children.

Understand what data can be legally collected about ethnicity, and comply with relevant legislation

If you are collecting ethnicity data, you should understand what data you can legally collect about an individual’s ethnicity. Follow relevant legislation.

Ethnicity data is classed as special category data under the UK General Data Protection Regulation (UK GDPR). Special category data is personal data that needs more protection because it is sensitive.

To lawfully process this data, a lawful basis under Article 6 of the UK GDPR and a separate condition for processing under Article 9 must be identified. These do not have to be linked.

Build capability

To improve ethnicity data quality, you should dedicate resources to building capability in assessing, improving and communicating data quality.

You might do this through training and sharing best practice.

Protect the privacy and identity of individuals in your data at all times

You must protect the privacy and identity of individuals in your data at all times. This is during data:

  • collection
  • storage
  • analysis
  • reporting

Be clear and open with people about how you will protect their information.

You should apply relevant security standards to keep data secure. If necessary, use disclosure control methods when releasing statistics.
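One common disclosure control method is to suppress small counts before publishing a table. The sketch below shows the idea only. The threshold of 10 and the `[c]` marker are assumptions chosen for illustration. You should follow your own organisation's disclosure control policy, which might also require secondary suppression so that hidden values cannot be worked out from totals.

```python
def suppress_small_counts(table, threshold=10, marker="[c]"):
    """Replace counts below the threshold with a suppression marker.

    table: mapping of group label -> count.
    The threshold and marker are illustrative assumptions, not a rule.
    """
    return {group: (count if count >= threshold else marker)
            for group, count in table.items()}


# Invented counts for a published table
published = suppress_small_counts({"group_a": 250, "group_b": 42, "group_c": 3})
```

In this invented example only the count for `group_c` is replaced by the marker.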

Regularly review your ethnicity data to ensure that it remains relevant

You should understand the public debate on data about ethnicity. This will help ensure your statistics stay relevant to a changing society.

Your ethnicity analyses and reports should be regularly reviewed with users and other stakeholders. This will help you prioritise any development of the data.

You might identify user needs that are impacted by how your ethnicity data are collected. You should consider how you can meet those needs in your work programme. This will involve working with stakeholders and subject experts.

4. Key considerations: Value

4.1 Data collection, reporting and analysis

Your ethnicity statistics should meet their intended uses and inform public debate

Your ethnicity statistics should meet their intended uses.

The statistics should inform public debate.

You should seek to understand your user base and the questions that users want to answer with your data.

Your supporting commentary should provide clarity and insight. It should describe any assumptions. This will enable your users to draw the correct conclusions from your data.

You can enhance your insight into ethnicity data by consultation with subject experts.

Users of ethnicity data are diverse and have different data needs. They will include community groups and leaders representing ethnic groups. You should understand whether they have other specific requirements. This might include the availability of information in different languages.

Enhance ethnicity statistics to meet new or evolving user needs

You should identify any evolving or new user needs for ethnicity statistics. You should try to enhance the data that inform these statistics to meet the user needs.

Where you cannot meet a user need, you should report why this is the case. You should also report anything in the existing data that will help these users.

Report new ethnicity datasets to the ONS Equalities Data Audit

You can increase the user value of data by adding new ethnicity data collections to the ONS Equalities Data Audit.

Make decisions about whether to continue, discontinue or adapt ethnicity data and statistics in discussion with users

You should make decisions about whether to continue, discontinue or change ethnicity data and statistics in discussion with users.

You should publish explanations of changes to data collections. The explanations should include evidence of the rationale for the change. You should also publish any analysis that informed the change.

Your decision-making processes should be transparent and open

There may be times when you are unable to meet the requests of everyone who has an interest in your statistics. In these cases, it is important to be open about your decision-making process. You should document evidence used to inform these decisions, particularly in relation to areas of contention.

5. Summary of ethnicity data standards

5.1 Quality

Data collection

  • Collecting ethnicity data should be a priority
  • At the start, think about how you will use the ethnicity data you collect
  • Collect ethnicity data using the GSS harmonised standards, or more detailed groups that you can align with the harmonised standards
  • Collect data on religion and national identity
  • Ask people to report their own ethnicity, where possible
  • Design data collections to increase response rates for different ethnic groups
  • Design data collections to increase the representativeness of ethnic groups
  • Use data linkage to improve ethnicity data quality

Data analysis

  • Analysing ethnicity data should be a priority
  • Weight survey data to correct for bias. You might include ethnicity as one of the weighting factors
  • Use harmonised categories for analysing ethnicity data
  • Use appropriate comparators in your analysis
  • Find out whether the geographic clustering of some ethnic groups has produced counterintuitive results
  • Consider whether you measure differences between ethnic groups by analysis of raw data or after adjustment to take into account other socio-economic and demographic factors, or both

Data reporting

  • Reporting of ethnicity data should be a priority
  • Use GSS harmonised categories for reporting on ethnicity data
  • Report potential biases to allow users to understand limitations in the ethnicity data, and how this impacts on the interpretation of the analysis
  • Report measures of reliability so users can correctly understand and interpret the data
  • Consider whether you report differences using raw data, or report them after adjustment to take into account other socio-economic and demographic factors, or both
  • Be transparent in your reasons for using specific comparators
  • Follow best practice when writing about ethnic groups - for example, the writing principles developed by RDU

5.2 Trustworthiness

Data collection, reporting and analysis

  • Collect ethnicity data in a respectful way - it should support public interest
  • Understand what data can be legally collected about ethnicity, and comply with relevant legislation
  • Build capability
  • Protect the privacy and identity of individuals in your data at all times
  • Regularly review your ethnicity data to ensure that it remains relevant

5.3 Value

Data collection, reporting and analysis

  • Your ethnicity statistics should meet their intended uses and inform public debate
  • Enhance ethnicity statistics to meet new or evolving user needs
  • Report new ethnicity datasets to the ONS Equalities Data Audit
  • Make decisions about whether to continue, discontinue or adapt ethnicity data and statistics in discussion with users
  • Your decision-making processes should be transparent and open