Standards for ethnicity data
Updated 17 April 2023
This is the draft consultation version of the standards.
Please use the revised standards for ethnicity data, published in April 2023, for guidance on best practice when collecting, analysing and reporting ethnicity data.
1. Introduction
The government’s Equality Hub has produced these standards for ethnicity data (from now on, “the standards”). They describe best practice when collecting, analysing and reporting ethnicity data.
They will be helpful to anyone interested in using ethnicity data, including people in:
- the public sector
- the private sector
- the media
- academia
1.1 The importance of ethnicity data
These standards contribute to action 6 of Inclusive Britain, the government’s action plan for racial equality:
“To ensure more responsible and accurate reporting on race and ethnicity, the RDU will, by the end of 2022, consult on new standards for government departments and other public bodies on how to record, understand and communicate ethnicity data.”
Ethnicity data has become important in recent years. A person’s ethnicity is often collected in government datasets. The 2017 Race Disparity Audit identified many of these datasets and showed disparities that impacted different aspects of people’s lives. This led to the creation of Ethnicity facts and figures. This website contains 180 government datasets about people from different ethnic groups.
As well as the amount of ethnicity data, the quality of ethnicity data is important.
The following publications have reiterated the importance of data quality in recent years:
- the Race Disparity Unit’s Quality Improvement Plan
- the report of the Commission on Race and Ethnic Disparities
- the Inclusive Britain report
- quarterly reports on progress to address COVID-19 health inequalities
- the Inclusive Data Taskforce report
Ethnicity data needs to be fit for purpose. This is so the government and other organisations can use the data to develop policies that reduce disparities between ethnic groups.
1.2 The standards and the Code of Practice for Statistics
Producers of official statistics should commit to the standards in the Code of Practice for Statistics. This gives confidence that government statistics:
- have public value
- are high quality
- are produced by people and organisations that are trustworthy
The code formally applies to official statistics. It also sets good practice for everyone who works with statistics.
These ethnicity standards are based around the 3 pillars of the code:
- value
- quality
- trustworthiness
The ethnicity standards focus on data quality. They give guidance on how to improve the quality of collection, analysis and reporting of ethnicity data. They also give more general guidance on trustworthiness and value.
1.3 Who the standards are for
The standards apply to people in government departments or public bodies who are:
- collecting data about people’s ethnicity – for example, in surveys
- analysing differences between ethnic groups
- publishing ethnicity data – for example, in statistical releases
But they might also be useful to other people outside of the public sector who collect or use ethnicity data.
1.4 Using the standards
It can be difficult to collect ethnicity data. Asking people about their ethnicity is a sensitive topic. It can be more complicated than asking them about their age or country of birth, for example. This is often because ethnicity is based on a combination of factors. These can include someone’s country of birth, nationality, language, skin colour and religion. It is a self-defined and subjective concept – everyone has their own view about their own ethnicity.
Drawing conclusions from ethnicity data can also be difficult. The data might be unreliable because it is based on a small number of people. It might not be comparable with other datasets you are interested in. You might also have different uses for ethnicity data.
So, there are good reasons to use different ways to collect and analyse ethnicity data. These can depend on:
- why the data is being collected
- the questions you want to answer using the statistics
- the budget or time you have available
However, using these standards as often as possible can help you collect, analyse and report ethnicity data more responsibly. It will help increase the comparability of data across government and wider society.
1.5 Monitoring the use and impact of the standards
The Equality Hub will monitor the use and impact of the standards, along with the Office for Statistics Regulation (OSR).
OSR can assess how data producers and users of ethnicity data are following the standards. OSR can also provide guidance on areas where collection, analysis and reporting of ethnicity data might be improved.
2. Key considerations: Quality
This section lists some of the important things you should consider when collecting, analysing, and reporting on ethnicity data.
It is not always possible to be definitive in some parts of the standards, for example on the size of survey sample needed to produce reliable results. This is because data quality depends on what you are analysing and the data you have available.
A dataset that is good quality for one analysis might not be good quality for another.
2.1 Data collection
Collecting ethnicity data should be a priority
The introduction to the standards noted the importance of ethnicity data. You should see the collection of ethnicity data as a priority, whether in a survey or administrative process. In some cases, collecting (and reporting) is an obligation.[footnote 1]
Feedback from users can help you decide:
- the level of detail you need to collect (described in more detail below)
- how often you need to collect the data
At the start, think about how you will use the ethnicity data you collect
Think about the following questions when designing a new ethnicity data collection or adding ethnicity to an existing one:
- How robust do you need the results to be?
- What survey sample sizes do you need for the analysis, for example, to be able to detect any significant differences between ethnic groups?
- Do you need to collect data on specific groups, for example, ethnic groups and geographies? (This will impact the classifications used.)
- How often will you collect the data?
- How much change do you think is likely over time?
Some of this is likely to be easier with a sample survey than with administrative data collection.
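As a rough illustration of the sample size question, the sketch below estimates how many respondents you might need in each of 2 ethnic groups to detect a difference between 2 proportions. It assumes simple random sampling and a two-sided test using the normal approximation. The function name, the default significance level and power, and the example rates are assumptions for illustration and are not part of the standards.

```python
import math
from statistics import NormalDist


def sample_size_per_group(p1: float, p2: float,
                          alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate respondents needed in each group to detect p1 versus p2.

    Uses the normal-approximation formula for comparing two independent
    proportions. Assumes simple random sampling, so a clustered or
    weighted survey design would usually need a larger sample.
    """
    if p1 == p2:
        raise ValueError("p1 and p2 must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)           # desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2
    return math.ceil(n)


# Hypothetical rates: 20% in one ethnic group and 25% in another.
n_needed = sample_size_per_group(0.20, 0.25)
```

Smaller differences between groups need much larger samples, which is one reason to think about the analysis before you design the collection.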
If you are collecting administrative data, you should use your stakeholder relationships to change data collections to collect ethnicity data if there is a strong user need.
Collect ethnicity data using the GSS harmonised standards, or more detailed groups that you can align with the harmonised standards
The Government Statistical Service (GSS) develops and maintains the harmonised standards for ethnicity. You should use these standards to collect ethnicity data.
You can collect data using different or more detailed categories, as long as you can align them to the harmonised groups.
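One way to check that more detailed categories align with the harmonised groups is to keep an explicit lookup and flag anything that does not map. The sketch below is illustrative only. The local categories are hypothetical and the harmonised labels are a small subset, so check them against the current GSS harmonised standard before use.

```python
# Hypothetical local categories mapped to an illustrative subset of
# harmonised categories. Verify labels against the GSS harmonised standard.
LOCAL_TO_HARMONISED = {
    "Indian": "Indian",
    "Pakistani": "Pakistani",
    "Bangladeshi": "Bangladeshi",
    "Chinese": "Chinese",
    "Sri Lankan": "Any other Asian background",
    "Nepali": "Any other Asian background",
}

UNMAPPED = "UNMAPPED: needs review"


def to_harmonised(local_category: str) -> str:
    """Return the harmonised category, or a flag if no mapping exists."""
    return LOCAL_TO_HARMONISED.get(local_category.strip(), UNMAPPED)


def unmapped_categories(categories):
    """List the distinct local categories that have no harmonised mapping."""
    return sorted({c for c in categories if to_harmonised(c) == UNMAPPED})


responses = ["Indian", "Nepali", "Tamil"]  # "Tamil" is deliberately unmapped
harmonised = [to_harmonised(r) for r in responses]
needs_review = unmapped_categories(responses)
```

Flagging unmapped values, rather than silently placing them in an "other" category, makes it easier to explain any departures from the harmonised standards.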
You should give an option for respondents to write in their ethnicity. This gives them a chance to respond if they don’t identify with any of the categories in the harmonised list.
If you are collecting data for the whole of the UK, you should use the UK harmonised standard. If this is not possible, then you should follow guidance from the GSS on which standard to use.
A Government Digital Service (GDS) set of design patterns exists for collecting equality information, including ethnicity. Using these patterns to collect equality information in a consistent way across the public sector makes data more coherent.
If you are commissioning data collection to other organisations, ensure that they also use harmonised standards during data collection.
You should provide reasons for not using the harmonised standards and explain any implications for use. This is required by the Code of Practice for Statistics.
Supporting evidence and guidance
- Ethnicity harmonised standard
- RDU publication: how different or similar are aggregated ethnic groups?
- Measuring equality: A guide for the collection and classification of ethnic group, national identity and religion data in the UK
- GDS design patterns
- RDU blog post: why data harmonisation is important
- GSS blog post: Harmonising decentralised data collections
Collect data on religion and national identity
Consider collecting data on national identity and religion. This improves the acceptability of the ethnicity question to respondents.
You should use the harmonised standards and GDS design patterns for these questions, and in the recommended order:
- national identity
- ethnic group
- religion
Including the national identity and religion questions helps people to give details about their full cultural identity.
Ask people to report their own ethnicity, where possible
The best way for you to collect ethnicity data is to ask the person for it – for them to “self-report” their ethnicity.
This does not rule out someone else reporting a person’s ethnicity where they are not able to do it themselves. This is called “third-party” or “proxy” reporting and can include:
- collecting ethnicity for a young child
- using imputation to provide someone’s ethnicity
- using visual appearance
- using an algorithm based on a name and location
Ethnicity data collected by someone else will generally be of lower quality than when someone reports their own ethnicity. It might not reflect the ethnicity the person would report themselves.
Supporting evidence and guidance
- Principles and Recommendations for Population and Housing Censuses
- RDU publication: The relative quality of self-reported and proxy-reported ethnicity data
- The ethnicity of the deceased person: the apparent quality of the data that are collected when deaths are registered
- Attributing someone’s ethnicity using their name
Design data collections to increase response rates for different ethnic groups
You should use best practice when collecting data to increase response rates and reduce the amount of missing ethnicity data. For example, you might use translated materials and multilingual phone lines.
You should include wording to help respondents be clear on the benefits of giving their ethnicity and other personal information.
You should include national identity and religion questions to improve the acceptability of the ethnicity question.
Design data collections to increase the representativeness of ethnic groups
People from ethnic minority groups are underrepresented compared to the population in many data collections. This can reduce the quality of your data and limit the conclusions you can draw from it.
For administrative datasets, you might use best practice to ensure the distribution in your data collection reflects that of the latest census data, for example.
For surveys it might be better to have different (non-proportional) distributions for analysing differences between ethnic groups.
You can increase the number and proportion of people from different ethnic groups in surveys by using:
- sample boosts
- bespoke or local surveys
- different survey techniques such as snowball sampling
Be mindful that clustered sample designs might lead to higher variance if the populations in sampled areas are homogeneous.
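The effect of clustering on variance is often summarised with a design effect. The sketch below uses Kish's approximation, deff = 1 + (m - 1) × rho, where m is the average number of respondents per cluster and rho is the intra-cluster correlation. The function names and example values are assumptions for illustration.

```python
def design_effect(cluster_size: float, icc: float) -> float:
    """Kish's approximation: deff = 1 + (m - 1) * rho."""
    return 1 + (cluster_size - 1) * icc


def effective_sample_size(n: int, cluster_size: float, icc: float) -> float:
    """Size of a simple random sample that would give similar precision."""
    return n / design_effect(cluster_size, icc)


# Hypothetical survey: 2,000 interviews, 20 per sampled area, rho = 0.05.
deff = design_effect(20, 0.05)
n_eff = effective_sample_size(2000, 20, 0.05)
```

The more similar people within a sampled area are to each other, the higher rho is and the smaller the effective sample size becomes.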
Use data linkage to improve ethnicity data quality
You can use data linkage to fill in incomplete records, or improve the quality of ethnicity classification in a dataset. This is particularly useful if the ethnicity records in the linked dataset are known to be more accurate or complete.
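A minimal sketch of filling in missing ethnicity from a linked dataset is shown below. It assumes both datasets share a reliable identifier (here a hypothetical `person_id`) and records where each value came from, so the linkage rate can be reported later. The field names and records are made up for illustration.

```python
def fill_missing_ethnicity(records, linked_lookup):
    """Fill missing ethnicity from a linked dataset and record the source."""
    filled = []
    for rec in records:
        out = dict(rec)  # copy so the input records are not changed
        if out.get("ethnicity") is not None:
            out["ethnicity_source"] = "original"
        elif out["person_id"] in linked_lookup:
            out["ethnicity"] = linked_lookup[out["person_id"]]
            out["ethnicity_source"] = "linked"
        else:
            out["ethnicity_source"] = "missing"
        filled.append(out)
    return filled


# Hypothetical records and linked lookup.
records = [
    {"person_id": 1, "ethnicity": "Indian"},
    {"person_id": 2, "ethnicity": None},
    {"person_id": 3, "ethnicity": None},
]
linked_lookup = {2: "Chinese"}

filled = fill_missing_ethnicity(records, linked_lookup)
previously_missing = [r for r in records if r["ethnicity"] is None]
linked = [r for r in filled if r["ethnicity_source"] == "linked"]
linkage_rate = len(linked) / len(previously_missing)
```

Keeping the source of each value makes it possible to report how much of the final ethnicity data came from linkage.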
2.2 Data analysis
Analysing ethnicity data should be a priority
The introduction to the standards noted the importance of ethnicity data. You should see the analysis of ethnicity data as a priority. In some cases, it might also be an obligation.
Weight survey data to correct for bias. You might include ethnicity as one of the weighting factors
You should weight your data to correct for bias in the collection or analysis of data. Bias can be due to different rates of non-response between different groups in the population.
These weights often include age, sex and geography but you might also include ethnicity as one of the weighting factors.
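As a simple illustration, the sketch below calculates post-stratification weights so that the weighted sample matches known population shares. It uses a single weighting factor with hypothetical group labels and shares. Real weighting schemes usually combine several factors, for example through raking, and this is not a description of any particular survey's method.

```python
from collections import Counter


def poststratification_weights(sample_groups, population_shares):
    """Weight for each group = population share / sample share."""
    counts = Counter(sample_groups)
    n = len(sample_groups)
    return {g: population_shares[g] / (counts[g] / n) for g in counts}


# Hypothetical sample in which Group B is under-represented.
sample_groups = ["Group A"] * 80 + ["Group B"] * 20
population_shares = {"Group A": 0.7, "Group B": 0.3}

weights = poststratification_weights(sample_groups, population_shares)
weighted_total = sum(weights[g] for g in sample_groups)
```

In this example respondents in the under-represented group get a weight above 1, and the weighted total still equals the original sample size.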
Use harmonised categories for analysing ethnicity data
You should use the GSS harmonised categories when analysing ethnicity data.
When reliable data for the full harmonised set of classifications is not available, then you should use the 5 aggregated groups:
- white
- black
- Asian
- mixed
- other
If you combine ethnic groups in this way, you should note the limitations. For example, one limitation is that data for an aggregated group (the black group) can hide differences between the detailed ethnic groups (the black Caribbean and black African groups).
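The sketch below shows how an aggregated figure can hide differences between detailed groups. The group names and counts are placeholders made up for illustration and do not describe any real data.

```python
# Placeholder counts for two detailed groups within one aggregated group.
detailed = {
    "Detailed group X1": {"aggregated": "Group X", "events": 100, "people": 1000},
    "Detailed group X2": {"aggregated": "Group X", "events": 300, "people": 1000},
}


def detailed_rates(data):
    """Rate for each detailed group."""
    return {name: d["events"] / d["people"] for name, d in data.items()}


def aggregated_rates(data):
    """Rate for each aggregated group, pooling events and people."""
    totals = {}
    for d in data.values():
        events, people = totals.get(d["aggregated"], (0, 0))
        totals[d["aggregated"]] = (events + d["events"], people + d["people"])
    return {group: e / p for group, (e, p) in totals.items()}


by_detail = detailed_rates(detailed)
by_aggregate = aggregated_rates(detailed)
```

Here the aggregated rate sits between 2 quite different detailed rates, which is the kind of limitation you should note when you report aggregated groups.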
You should avoid using binary categories in your analysis. An example of this is using white and other than white. Binary classifications have little analytical value.
Avoid aggregating data for ethnic groups together in a non-harmonised way. This is because it reduces the comparability of your analysis with other datasets.
If you are analysing data for the whole of the UK, you should use the UK harmonised standard. If this is not possible, then you should follow guidance from the GSS on which standard to use.
If you are commissioning data analysis to other organisations, ensure that they also use harmonised standards.
Use appropriate comparators in your analysis
You should use a range of comparators in your analysis. This is to ensure that your selected comparator does not risk being misleading.
You can use any ethnic group as the comparator. A larger comparator group makes some comparisons more reliable. In practice, the availability of data is often a main consideration.
RDU has used the white British group as a comparator. This is preferable to comparing to the white group as a whole because it can show any disparities associated with white minority groups, such as Gypsy, Roma and Irish Travellers.
Comparing with the white British group does require you to disaggregate your data below the white group. This might not always be possible.
Using the total population as your comparator is the most ‘neutral’ approach. This avoids the perception that the white or white British groups are some sort of ‘ideal’. This approach does include an element of comparing an ethnic group against itself, as the group will be in the comparator.
If you are analysing an ethnic group with a small population, such as the Gypsy and Irish Traveller group, this might be an acceptable compromise as the impact on the total will be small.
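The arithmetic behind this compromise can be sketched as follows. When a group is included in the total population comparator, the comparison is pulled towards 1, and the effect grows with the group's share of the population. The function names and example rates are hypothetical.

```python
def ratio_vs_rest(rate_group: float, rate_rest: float) -> float:
    """Compare a group with everyone outside the group."""
    return rate_group / rate_rest


def ratio_vs_total(rate_group: float, rate_rest: float, share: float) -> float:
    """Compare a group with the total population, which includes the group."""
    rate_total = share * rate_group + (1 - share) * rate_rest
    return rate_group / rate_total


# Hypothetical rates: 20% in the group and 10% in the rest of the population.
small_group = ratio_vs_total(0.20, 0.10, share=0.01)  # group is 1% of total
large_group = ratio_vs_total(0.20, 0.10, share=0.40)  # group is 40% of total
excluding_group = ratio_vs_rest(0.20, 0.10)
```

For the small group the ratio against the total is close to the ratio against the rest of the population. For the large group it is noticeably lower.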
Find out whether the geographic clustering of some ethnic groups has produced counterintuitive results
Consider whether the disproportionate concentration of some ethnic groups in urban areas has led to counterintuitive results. Examples include the ecological fallacy, Simpson’s paradox and the modifiable areal unit problem.
You should document issues like these in metadata associated with the analysis.
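A small worked example of Simpson's paradox is sketched below. The counts are made up to show the effect. Group A has the higher rate in both area types, but the lower rate overall, because most of Group A lives in the area type with lower rates.

```python
# Made-up counts: (people with the outcome, people in total).
counts = {
    "urban": {"Group A": (30, 100), "Group B": (2, 10)},
    "rural": {"Group A": (9, 10), "Group B": (80, 100)},
}


def rates_by_area(data):
    """Outcome rate for each group within each area type."""
    return {
        area: {group: events / people for group, (events, people) in groups.items()}
        for area, groups in data.items()
    }


def overall_rates(data):
    """Outcome rate for each group after pooling all areas."""
    totals = {}
    for groups in data.values():
        for group, (events, people) in groups.items():
            total_events, total_people = totals.get(group, (0, 0))
            totals[group] = (total_events + events, total_people + people)
    return {group: e / p for group, (e, p) in totals.items()}


within = rates_by_area(counts)
overall = overall_rates(counts)
```

Checking results within areas as well as overall is one way to spot this kind of reversal before you publish.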
Consider whether you measure differences between ethnic groups by analysis of raw data or after adjustment to take into account other socio-economic and demographic factors, or both
You should consider whether you measure differences between ethnic groups by:
- analysis of raw data
- the remaining differences after taking other factors into account, for example through regression
- both of these methods
For example, people in ethnic minority groups tend to be younger than white British people and are more likely to live in large urban areas. This can impact on your comparisons if the data is not adjusted for age and geography.
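Direct age standardisation is one simple form of adjustment. The sketch below applies each group's age-specific rates to the same standard population. The groups, age bands, counts and standard population are hypothetical, and the standards do not prescribe this particular method.

```python
# Share of the standard population in each age band (hypothetical).
STANDARD_POPULATION = {"under 40": 0.5, "40 and over": 0.5}

# Made-up counts by age band: (people with the outcome, people in total).
groups = {
    "Group A": {"under 40": (40, 800), "40 and over": (40, 200)},
    "Group B": {"under 40": (10, 200), "40 and over": (160, 800)},
}


def crude_rate(strata):
    """Unadjusted rate across all age bands."""
    events = sum(e for e, _ in strata.values())
    people = sum(p for _, p in strata.values())
    return events / people


def standardised_rate(strata, standard):
    """Age-specific rates weighted by the standard population shares."""
    return sum(standard[age] * e / p for age, (e, p) in strata.items())


crude = {g: crude_rate(s) for g, s in groups.items()}
adjusted = {g: standardised_rate(s, STANDARD_POPULATION) for g, s in groups.items()}
```

In this example the crude rates differ, but the adjusted rates are the same, because the 2 groups have identical age-specific rates and differ only in age structure.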
You should investigate other statistical issues when using regression analysis, such as collinearity.
You should use correct techniques to address analytical questions. Different types of analytical adjustment can answer different questions.
You can use other methods to improve the reliability of ethnicity data, such as adding together more than one time period, or using rolling averages. Note any limitations of using these methods.
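Adding time periods together can be sketched as pooling counts over a rolling window. The example below uses made-up yearly counts for one small group and a hypothetical window of 3 years.

```python
def rolling_pooled_rates(yearly, window=3):
    """Pool events and respondents over consecutive years.

    `yearly` is a list of (year, events, respondents) in date order.
    Returns a list of (period label, pooled rate, pooled respondents).
    """
    results = []
    for i in range(len(yearly) - window + 1):
        chunk = yearly[i:i + window]
        events = sum(e for _, e, _ in chunk)
        respondents = sum(n for _, _, n in chunk)
        label = f"{chunk[0][0]}-{chunk[-1][0]}"
        results.append((label, events / respondents, respondents))
    return results


# Made-up yearly counts for one small ethnic group.
yearly = [(2019, 4, 50), (2020, 6, 60), (2021, 5, 40), (2022, 9, 50)]
pooled = rolling_pooled_rates(yearly)
```

Each pooled estimate is based on more respondents than a single year, but consecutive estimates share data, so they are less timely and are not independent of each other.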
2.3 Data reporting
Reporting of ethnicity data should be a priority
The introduction to these standards noted the importance of ethnicity data. Reporting ethnicity data should be seen as a priority. Sometimes it is an obligation.
Use GSS harmonised categories for reporting on ethnicity data
The same considerations apply here as in the data analysis section around:
- using the correct harmonised standards
- aggregating ethnic groups correctly
- noting the limitations of the way you have aggregated ethnic groups
Report potential biases to allow users to understand limitations in the ethnicity data, and how this impacts on the interpretation of the analysis
You should report any biases in the metadata. These might be due to data collection, analysis or reporting. You should also include them in any commentary. This allows users to understand any limitations of the data and the impact on the interpretation of your analysis.
In particular, you should report any risks and biases that may arise from the way administrative systems collect and categorise data. You should do this even if it is not always straightforward to change these systems.
You might report some of the following issues:
- the proportion of ethnicity records that have been proxy reported and by whom
- the proportion of imputed ethnicity records
- the proportion of records with missing ethnicity
- the response rates
- the impact on ethnic groups of weighting data
- the impact of any sample boosts on the reliability of estimates for different ethnic groups
- the impact of data linkage, including reporting data linkage rates and accuracy for different ethnic groups
- the consistency of ethnicity data over time
- design factors for complex surveys
You should keep metadata up to date.
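Some of these proportions are straightforward to calculate if each record stores how its ethnicity value was obtained. The sketch below assumes a hypothetical `ethnicity_source` field with 4 possible values. Your own systems might record this differently.

```python
from collections import Counter

SOURCES = ("self", "proxy", "imputed", "missing")


def ethnicity_quality_summary(records):
    """Proportion of records by how the ethnicity value was obtained."""
    counts = Counter(r["ethnicity_source"] for r in records)
    n = len(records)
    return {source: counts.get(source, 0) / n for source in SOURCES}


# Made-up records for illustration.
records = (
    [{"ethnicity_source": "self"}] * 70
    + [{"ethnicity_source": "proxy"}] * 15
    + [{"ethnicity_source": "imputed"}] * 5
    + [{"ethnicity_source": "missing"}] * 10
)
summary = ethnicity_quality_summary(records)
```

Publishing a summary like this in the metadata lets users judge how far the data relies on proxy-reported, imputed or missing ethnicity.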
Report measures of reliability so users can correctly understand and interpret the data
You should report appropriate measures of reliability. This allows your users to make informed decisions about how to use your ethnicity analysis.
You might report the following:
- a measure of the variability and bias of the ethnicity coding, for example from a special repeat study
- confidence intervals
- standard errors
- coefficients of variation
- sample sizes – both weighted and unweighted numerators and denominators
- measures of relative likelihood
- the use of overlapping confidence intervals or appropriate statistical tests to detect significant differences in data
- how aggregating time periods has impacted on the reliability and timeliness of estimates
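Several of these measures can be sketched for a simple proportion. The example below assumes simple random sampling and uses the normal approximation for the confidence interval. Complex survey designs and small samples usually need other methods. The function names and counts are hypothetical.

```python
import math
from statistics import NormalDist


def proportion_reliability(successes: int, n: int, level: float = 0.95):
    """Estimate, standard error, confidence interval and coefficient of variation."""
    p = successes / n
    se = math.sqrt(p * (1 - p) / n)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return {
        "estimate": p,
        "standard_error": se,
        "confidence_interval": (p - z * se, p + z * se),
        "coefficient_of_variation": se / p if p > 0 else float("inf"),
        "sample_size": n,
    }


def relative_likelihood(rate_group: float, rate_comparator: float) -> float:
    """How many times as likely the outcome is in one group as in the comparator."""
    return rate_group / rate_comparator


def intervals_overlap(a, b) -> bool:
    """True if two (lower, upper) intervals overlap."""
    return a[0] <= b[1] and b[0] <= a[1]


# Made-up counts: 80 of 400 in one group, 60 of 600 in the comparator.
group = proportion_reliability(80, 400)
comparator = proportion_reliability(60, 600)
ratio = relative_likelihood(group["estimate"], comparator["estimate"])
overlap = intervals_overlap(group["confidence_interval"],
                            comparator["confidence_interval"])
```

Intervals that do not overlap suggest a significant difference. Intervals that do overlap are not conclusive on their own, so an appropriate statistical test is the better check in that case.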
Consider whether you report differences using raw data, or report them after adjustment to take into account other socio-economic and demographic factors, or both
You should consider whether you report differences between ethnic groups by:
- analysis of raw data
- the remaining differences after taking other factors into account, for example through regression
- both of these methods
You should understand any issues with reporting either of these ethnicity analyses, or with reporting both.
Be transparent in your reasons for using specific comparators
Whichever comparators you use (for example, another ethnic group or a time period) you should report the reasons for using them.
Report your reasons for comparison with specific distributions, such as the census of population.
Follow best practice when writing about ethnic groups - for example, the writing principles developed by RDU
RDU has developed guidance for writing about ethnicity that you can use in your reporting. The guidance shows:
- words and phrases RDU uses
- words and phrases RDU avoids, such as ‘BAME’
- how RDU describes different ethnic groups
- capitalisation of the names of ethnic groups
3. Key considerations: Trustworthiness
3.1 Data collection, reporting and analysis
Collect ethnicity data in a respectful way - it should support public interest
Your collection, analysis and reporting of ethnicity data should support a legitimate public interest. You should do this in the least intrusive way.
You should collect data in a respectful way. Understand the risks to data quality or survey response when asking for sensitive information. These might include the burden on survey respondents, or emotional impact, for example, in the case of children.
Understand what data can be legally collected about ethnicity, and comply with relevant legislation
If you are collecting ethnicity data, you should understand what data you can legally collect about an individual’s ethnicity. Follow relevant legislation.
Ethnicity data is classed as special category data under the UK General Data Protection Regulation (UK GDPR). Special category data is personal data that needs more protection because it is sensitive.
To process this data lawfully, you must identify a lawful basis under Article 6 of the UK GDPR and a separate condition for processing under Article 9. These do not have to be linked.
Build capability
To improve ethnicity data quality, you should dedicate resources to building capability in assessing, improving and communicating it.
You might do this through training and sharing best practice.
Protect the privacy and identity of individuals in your data at all times
You must protect the privacy and identity of individuals in your data at all times. This is during data:
- collection
- storage
- analysis
- reporting
Be clear and open with people about how you will protect their information.
You should apply relevant security standards to keep data secure. If necessary, use disclosure control methods when releasing statistics.
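One basic disclosure control method is to suppress small counts before release. The sketch below is illustrative only. The threshold of 10 and the `[c]` marker are assumptions, so follow your organisation's disclosure control policy. It also covers primary suppression only, and further suppression might be needed where a hidden value could be worked out from totals.

```python
SUPPRESSED = "[c]"  # assumed marker for a suppressed (confidential) value


def suppress_small_counts(table, threshold=10):
    """Replace counts below the threshold with a suppression marker."""
    return {
        group: (count if count >= threshold else SUPPRESSED)
        for group, count in table.items()
    }


# Made-up counts for illustration.
counts = {"Group A": 250, "Group B": 42, "Group C": 3}
released = suppress_small_counts(counts)
```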
Supporting evidence and guidance
- ICO special category data: asking people about their ethnicity
Regularly review your ethnicity data to ensure that it remains relevant
You should understand the public debate on data about ethnicity. This will help ensure your statistics stay relevant to a changing society.
Your ethnicity analyses and reports should be regularly reviewed with users and other stakeholders. This will help you prioritise any development of the data.
You might identify user needs that are impacted by how your ethnicity data is collected. You should consider how you can meet those needs in your work programme. This will involve working with stakeholders and subject experts.
4. Key considerations: Value
4.1 Data collection, reporting and analysis
Your ethnicity statistics should meet their intended uses and inform public debate
Your ethnicity statistics should meet their intended uses.
The statistics should inform public debate.
You should seek to understand your user base and the questions that users want to answer with your data.
Your supporting commentary should provide clarity and insight. It should describe any assumptions. This will enable your users to draw the correct conclusions from your data.
You can enhance your insight into ethnicity data by consultation with subject experts.
Users of ethnicity data are diverse and have different data needs. They will include community groups and leaders representing ethnic groups. You should understand whether they have other specific requirements. This might include the availability of information in different languages.
Enhance ethnicity statistics to meet new or evolving user needs
You should identify any evolving or new user needs for ethnicity statistics. You should try to enhance the data that informs these statistics to meet the user needs.
Where you cannot meet a user need, you should report why this is the case. You should also report anything in the existing data that will help these users.
Report new ethnicity datasets to the ONS Equalities Data Audit
You can increase the user value of data by adding new ethnicity data collections to the ONS Equalities Data Audit.
Make decisions about whether to continue, discontinue or adapt ethnicity data and statistics in discussion with users
You should make decisions about whether to continue, discontinue or change ethnicity data and statistics in discussion with users.
You should publish explanations of changes to data collections. The explanations should include evidence of the rationale for the change. You should also publish any analysis that informed the change.
Your decision-making processes should be transparent and open
There may be times when you are unable to meet the requests of everyone who has an interest in your statistics. In these cases, it is important to be open about your decision-making process. You should document evidence used to inform these decisions, particularly in relation to areas of contention.
5. Summary of ethnicity data standards
5.1 Quality
Data collection
- Collecting ethnicity data should be a priority
- At the start, think about how you will use the ethnicity data you collect
- Collect ethnicity data using the GSS harmonised standards, or more detailed groups that you can align with the harmonised standards
- Collect data on religion and national identity
- Ask people to report their own ethnicity, where possible
- Design data collections to increase response rates for different ethnic groups
- Design data collections to increase the representativeness of ethnic groups
- Use data linkage to improve ethnicity data quality
Data analysis
- Analysing ethnicity data should be a priority
- Weight survey data to correct for bias. You might include ethnicity as one of the weighting factors
- Use harmonised categories for analysing ethnicity data
- Use appropriate comparators in your analysis
- Find out whether the geographic clustering of some ethnic groups has produced counterintuitive results
- Consider whether you measure differences between ethnic groups by analysis of raw data or after adjustment to take into account other socio-economic and demographic factors, or both
Data reporting
- Reporting of ethnicity data should be a priority
- Use GSS harmonised categories for reporting on ethnicity data
- Report potential biases to allow users to understand limitations in the ethnicity data, and how this impacts on the interpretation of the analysis
- Report measures of reliability so users can correctly understand and interpret the data
- Consider whether you report differences using raw data, or report them after adjustment to take into account other socio-economic and demographic factors, or both
- Be transparent in your reasons for using specific comparators
- Follow best practice when writing about ethnic groups - for example, the writing principles developed by RDU
5.2 Trustworthiness
Data collection, reporting and analysis
- Collect ethnicity data in a respectful way - it should support public interest
- Understand what data can be legally collected about ethnicity, and comply with relevant legislation
- Build capability
- Protect the privacy and identity of individuals in your data at all times
- Regularly review your ethnicity data to ensure that it remains relevant
5.3 Value
Data collection, reporting and analysis
- Your ethnicity statistics should meet their intended uses and inform public debate
- Enhance ethnicity statistics to meet new or evolving user needs
- Report new ethnicity datasets to the ONS Equalities Data Audit
- Make decisions about whether to continue, discontinue or adapt ethnicity data and statistics in discussion with users
- Your decision-making processes should be transparent and open