4. Tax gaps: Income Tax, National Insurance contributions and Capital Gains Tax

Question 1

Introduction

Accepted Answer

This chapter explains parts of the Income Tax, National Insurance contributions (NICs), and Capital Gains Tax gaps that are estimated using data from random enquiry programmes (REP), established bottom-up statistical methodologies and illustrative methodologies.

For Self Assessment, the REP results are combined with estimates for large partnerships (those with 5 or more partners) to produce an overall Self Assessment tax gap. This overall figure also includes wealthy individuals in Self Assessment.

For PAYE, the overall tax gap is made up of 3 components:

small businesses (estimated using REP data)
mid-sized businesses (estimated using risk-based methods)
large businesses (estimated using an illustrative approach)

Estimates for the hidden economy and marketed avoidance schemes (mainly used by individuals) are added to the overall Self Assessment and PAYE gaps. Together, these produce the total Income Tax, NICs, and Capital Gains Tax gap estimate.

Question 2

Self Assessment

Accepted Answer

Overview

The Self Assessment tax gap model estimates the tax gap arising from incorrect returns filed by individuals and small partnerships (covering Self Assessment business and Self Assessment non-business taxpayers). It is based on evidence from the REP, which provides a representative view of non-compliance from a selection of returns subjected to full compliance checks.

The Self Assessment population includes all individuals and small partnerships required to file a Self Assessment tax return.

Tax gap calculation (step-by-step)

Measure under-declared tax liabilities in a random sample.
Calculate the average amount of under-declared tax in the random sample.
Scale the sample to the population.
Apply the non-detection multiplier to account for unidentified tax due.
Add non-payment (tax liability that will never be paid).
Deduct compliance yield.
Project latest 2 years in line with growth in tax liabilities.
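
The steps above can be sketched in a few lines of code. This is a minimal illustration rather than HMRC's model: the function name, parameter names and all figures are hypothetical, and the projection step for the latest 2 years is left out.

```python
def net_tax_gap(avg_under_declared, population, multiplier,
                non_payment, compliance_yield):
    """Follow the listed steps for one taxpayer group (hypothetical inputs)."""
    scaled = avg_under_declared * population   # scale the sample average to the population
    gross = scaled * multiplier                # allow for non-compliance that was not detected
    return gross + non_payment - compliance_yield  # add non-payment, deduct compliance yield


# Made-up figures in pounds, not HMRC data.
example = net_tax_gap(500.0, 1_000_000, 1.26, 50_000_000.0, 80_000_000.0)
```

In the model itself the averaging and scaling are carried out stratum by stratum; the single average here is a simplification.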

Methodology

Data and Sampling

This section describes the data sources used for the Self Assessment tax gap model and how the sample is selected, structured, and prepared for analysis.

Data sources

The Self Assessment tax gap model is based on data from the REP. Each selected return undergoes a full compliance check, during which caseworkers review the records from individuals and small partnerships in detail and identify any under-declared liabilities.

Population counts used to scale up the REP to the full population are drawn from HMRC administrative systems and reflect the total number of individuals or small partnerships required to file a Self Assessment tax return.

Sampling approach

The REP uses a stratified random sample of Self Assessment taxpayers issued with a notice to file a return. Large partnerships (5 or more partners) are excluded from the REP and are estimated separately.

Sample sizes

Sample sizes for the Self Assessment REP vary by year, reflecting changes to sampling design and operational capacity. Annual sample size figures provide important context for the robustness of the estimates, as larger samples generally support more stable results. A table of Self Assessment REP sample sizes is published each year and shows how the number of cases selected has changed over time.

Table A4.1: Sample sizes for the Self Assessment REP

Tax return year	Sample size
2005 to 2006	5,234
2006 to 2007	2,925
2007 to 2008	2,864
2008 to 2009	2,708
2009 to 2010	2,116
2010 to 2011	2,029
2011 to 2012	2,238
2012 to 2013	2,190
2013 to 2014	2,042
2014 to 2015	1,770
2015 to 2016	2,118
2016 to 2017	2,258
2017 to 2018	1,809
2018 to 2019	2,532
2019 to 2020	1,898
2020 to 2021	2,041
2021 to 2022	1,958
2022 to 2023	1,008

Note for Table A4.1: 

Sample size figures from 2010 to 2011 onwards have been adjusted because some cases were reclassified as being within the population of interest.

Deselections

Some sampled cases cannot be worked, for example if the taxpayer is bankrupt, insolvent or otherwise outside the scope of the population. These cases are treated as deselections.

To avoid biasing the sample, we include cases that are deselected from the sample but are still within the population of interest. If the individual has undergone a recent compliance check, we substitute the outcome of that earlier check into the case. If no previous compliance check exists, we assign a value based on the average yield and the probability of being non-compliant in the taxpayer’s stratum, using the results of compliance checks into taxpayers in the corresponding stratum.
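
The treatment of in-scope deselections can be illustrated as follows. This sketch assumes the assigned value is the product of the stratum's non-compliance probability and the average yield among its non-compliant cases; the text says only that the value is "based on" these two quantities, so the exact formula is an assumption, as are the function and parameter names.

```python
def impute_deselected(previous_outcome, stratum_yields):
    """Return a yield value for a deselected case that is still in scope.

    previous_outcome: yield from a recent compliance check, or None.
    stratum_yields: yields from worked cases in the same stratum (0 = compliant).
    """
    if previous_outcome is not None:
        return previous_outcome              # substitute the earlier check's outcome
    non_compliant = [y for y in stratum_yields if y > 0]
    if not non_compliant:
        return 0.0                           # no non-compliance observed in the stratum
    prob = len(non_compliant) / len(stratum_yields)
    avg_yield = sum(non_compliant) / len(non_compliant)
    return prob * avg_yield                  # assumed form: expected yield for the stratum
```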

Outliers

Outliers are individual cases with a large yield that is very different from the yield of the other cases in the sample. Due to the nature of our samples, our estimates are particularly sensitive to extreme values. To ensure that this small number of cases does not have an undue influence on the tax gap calculation, their yield values are capped. This allows us to use all valid information while smoothing the year-on-year variability.

Yield data is modelled using a representative statistical distribution. The final value used for each tax year is calculated as a 3 year moving average of the 99.85th percentile from this distribution, calculated based only on the results of years where the sample was stratified. For years before stratification, and years where a full 3 years of stratified results are not available, a value based on the last 3 complete stratified years is used.

This approach allows all valid cases to be included while ensuring that extreme values do not disproportionately influence the final estimate.
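
A rough sketch of the capping procedure is given below. The text does not name the statistical distribution, so a lognormal fit is used here purely as an assumption; the function names are hypothetical and the handling of pre-stratification years is omitted.

```python
import math
from statistics import NormalDist, mean, stdev


def percentile_9985(yields):
    """99.85th percentile of a lognormal fitted to the positive yields (assumed distribution)."""
    logs = [math.log(y) for y in yields if y > 0]
    fitted = NormalDist(mean(logs), stdev(logs))
    return math.exp(fitted.inv_cdf(0.9985))


def cap_for_year(percentiles_by_year):
    """3 year moving average: mean of the latest three yearly percentiles (stratified years only)."""
    return mean(percentiles_by_year[-3:])


def apply_cap(yields, cap):
    """Keep every case, but limit each yield to the cap."""
    return [min(y, cap) for y in yields]
```

Capping rather than removing the extreme cases keeps them in the sample, in line with the stated aim of using all valid information.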

Estimating under-declared liabilities

This section explains how the results from the REP are transformed into an estimate of the total amount of under‑declared Self Assessment tax for individuals and small partnerships.

REP case outcomes

Calculating average under-declared liabilities

For each stratum, the model calculates the average amount of under‑declared liability identified in the completed REP cases. This produces a set of stratum‑specific averages that reflect the differing compliance behaviour of individuals with different income or turnover.

Smoothing and stability

A 2 year rolling average has been introduced in ‘Measuring tax gaps 2026 edition’ for the 2022 to 2023 tax year, to account for the reduced sample size in that year. We apply rolling averages to the business and non-business average under-declared liabilities separately.

Scaling to the population

The average under‑declared liability for each stratum is multiplied by the number of individuals in that stratum. Adding these results across all strata yields an estimate of the total under‑declared liability for the entire Self Assessment population.
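
The scaling step amounts to a weighted sum across strata. The sketch below uses hypothetical stratum names and made-up figures, not HMRC data.

```python
def total_under_declared(strata):
    """Sum of (stratum average x stratum population count) across all strata."""
    return sum(avg * count for avg, count in strata.values())


# name -> (average under-declared liability in pounds, number of taxpayers); illustrative only
strata = {
    "low": (200.0, 500_000),
    "middle": (650.0, 200_000),
    "high": (3_000.0, 20_000),
}
total = total_under_declared(strata)
```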

This section describes the adjustments applied to the estimated under‑declared liabilities to ensure the Self Assessment tax gap reflects the full extent of non‑compliance and produces a consistent time series.

Alignment of earlier years

The 2009 to 2010 tax year is the last year that uses a simple random sample; samples for subsequent years have been stratified to improve the accuracy of the results. Samples drawn from Self Assessment business taxpayers are stratified by turnover from 2010 to 2011 onwards, and samples drawn from Self Assessment non-business taxpayers are stratified by level of income from 2011 to 2012 onwards. Our stratification was updated from 2017 to 2018 onwards to also consider whether a taxpayer has been identified as wealthy (individuals with income greater than £200,000 per annum or assets over £2 million).

From 2015 to 2016 we used an optimal allocation method to increase the accuracy of our estimates. When sampling, we consider the variability of the tax at risk across the strata in the population. We select a greater proportion of cases in strata where the variance of tax at risk values is known to be high.
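
One common form of optimal allocation is Neyman allocation, in which each stratum's share of the sample is proportional to its population count multiplied by the standard deviation of the variable of interest. The sketch below assumes this form; the text does not state which optimal allocation formula is used, and all names and figures are hypothetical.

```python
def neyman_allocation(total_sample, strata):
    """Allocate a sample across strata in proportion to N_h * S_h.

    strata maps name -> (population count, standard deviation of tax at risk).
    Rounding means the allocations may not sum exactly to total_sample.
    """
    weights = {name: n * sd for name, (n, sd) in strata.items()}
    total_weight = sum(weights.values())
    return {name: round(total_sample * w / total_weight)
            for name, w in weights.items()}


# Illustrative figures: the smaller but more variable stratum receives most of the sample.
allocation = neyman_allocation(2000, {"low": (8000, 100.0), "high": (2000, 1600.0)})
```

With these figures the sampling fraction is 5% in the low-variance stratum and 80% in the high-variance stratum, which matches the behaviour described above.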

Following a review by experienced caseworkers of the 2018 to 2019 and 2019 to 2020 REPs, we established a compliance yield uplift and applied it to the data.

Non-detection

We make an adjustment for one source of systematic uncertainty, which is non-detection of non-compliance. The REPs will not identify all incorrect returns or the full scale of under-declaration of liabilities, so estimates produced from the unadjusted results of the programmes would underestimate the full extent of the tax gap. We apply non-detection multipliers to the results of the REPs to account for this. 

The non-detection multiplier has been derived using expert judgement. More information about our plans to carry out a programme of development to introduce new non-detection multipliers in future editions of ‘Measuring tax gaps’ can be found in HMRC’s working paper ‘Non-detection multipliers for measuring tax gaps’. 

The multipliers are kept consistent year-on-year. The size of the multipliers varies depending on the complexities of each tax regime and the type of non-compliance found. 

Table A4.2: Non-detection multipliers for Self Assessment tax gap

	Multiplier for central estimate	Multiplier for lower estimate	Multiplier for upper estimate
Self Assessment (business)	1.908	1.000	3.075
Self Assessment (non-business)	1.260	1.000	1.928
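
The multipliers in Table A4.2 can be applied as in the sketch below. The multiplier values come from the table; the function name and the input figure are hypothetical, and the sketch does not include any sampling confidence interval.

```python
# Values taken from Table A4.2.
MULTIPLIERS = {
    "business": {"central": 1.908, "lower": 1.000, "upper": 3.075},
    "non_business": {"central": 1.260, "lower": 1.000, "upper": 1.928},
}


def adjust_for_non_detection(detected, group):
    """Scale the under-declared liabilities found in the REP by each multiplier."""
    return {name: detected * m for name, m in MULTIPLIERS[group].items()}


# Made-up detected amount of 1,000 (any currency unit).
business_range = adjust_for_non_detection(1000.0, "business")
```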

Non-payment

In ‘Measuring tax gaps 2025 edition’ we improved the methodology for the estimate of non-payment for Income Tax, NICs, and Capital Gains Tax in PAYE, for tax years since 2018 to 2019. The new methodology is an estimate of eventual non-payment attributable to the year of tax debt creation. These methodological improvements do not extend back beyond 2018 to 2019.

Prior to 2018 to 2019 non-payment refers to tax debts that are written off or remitted in a tax year by HMRC and result in a permanent loss of tax. 

As separate figures of non-payment are not available for just the taxpayers within the scope of the REPs, the amounts are split in proportion to the tax gap resulting from the relevant section of the populations.
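
The proportional split described above can be sketched as follows. The group names and figures are hypothetical.

```python
def split_non_payment(total_non_payment, gap_by_group):
    """Share total non-payment across groups in proportion to each group's tax gap."""
    total_gap = sum(gap_by_group.values())
    return {group: total_non_payment * gap / total_gap
            for group, gap in gap_by_group.items()}


# Made-up figures: a group with three quarters of the gap receives three quarters of non-payment.
shares = split_non_payment(100.0, {"in_rep_scope": 300.0, "other": 100.0})
```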

Compliance yield

The REPs provide an estimate of the tax gap due to incorrect returns. However, HMRC carries out a wider programme of compliance activity to identify and correct erroneous returns. To calculate the net tax gap, it is necessary to subtract the yield from this activity. 

The figures for yield are taken from HMRC’s systems for recording the outcomes of compliance checks and relate to cases settled during each year rather than compliance checks into returns relating to a specific tax year. Yield by year of settlement is used as a proxy due to the extended timeline for completing all the compliance activity related to the liabilities for a year.

Other estimates

Large partnerships in Self Assessment 

An illustrative estimate has been produced for Self Assessment large partnerships by assuming that the tax at risk will represent a similar proportion of liabilities to all other Self Assessment taxpayers, as shown by the results of the Self Assessment REP.
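
In arithmetic terms, the illustrative approach applies the ratio of tax at risk to liabilities seen in the REP to the liabilities of large partnerships. The sketch below is a simplified reading with hypothetical names and figures.

```python
def illustrative_large_partnership_gap(rep_tax_at_risk, rep_liabilities,
                                       large_partnership_liabilities):
    """Apply the REP ratio of tax at risk to liabilities to large partnership liabilities."""
    rate = rep_tax_at_risk / rep_liabilities
    return rate * large_partnership_liabilities


# Made-up figures: a 5% ratio applied to liabilities of 200 gives a gap of about 10.
lp_gap = illustrative_large_partnership_gap(50.0, 1000.0, 200.0)
```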

An adjustment is made to project a component of the Self Assessment large partnerships estimate for 2019 to 2020.   

Wealthy taxpayers

To calculate the tax gap for wealthy individuals in Self Assessment, we identify wealthy taxpayers in the risk-based data and the Self Assessment REP data. We then apply the same methodology used to estimate the Self Assessment tax gap to the wealthy population to get the wealthy portion of the Self Assessment business and Self Assessment non-business tax gaps. We also calculate a separate wealthy tax gap estimate using the risk-based data, capturing the riskiest wealthy taxpayers not included in the REP data.

The large partnerships tax gap is not measured directly and as such we cannot use data matching to find the wealthy portion of the large partnerships tax gap. Instead, we find the percentage of liabilities from large partnerships which comes from wealthy taxpayers and apply that percentage to the Self Assessment large partnership tax gap. The net tax gap is calculated by deducting compliance yield from the gross tax gap and adding non-payment.
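
The apportionment for wealthy taxpayers in large partnerships, and the net calculation, can be sketched as follows. Function names and figures are hypothetical.

```python
def wealthy_share_of_large_partnership_gap(lp_gross_gap, wealthy_liabilities,
                                           total_liabilities):
    """Apply the wealthy share of large partnership liabilities to the large partnership gap."""
    share = wealthy_liabilities / total_liabilities
    return lp_gross_gap * share


def net_gap(gross_gap, compliance_yield, non_payment):
    """Net tax gap: gross gap less compliance yield plus non-payment."""
    return gross_gap - compliance_yield + non_payment
```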

Projections for years with incomplete data

Due to the timing of returns and the duration of compliance checks, full REP data is not available for the most recent tax years. For the years after 2022 to 2023 the model applies projection factors based on the most recent complete information. The projections assume a stable gross tax gap and incorporate up‑to‑date information on liabilities, non‑payment and compliance yield. These projections are replaced with actual data in subsequent editions once sufficient REP cases have been settled.
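
One way to read the projection is that the gross tax gap is held constant as a share of liabilities, so it grows in line with liabilities. The sketch below assumes that reading; the precise projection factors are not given in the text, and all names and figures are hypothetical.

```python
def project_gross_gap(last_gross_gap, last_liabilities, projected_liabilities):
    """Hold the gross gap constant as a share of liabilities (assumed reading of 'stable')."""
    return last_gross_gap * projected_liabilities / last_liabilities


def projected_net_gap(last_gross_gap, last_liabilities, projected_liabilities,
                      non_payment, compliance_yield):
    """Combine the projected gross gap with up-to-date non-payment and compliance yield."""
    gross = project_gross_gap(last_gross_gap, last_liabilities, projected_liabilities)
    return gross + non_payment - compliance_yield
```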

Data features

The most recent Self Assessment REP random sample is for the 2022 to 2023 tax year. From 2014 to 2015, approximately half of the sample was worked as a desk-based compliance check rather than through the standard face-to-face approach. A fully desk-based approach was implemented in the 2016 to 2017 tax year. An internal evaluation compared desk-based and face-to-face working of REP cases and found no statistically significant evidence that the approach affected the outcome of the compliance check.

In ‘Measuring tax gaps 2022 edition’ we postponed using the Self Assessment REP data for 2018 to 2019 due to increased uncertainties in the REP arising from different ways of working during the pandemic. Following a review by experienced caseworkers of the 2018 to 2019 and 2019 to 2020 REPs, we have established a compliance yield uplift and applied it to the data.

Timing

There are 2 timing factors that affect when the Self Assessment tax gap estimates can be produced. The first arises from delays inherent in the returns process. Individuals have until the 31 January following the end of the tax year to submit their return. HMRC then has a further year in which to open a compliance check.

The second relates to the compliance checks themselves. Random enquiries can be complex and may take several years to complete. As a result, full REP results for the most recent tax years are not yet available at the point of estimation.

To produce timely estimates despite these delays, forecasts are used for open or unsettled cases. Where possible, caseworker forecasts are used to estimate the likely outcome of ongoing compliance checks. Where these are not available, forecast values are generated based on the outcomes of similar settled cases. These forecasts allow the model to produce a best estimate for each tax year while recognising that results will be revised once more complete information becomes available.
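
The order of preference for valuing a case can be sketched as follows. The dictionary keys are hypothetical, and the use of a simple mean of similar settled cases is an assumption; the text does not say how "similar" is defined or how the forecast value is generated.

```python
from statistics import mean


def case_value(case, similar_settled_yields):
    """Value a REP case: settled outcome, else caseworker forecast, else a modelled forecast."""
    if case.get("settled_yield") is not None:
        return case["settled_yield"]            # case is closed: use the actual outcome
    if case.get("caseworker_forecast") is not None:
        return case["caseworker_forecast"]      # open case with a caseworker forecast
    return mean(similar_settled_yields)         # assumed: average of similar settled cases
```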

Validation

As part of each year’s programme, HMRC conducts a validation exercise for a sample of cases. These cases are checked to confirm that the compliance check outcomes (for example, the amount of yield) have been recorded accurately. Any inaccuracies are corrected prior to calculation of the tax gap for that year.

Further, internal HMRC Quality Assurance (QA) processes are in place to ensure the quality of the estimates. As part of this, an independent analytical assurer reviews the model and estimates and completes QA documentation. Any issues found are communicated and rectified prior to publication.

Data Issues and Limitations

The Self Assessment tax gap model relies on REP case data and population counts from HMRC administrative systems. While these sources provide a strong foundation, several inherent limitations affect the precision of the tax gap estimates.

Compliance checks cannot identify all incorrect returns or the full scale of non‑compliance. As a result, the raw REP findings under‑report the true tax gap. Although non-detection is addressed through a multiplier elsewhere in the methodology, the underlying limitation stems from the inherent constraints of compliance checks themselves.

Finally, the model depends on administrative population data that may be revised over time. Changes in customer group definitions, alignment exercises, or reclassification of cases can lead to updates to historical data. These revisions ensure consistency but highlight that population measures are not entirely static.

Sources of Error

There are 2 main sources of error associated with the results of REPs which could result in the true values of the tax gaps differing from the estimates produced. These are: 

sampling variation in the data: the whole population is not subject to a compliance check, so even though the sample is designed to be representative, its characteristics may differ from the population purely by chance 
systematic uncertainty where the sample results consistently tend to under-report the true values for the population, or where the sample does not include subgroups of the full population, for example those participating in avoidance

Uncertainty rating

Self Assessment business and non-business (individuals and small partnerships) 

The uncertainty rating for the Self Assessment business and non-business tax gap estimates is ‘medium’. The Self Assessment tax gap model captures the majority of the tax base and its population, uses a robust stratified REP methodology and is supported by reliable case data. Areas of uncertainty include the projections required to account for years where REP data is not available, forecasting yield for ongoing compliance checks, and the treatment of outliers.

These factors contribute to variation in both the measured under‑declared liabilities and the adjustments applied for undetected non‑compliance. The rating remains unchanged from the previous edition.

Self Assessment large partnerships 

The uncertainty rating for the Self Assessment large partnerships tax gap estimate is ‘very high’. This rating is given because the model may not capture the appropriate tax base and population, the illustrative methodology is heavily assumption-based, and the data may not be representative. Key areas of uncertainty include the omission of non-compliance specific to large partnerships and the assumption that large partnership taxpayers behave in the same way as the Self Assessment business and non-business taxpayers. This is unchanged from the previous edition.

Wealthy taxpayers

The uncertainty rating for the wealthy individuals in Self Assessment tax gap estimate is ‘high’. This rating is given because the model captures most of the tax base, uses a robust stratified REP methodology, and is supported by reliable case data. However, uncertainty remains due to the need to project results for years where REP data is not yet available, the use of forecasts for ongoing compliance checks, and the fact that not all non‑compliance can be identified through compliance checks alone. Additional uncertainty arises from the omission of non-compliance specific to wealthy large partnerships and the assumption that wealthy large partnership taxpayers behave in the same way as the wealthy individuals in Self Assessment.

Question 3

PAYE for small businesses

Accepted Answer

Overview

The PAYE REP allows us to estimate the tax gap arising from PAYE non-compliance of small businesses. The employer may be an individual, partnership, public body, charity, or a company and will be required to make returns under the PAYE regulations to account for Income Tax and NICs. 

The figures relate to Income Tax, NICs, and Student Loan Repayments collected through PAYE due on earnings and other income from employment. The scope of these figures also includes tax due on occupational pensions taxed through PAYE. 

Tax gap calculation (step-by-step)

Measure under‑declared tax in a random sample.
Calculate the average amount of under-declared tax in the random sample.
Apply a rolling average to the average amount of under-declared tax in 2024 to 2025.
Scale the sample to the whole population.
Apply the non-detection multiplier.
Add non‑payment (tax liability that will never be paid).
Subtract compliance yield.

Methodology

Data and Sampling

This section describes the data sources used for the PAYE small businesses model and how the sample is selected, structured, and prepared for analysis.

Data sources

The PAYE for small businesses model is based on data from the REP. Each selected return undergoes a full compliance check, during which caseworkers review the business’s records in detail and identify any under‑declared liabilities.

Population counts used to scale up the REP to the full population are drawn from HMRC administrative systems and reflect the total number of live employers. Cases that are dormant, dissolved, or otherwise outside the scope of the small business population are excluded.

Sampling approach

The REP for small businesses uses a stratified random sample of PAYE returns. Since April 2013, stratification has been based on trade class and number of employees. This allows the sample results to be weighted to reflect the structure of the underlying population and improves the accuracy of the tax gap estimate.

Before 2016 to 2017, samples were selected using the former SME customer group definition. From 2016 to 2017 onwards, the sample has been drawn solely from the small business customer group. Earlier years have been aligned to ensure consistency across the time series.

Sample sizes

Sample sizes for the PAYE small businesses REP vary by year, reflecting changes to customer group definitions, sampling design, and operational capacity. Annual sample size figures provide important context for the robustness of the estimates, as larger samples generally support more stable results.

For the 2024 to 2025 tax year, the PAYE small businesses sample size was halved. Internal analysis shows that this still represents a robust sample size. A table of PAYE small businesses REP sample sizes is published each year and shows how the number of cases selected has changed over time.

Table A4.3: Sample sizes for small businesses PAYE REP

PAYE small businesses: accounting period ending in year	Sample size
2005 to 2006	1,285
2006 to 2007	1,184
2007 to 2008	1,077
2008 to 2009	1,174
2009 to 2010	1,180
2010 to 2011	496
2011 to 2012	673
2012 to 2013	692
2013 to 2014	761
2014 to 2015	780
2015 to 2016	855
2016 to 2017	795
2017 to 2018	808
2018 to 2019	802
2019 to 2020	759
2020 to 2021	700
2021 to 2022	519
2022 to 2023	674
2023 to 2024	667
2024 to 2025	304

Note for Table A4.3: 

Since the tax year 2016 to 2017 the PAYE sample size given is for small businesses only. Before 2016 to 2017 HMRC’s former small and medium-sized enterprises (SME) customer group classification was used.

Deselections

Cases in the REP are not worked for several reasons, and this is done in a non-random way. This means that the cases which are not worked are likely to be systematically different from the cases that are worked. Cases which are not worked are called deselections or rejections, depending on the stage of the production process at which the decision not to work the case was taken.

To avoid biasing the sample, cases that are deselected from the sample but are still within the population of interest are included. If the individual or business has undergone a recent compliance check, the outcome of that earlier check is substituted into the case. If no previous compliance check exists, a value based on the average yield and the probability of being non-compliant in the taxpayer’s stratum is assigned.

Outliers

Outliers are individual cases with a significantly larger yield than other cases in the sample. Due to the nature of the REP samples, estimates are particularly sensitive to outliers. To ensure that this small number of cases does not have an undue influence on the tax gap calculation, their yield values are capped. This allows us to use all valid information while smoothing the year-on-year variability.

Yield data is modelled using a representative statistical distribution. The final value used for each tax year is calculated as a 3 year moving average of the 99.85th percentile from this distribution, calculated based only on the results of years where the sample was stratified. For years before stratification, and years where a full 3 years of stratified results are not available, a value based on the last 3 complete stratified years is used. 

Estimating under-declared liabilities

This section explains how the results from the REP are transformed into an estimate of the total amount of under‑declared PAYE for small businesses.

REP case outcomes

Each REP case provides a measured outcome for that business: either no adjustment or an identified amount of under‑declared Income Tax, NICs, and student loan repayments in PAYE. These outcomes represent the observed levels of non‑compliance within the stratified sample.

Calculating average under-declared liabilities

For each stratum, the model calculates the average amount of under‑declared liability identified in the completed REP cases. This produces a set of stratum‑specific averages that reflect the differing compliance behaviour of small employers of differing trade classes and numbers of employees.

Smoothing and stability

A 2 year rolling average has been introduced in ‘Measuring tax gaps 2026 edition’ for the 2024 to 2025 tax year, to account for the reduced sample size in that year. The method takes the rate and value of non-compliance for each of the last 2 years and averages them, giving double weight to the current year. The result is applied to the 2024 to 2025 tax year only; prior years do not have this treatment applied.
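
One possible reading of the double-weighted average is sketched below: the previous year counts once and the current year counts twice. This interpretation, the function name and the figures are assumptions; the same calculation would be applied separately to the rate and to the value of non-compliance.

```python
def weighted_two_year_average(previous_year, current_year):
    """Average of two years with double weight on the current year (assumed weights 1 and 2)."""
    return (previous_year + 2 * current_year) / 3


# Made-up average under-declared amounts for the previous and current year.
smoothed = weighted_two_year_average(300.0, 600.0)
```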

Scaling to the population

The average under‑declared liability for each stratum is weighted by the number of live small businesses in that stratum. Summing these results across all strata yields an estimate of the total under‑declared liability for the entire small businesses PAYE population.

This section describes the adjustments applied to the estimated under‑declared liabilities to ensure the PAYE for small businesses tax gap reflects the full extent of non‑compliance and produces a consistent time series.

Alignment of earlier years

For tax years prior to 2016 to 2017, REP data was collected under the former SME customer group rather than the current small businesses customer group. To ensure consistency across the time series, an adjustment is applied to align earlier years with the modern population definition. This involves applying a conversion factor derived from historical data where both definitions overlap, allowing estimates based on SME samples to be made comparable with the small businesses population used in later years.

For later years, small businesses estimates are calculated by flagging and removing mid-sized employers from the random sample to create a small business-only sample. However, for historical years the conversion factor is still applied because the source data does not contain the information required for the identification of mid-sized employers, therefore they cannot be removed. 
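
The alignment of earlier years can be illustrated with a simple ratio. The sketch assumes the conversion factor is the ratio of the small businesses estimate to the SME-based estimate over the years where both definitions are available; the text does not give the formula, and the names and figures are hypothetical.

```python
def conversion_factor(overlap_years):
    """overlap_years: list of (sme_based_estimate, small_business_estimate) pairs."""
    sme_total = sum(sme for sme, _ in overlap_years)
    small_total = sum(small for _, small in overlap_years)
    return small_total / sme_total


def align_historical_estimate(sme_based_estimate, factor):
    """Convert an SME-based estimate to the small businesses definition."""
    return sme_based_estimate * factor


# Made-up overlap data giving a factor of about 0.8.
factor = conversion_factor([(100.0, 80.0), (200.0, 160.0)])
```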

Non-detection

An adjustment is made for non-detection of non-compliance, which is a source of systematic uncertainty in REPs. The REPs will not identify all incorrect returns or the full scale of under-declaration of liabilities, so estimates produced from the unadjusted results of the programmes would underestimate the full extent of the tax gap. Non-detection multipliers are applied to the results of the REPs to account for this. 

The non-detection multiplier has been derived using the ‘Delphi’ approach, where caseworkers are asked to estimate the amount of non-compliance that is not accounted for in the audit. More information about the Delphi approach and our plans to carry out a programme of development to introduce new non-detection multipliers in future editions of ‘Measuring tax gaps’ can be found in HMRC’s working paper ‘Non-detection multipliers for measuring tax gaps’.

The multipliers are kept consistent year-on-year.

We use different multipliers to estimate lower and upper bounds to show the full range of uncertainty in our estimates. This range is based on the 95% confidence intervals of the estimate which are then adjusted for non-detection. A multiplier of 1 is used for the lower bound estimate, assuming that all non-compliance has been detected in the REP.

Table A4.4: Non-detection multipliers for PAYE small businesses

Multiplier for central estimate	Multiplier for lower estimate	Multiplier for upper estimate
1.260	1.000	1.520
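
The construction of the range can be sketched as follows: the bounds of the 95% confidence interval are multiplied by the lower and upper multipliers from Table A4.4, and the point estimate by the central multiplier. The function name and input figures are hypothetical.

```python
# Multipliers from Table A4.4.
CENTRAL, LOWER, UPPER = 1.260, 1.000, 1.520


def paye_range(estimate, ci_lower, ci_upper):
    """Adjust the point estimate and its 95% confidence bounds for non-detection."""
    return {
        "lower": ci_lower * LOWER,      # assumes all non-compliance was detected
        "central": estimate * CENTRAL,
        "upper": ci_upper * UPPER,
    }


# Made-up point estimate of 100 with a confidence interval of 80 to 120.
example_range = paye_range(100.0, 80.0, 120.0)
```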

Non-payment

Some tax liabilities will not ultimately be paid. To reflect this, the model includes an estimate of non‑payment attributable to each tax year.

In ‘Measuring tax gaps 2025 edition’ we improved the methodology for the estimate of non-payment for Income Tax, NICs, and student loan repayments in PAYE, for tax years since 2018 to 2019. The new methodology is an estimate of eventual non-payment attributable to the year of tax debt creation. These methodological improvements do not extend back beyond 2018 to 2019.

Prior to 2018 to 2019 non-payment refers to tax debts that are written off or remitted in a tax year by HMRC and result in a permanent loss of tax.

As separate figures of non-payment are not available for just the taxpayers within the scope of the REPs, the amounts are split in proportion to the tax gap resulting from the relevant section of the populations.

Compliance yield

The REPs provide an estimate of the tax gap due to incorrect returns. However, HMRC carries out a wider programme of compliance activity to identify and correct erroneous returns. To calculate the net tax gap, it is necessary to subtract the yield from this activity. 

The figures for yield are taken from HMRC’s systems for recording the outcomes of compliance checks and relate to cases settled during each year rather than compliance checks into returns relating to a specific tax year. Yield by year of settlement is used as a proxy due to the extended timeline for completing all the compliance activity related to the liabilities for a year.

Compliance yield figures can be found in Table 4.10 of ‘Measuring tax gaps 2026 edition’.

Projections for years with incomplete data

The PAYE REP has no lag and so no projections are required.

Data features

The latest observed PAYE random sample is for 2024 to 2025. From 2015 to 2016, approximately half of the sample was worked as a desk-based compliance check rather than through a face-to-face approach. A fully desk-based approach was implemented in the year 2018 to 2019. An evaluation compared desk-based and face-to-face working of cases and found no statistically significant evidence that the approach affected the outcome of the compliance check.

Timing

There are 2 factors which influence the timing of the latest available tax gap estimate for a particular type of tax return: 

delays inherent in the returns process
delays due to the complexity of some random compliance checks; it can take several years before sufficient random compliance checks relating to a particular tax year are settled to robustly report the results

At the time of estimation, some compliance checks from earlier years’ REPs will still be ongoing. To estimate tax gaps for each year, it is necessary to make assumptions about the cases that are yet to be settled at the date the compliance check results are analysed. Where possible, caseworker forecasts are used. Where caseworker forecasts are not possible, we forecast for such compliance checks based on the results of settled compliance checks with similar durations.

Finally, estimates for earlier years are revised from those previously published. This is because actual data replaces earlier projections, and long-running cases settle for amounts that differ from those previously forecast.

Validation

As part of each year’s programme, HMRC conducts a validation exercise for a sample of cases. These cases are checked to confirm that the compliance check outcomes (for example, the amount of yield) have been recorded accurately. Any inaccuracies are corrected prior to calculation of the tax gap for that year.

Further, internal HMRC Quality Assurance (QA) processes are in place to ensure quality estimates. As part of this, an independent analytical assurer reviews the model and estimates and completes QA documentation. Any issues found are communicated and rectified prior to publication.

Data Issues and Limitations

The PAYE for small businesses model relies on REP case data and population counts from HMRC administrative systems. While these sources provide a strong foundation, several inherent limitations affect the precision of the tax gap estimates.

Compliance checks cannot identify all incorrect returns or the full scale of non‑compliance. As a result, the raw REP findings under‑report the true tax gap. Although non-detection is addressed through a multiplier elsewhere in the methodology, the underlying limitation stems from the inherent constraints of compliance checks themselves.

Finally, the model depends on administrative population data that may be revised over time. Changes in customer group definitions, alignment exercises, or reclassification of cases can lead to updates to historical data. These revisions ensure consistency but highlight that population measures are not entirely static.

Sources of Error

There are 2 main sources of error associated with the results of REPs which could result in the true values of the tax gaps differing from the estimates produced. These are: 

sampling variation in the data: the whole population is not subject to a compliance check, so even though the sample is designed to be representative, its characteristics may differ from the population purely by chance 
systematic uncertainty where the sample results consistently tend to under-report the true values for the population, or where the sample does not include subgroups of the full population, for example those participating in avoidance

Uncertainty rating

The uncertainty rating for the PAYE small businesses tax gap estimate is ‘medium’. This means that the model captures the majority of the tax base and its population, the stratified REP methodology is robust, and the REP data is reliable and suitable for purpose. Areas of uncertainty include the need to forecast the expected compliance yield for compliance checks that are still ongoing. Additional uncertainty arises from non-compliance that is missed or not fully investigated in a compliance check. This uncertainty is mitigated by the inclusion of a non-detection multiplier. The rating is unchanged from ‘Measuring tax gaps 2025 edition’.

Question 4

PAYE for mid-sized businesses

Accepted Answer

Overview

The PAYE for mid-sized businesses model estimates the tax gap using risk-based enquiries, focusing on the riskiest businesses rather than a random sample of businesses. To prevent a few very large under-declared tax cases from distorting the overall results, the model applies an extreme value methodology, ensuring a more typical picture of non-compliance.

Risk-based enquiry data is available from 2014 to 2015 onward; earlier years use previous figures from ‘Measuring tax gaps 2020 edition’. As some risk-based enquiry cases remain open in recent years, projections are used to provide the best current estimates. A non-detection multiplier, based on expert judgement, accounts for undetected non-compliance.

Step-by-step tax gap calculation

Estimate under‑declared tax in observed risk-based enquiries.
Estimate the extreme value lower bound.
Estimate the upper bound.
Take the average of the lower bound and upper bound.
Apply the non‑detection multiplier.
Add non‑payment (tax liability that will never be paid).
Subtract compliance yield.
Project recent years in line with changes in tax liabilities.
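
The arithmetic in steps 4 to 7 above can be sketched as follows. This is a minimal illustration only: the function name, argument names and input figures are hypothetical, and the order of operations simply follows the listed steps rather than any published HMRC code. Step 8 (projection) is not included.

```python
def mid_sized_net_tax_gap(lower_bound, upper_bound, non_detection_multiplier,
                          non_payment, compliance_yield):
    """Combine the components in the order given in the step list (assumed)."""
    central = (lower_bound + upper_bound) / 2       # step 4: average of the bounds
    uplifted = central * non_detection_multiplier   # step 5: allow for undetected tax
    gross = uplifted + non_payment                  # step 6: add tax never paid
    return gross - compliance_yield                 # step 7: subtract compliance yield


# Hypothetical inputs in £ million, for illustration only.
example_net_gap = mid_sized_net_tax_gap(
    lower_bound=400.0,
    upper_bound=800.0,
    non_detection_multiplier=1.3,
    non_payment=50.0,
    compliance_yield=200.0,
)
print(round(example_net_gap, 1))
```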

Methodology

Data sources

The PAYE for mid-sized businesses model mainly uses operational data from HMRC’s risk-based enquiries. These enquiries provide detailed information on identified under-declarations of tax within the mid-sized businesses population.

To scale the findings from these enquiries up to the whole mid‑sized businesses population, the model uses HMRC’s administrative data on how many businesses are in this group and their overall PAYE tax liabilities. Operational data is available from 2014 to 2015 onward, so earlier years continue to use the figures published at the time.

Risk-based enquiry outcomes

HMRC carries out risk-based enquiries to investigate identified high-risk cases and determine whether the correct amount of PAYE has been declared. The outcome of each enquiry might show no issues, or it may identify additional tax that should have been declared. These outcomes provide direct evidence of non-compliance that is used in the model.

Some enquiries identify very large amounts of under‑declared tax, while most identify much smaller amounts. To prevent a small number of unusually large cases from distorting the overall results, the model uses a statistical approach that reduces their influence while keeping them in the dataset. This makes the final estimates more stable and reflective of typical patterns of non‑compliance across the mid‑sized businesses population.

Forecasting open cases

Many risk-based enquiry cases for mid-sized businesses remain open at the point of estimation, so the model forecasts the expected compliance yield for these cases to complete the dataset used to estimate the tax gap. For open cases, the model pairs each case with a similar closed case and uses the observed yield from the matched case as the forecast. These forecast values are replaced by actual outcomes in subsequent editions once the underlying cases have been settled.
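
The matching idea described above can be illustrated with a nearest-neighbour sketch. The source does not say which characteristics define a "similar" closed case, so the single `size_measure` feature used here is an assumption, as are the class name, function name and figures.

```python
from dataclasses import dataclass


@dataclass
class ClosedCase:
    size_measure: float    # hypothetical matching feature (assumption)
    observed_yield: float  # yield recorded when the case was settled


def forecast_open_case(open_size_measure, closed_cases):
    """Return the observed yield of the closed case closest on size_measure."""
    nearest = min(closed_cases,
                  key=lambda case: abs(case.size_measure - open_size_measure))
    return nearest.observed_yield


# Hypothetical closed cases, for illustration only.
closed = [
    ClosedCase(size_measure=100.0, observed_yield=5.0),
    ClosedCase(size_measure=500.0, observed_yield=40.0),
    ClosedCase(size_measure=2000.0, observed_yield=0.0),
]
forecast = forecast_open_case(450.0, closed)
print(forecast)
```

In practice the forecast value would be replaced by the actual outcome once the open case settles, as the text describes.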

Extreme value methodology (lower bound estimate)

The extreme value methodology is used to estimate under‑declared tax when most of the value is concentrated in a small number of cases.

Use risk‑based enquiry results: the method begins with the outcomes of risk‑based enquiry cases, which show how much tax has been under‑declared in each case.
Identify extreme‑value behaviour: the results typically show that a small number of cases account for most of the under‑declared tax.
Apply a threshold cut‑off: cases that do not follow this extreme‑value pattern are removed so that only those consistent with the expected distribution are included.
Fit a power‑law model: the remaining above‑threshold cases are fitted to a statistical power‑law model to estimate under‑declared tax among high‑yield cases.
Estimate cases without risk-based enquiries: the model then estimates how many similar above‑threshold cases may exist among businesses that were not subject to risk‑based enquiries.

For ‘Measuring tax gaps 2026 edition’, the extreme value method has been improved. The model now uses the observed yield for the above-threshold risk-based enquiry cases, where previously it used the under-declared tax estimated by the model for these cases.

The method does not assume the presence of additional high‑yield cases beyond those observed, so the resulting estimate is likely to underestimate the true level of under‑declared tax. For this reason, this method is used as a lower bound estimate, with further adjustments applied elsewhere in the methodology.
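
The threshold and power-law steps can be illustrated with a short sketch. The source does not state how the power law is fitted; the maximum-likelihood estimator for a continuous Pareto tail used below is one standard option and is an assumption here, as are the function name, the threshold and the yields.

```python
import math


def fit_pareto_tail(yields, threshold):
    """Fit a Pareto tail index to yields at or above the threshold.

    Uses the standard maximum-likelihood estimator
    alpha = n / sum(ln(x_i / threshold)) over the above-threshold cases.
    """
    tail = [y for y in yields if y >= threshold]
    log_sum = sum(math.log(y / threshold) for y in tail)
    if len(tail) < 2 or log_sum == 0:
        raise ValueError("not enough variation above the threshold to fit")
    alpha = len(tail) / log_sum
    return alpha, len(tail)


# Hypothetical enquiry yields, for illustration only.
yields = [1.0, 2.0, 5.0, 10.0, 20.0, 20.0, 40.0]
alpha, n_tail = fit_pareto_tail(yields, threshold=10.0)
print(n_tail, round(alpha, 3))
```

A smaller fitted tail index means a heavier tail, that is, more of the total value concentrated in the largest cases.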

Upper bound estimate

The upper bound estimate is likely to overestimate the true level of under‑declared tax.

Assume average behaviour matches risk-based enquiry cases: the model assumes that businesses not subject to risk‑based enquiries have the same average tax gap percentage as those that were subject to risk-based enquiries.
Apply this average rate to the population not subject to risk-based enquiries: this percentage is applied across all businesses that were not selected for a risk‑based enquiry.

This method produces a deliberately higher estimate. As risk-based enquiry cases were selected based on expected high levels of non‑compliance, applying their average rate to the whole population overstates the likely level of non‑compliance.
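
A minimal sketch of the upper bound calculation follows. It assumes that the "average tax gap percentage" is under-declared tax as a share of liabilities, and that the upper bound is the under-declared tax found in enquiries plus the scaled amount for businesses without enquiries. Both points, and all names and figures, are assumptions for illustration.

```python
def upper_bound_under_declared(enquiry_under_declared, enquiry_liabilities,
                               non_enquiry_liabilities):
    """Apply the enquiry cases' average rate to businesses without enquiries."""
    average_rate = enquiry_under_declared / enquiry_liabilities
    scaled = average_rate * non_enquiry_liabilities
    return enquiry_under_declared + scaled


# Hypothetical inputs in £ million, for illustration only.
upper = upper_bound_under_declared(
    enquiry_under_declared=50.0,
    enquiry_liabilities=1000.0,
    non_enquiry_liabilities=9000.0,
)
print(round(upper, 1))
```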

Central estimate

As the lower bound method is likely to produce an underestimate and the upper bound method is likely to produce an overestimate, the average of these 2 results is used as a reasonable estimate of non-compliance for the tax gap.

Model Adjustments and Refinements

This section describes the adjustments applied to ensure the mid-sized businesses PAYE tax gap reflects the full extent of non‑compliance and produces a consistent time series.

Non-detection

Not all incorrect returns or under‑declared tax will be identified through risk‑based enquiries. To account for this, the model applies a non‑detection multiplier, which adjusts the risk-based enquiry results to better reflect the true level of non‑compliance in the mid‑sized businesses population. This multiplier is based on HMRC expert opinion and is reviewed regularly to ensure it reflects the latest understanding of risk-based enquiry effectiveness.

Table A4.5: Non-detection multiplier for PAYE mid-sized businesses

Tax Years	Multiplier
2014 to 2015 onwards	1.3

Non-payment

Some tax liabilities will not ultimately be paid. To reflect this, the model includes an estimate of non‑payment attributable to each tax year.

The method estimates eventual non-payment attributable to the year of tax debt creation. This does not extend back beyond the 2018 to 2019 tax year. For years before 2018 to 2019, non-payment refers to tax debts that are written off or remitted in a tax year by HMRC and result in a permanent loss of tax. 

Compliance yield

To calculate the net tax gap, compliance yield is subtracted from the gross tax gap. Compliance yield for mid‑sized businesses differs from the small businesses approach because it is attributed to the accounting period (year of liability) rather than the year in which compliance activity is settled. This means it is different to the compliance yield published in HMRC’s Annual Report and Accounts.

In the mid‑sized businesses model, compliance yield is calculated as the total yield from closed cases plus the estimated yield from open cases, ensuring that all compliance activity is aligned to the correct liability year. This provides a more accurate reflection of the tax corrected within the period and helps maintain consistency with the risk‑based enquiry data underpinning the model.

Timing

Risk-based enquiries can be complex and may take several years to complete. This is partially accounted for by forecasting expected compliance yield for open cases.

Differences between the forecast yield and actual yield may lead to revised tax gap estimates in subsequent publications, but the use of forecasting reduces the chance that these revisions are significant. The tax gap for more recent years is likely to be subject to larger revisions because a higher proportion of the compliance yield is estimated.

Projections for recent years

There are more open cases in more recent accounting periods as there has been less time to complete these enquiries. The use of projected data for these years reduces the chance of large revisions to these years in future.

Both the compliance yield and gross tax gap figures for the tax years from 2023 to 2024 onward have been projected. This is done based on the percentage of compliance yield and gross tax gap to liabilities for 2022 to 2023.

The projections will lead to revised tax gap estimates in subsequent publications when these projections are replaced with actual estimates based on risk-based enquiry data.
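
The projection described above amounts to holding a base-year ratio constant and applying it to later liabilities. The sketch below illustrates this with hypothetical names and figures; the same function could be applied to either the gross tax gap or the compliance yield.

```python
def project_from_base_year(base_value, base_liabilities, later_liabilities):
    """Project a value forward by holding its share of liabilities constant."""
    ratio = base_value / base_liabilities
    return {year: ratio * liabilities
            for year, liabilities in later_liabilities.items()}


# Hypothetical figures in £ million, for illustration only.
projected_gross_gap = project_from_base_year(
    base_value=300.0,            # gross tax gap for 2022 to 2023
    base_liabilities=10000.0,    # liabilities for 2022 to 2023
    later_liabilities={"2023 to 2024": 11000.0, "2024 to 2025": 12000.0},
)
print(projected_gross_gap)
```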

Sources of Error

There are 3 main sources of error that may cause the true mid‑sized businesses PAYE tax gap to differ from the model estimates.

First, systematic uncertainty arises when risk‑based enquiry results under‑report the true level of non‑compliance or when parts of the population are not fully captured.

Second, variations in risk‑based enquiry data occur because risking approaches change over time, which can affect the amount of tax identified and introduce differences between years.

Third, uncertainty in population numbers can affect results, as the definition of the mid‑sized businesses population can shift.

Some of this systematic uncertainty is addressed by the non‑detection multiplier.

Uncertainty rating

The uncertainty rating for the mid‑sized businesses PAYE tax gap estimate is ‘medium’. The model captures most of the tax base and uses detailed operational data from risk-based enquiries, supported by an extreme value methodology that reduces sensitivity to unusually large cases. However, uncertainty remains because a proportion of cases are still open at the time of estimation and must be forecasted, and because the results can be affected by changes in how cases are identified and worked. These factors mean that estimates for the most recent years are more likely to be revised as additional risk-based enquiry outcomes become available. This uncertainty rating is unchanged from ‘Measuring tax gaps 2025 edition’.

Question 5

PAYE for large businesses

Accepted Answer

The PAYE for large businesses model uses an illustrative methodology. The illustrative estimate assumes that, over the long term, the tax at risk for large businesses will represent a similar proportion of liabilities to that identified for small business employers in the REP results. The estimated tax at risk is then adjusted to reflect compliance yield and non-payment.

An adjustment to the estimate of the tax gap was made following on from the introduction of the PAYE Real Time Information (RTI) system, where information on payroll taxes is recorded more accurately and on a more frequent basis, allowing HMRC to identify debts and act at an earlier stage than previously. This was done by estimating the impact of RTI on the tax gap estimates from the REP and applying this change to the estimate for large businesses. 

The uncertainty rating for the large businesses PAYE tax gap estimate is ‘very high’. This means that the model may not capture the appropriate tax base and population, the illustrative methodology is heavily assumption-based, and data may not be representative. Areas of uncertainty include the assumption that PAYE large businesses behave in the same way as PAYE small businesses. The illustrative estimate uses data from PAYE small businesses, assuming the tax at risk in large businesses will represent a similar proportion of liabilities to PAYE small businesses. This uncertainty rating is unchanged from ‘Measuring tax gaps 2025 edition’.

Question 6

Avoidance

Accepted Answer

Overview

The avoidance model estimates the avoidance tax gap for Income Tax, NICs, and Capital Gains Tax. It is estimated using information that HMRC collects on tax avoidance schemes.

As there is a time lag in identifying new users of avoidance schemes, projections are used to provide the best current estimates. A non-detection multiplier, based on expert judgement, accounts for undetected non-compliance.

Step-by-step tax gap calculation

Estimate the tax at risk due to avoidance schemes.
Apply the non-detection multiplier.
Subtract compliance yield.
Project recent years in line with past trends in user numbers.
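
Steps 1 to 3 above can be sketched as a sum over scheme usages followed by the two adjustments. The function name and figures are hypothetical; the 1.05 multiplier is the value shown in Table A4.6. Step 4 (projection) is not included.

```python
def avoidance_tax_gap(tax_at_risk_per_usage, non_detection_multiplier,
                      compliance_yield):
    """Sum tax at risk over usages, uplift for non-detection, net off yield."""
    total_tax_at_risk = sum(tax_at_risk_per_usage)
    uplifted = total_tax_at_risk * non_detection_multiplier
    return uplifted - compliance_yield


# Hypothetical per-usage tax at risk in £ million, for illustration only.
avoidance_gap = avoidance_tax_gap(
    tax_at_risk_per_usage=[400.0, 350.0, 250.0],
    non_detection_multiplier=1.05,
    compliance_yield=300.0,
)
print(round(avoidance_gap, 1))
```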

Methodology

Data sources 

The avoidance tax gap is estimated using information that HMRC collects on tax avoidance schemes and records on its management information system. This includes avoidance schemes for individuals, trusts, partnerships and employers. The information that HMRC collects relates to disclosed and undisclosed schemes. 

Disclosed schemes are arrangements (including any scheme, transaction or series of transactions) that will or are intended to provide the user with a tax advantage when compared to a different course of action and, under tax legislation, must be disclosed to HMRC. You can find more information about disclosure of tax avoidance schemes (DOTAS) on GOV.UK.

Undisclosed schemes are arrangements identified by HMRC but not disclosed under DOTAS legislation.

For schemes disclosed under DOTAS, information is captured during the following process: 

promoters of avoidance schemes that are covered by the avoidance disclosure rules must disclose any new schemes to HMRC when they are made available to potential users 
disclosures must contain sufficient detail for HMRC tax specialists to understand how the scheme works
for each disclosure HMRC issues a scheme reference number to the promoters, and taxpayers who participate in the scheme are required to notify HMRC of the reference number on their tax return (described here as a notification)

When reviewing both disclosed and undisclosed avoidance schemes, tax specialists record an estimate of the tax under consideration based on the relevant information relating to these ongoing enquiries. Any additional tax (compliance yield) that is collected following completed enquiries is also recorded. 

Detailed taxpayer-level data on avoidance schemes is available for large businesses and wealthy individuals. This enables comparison of the tax under consideration and compliance yield for an individual scheme user.

Audit data is available from 2015 to 2016 onward, so earlier years continue to use the tax gap estimates published in ‘Measuring tax gaps 2024 edition’, which were based on a different methodology.

Estimating the tax at risk

The avoidance tax gap model takes internal data on avoidance schemes at the usage level and identifies the tax at risk due to these schemes. A tax at risk estimate is calculated for each individual scheme usage to produce an estimate of the total tax at risk in the avoidance population.

Non-detection

Not all of the tax at risk due to avoidance schemes will be identified through audits. To account for this, the model applies a non‑detection multiplier, which adjusts the tax at risk to better reflect the true level of non‑compliance in the avoidance population. This multiplier is based on HMRC expert opinion and is reviewed regularly to ensure it reflects the latest understanding of audit effectiveness.

Table A4.6: Non-detection multiplier for avoidance tax gap

Tax Years	Multiplier
2015 to 2016 onwards	1.05

Compliance yield

To calculate the tax gap, compliance yield is subtracted from the estimated tax at risk. Compliance yield for avoidance is attributed to the accounting period (year of liability) rather than the year in which compliance activity is settled. This means it is different to the compliance yield published in HMRC’s Annual Report and Accounts.

In the avoidance model, compliance yield is calculated as the total yield from closed cases plus the estimated yield from open cases, which is projected for recent years in line with past trends in user numbers. The model ensures that all compliance activity is aligned to the correct liability year which provides a more accurate reflection of the tax corrected within the period.

Although compliance yield arises from this activity, it is not presented separately in the tax gap tables. Information on compliance yield from tackling avoidance is published in HMRC’s Annual Report and Accounts.

Timing

Audits of avoidance schemes can be complex and may take several years to complete, leading to a time lag in new users of these schemes being identified.

Projections for recent years

The effect of this time lag causes uncertainty in more recent accounting periods as there has been less time to complete audits. The use of projected data for these years reduces the chance of large revisions to these years in future.

The number of individual scheme users has been uplifted for the tax years from 2023 to 2024 onward, in line with past increases in user numbers, to account for this time lag. This uplift in user numbers adjusts the tax at risk in these years, as additional users have additional tax at risk. The compliance yield for these years is then adjusted based on the change in tax at risk.

The projections will lead to revised tax gap estimates in subsequent publications when these projections are replaced with actual estimates based on identified users.
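
The uplift for recent years can be illustrated as below. The source states that tax at risk and compliance yield are adjusted following the uplift in user numbers; treating both as scaling in direct proportion is an assumption made for this sketch, as are the names and figures.

```python
def uplift_for_time_lag(identified_users, tax_at_risk, compliance_yield,
                        uplift_factor):
    """Scale users, tax at risk and yield for users not yet identified."""
    projected_users = identified_users * uplift_factor
    # Assumption: additional users carry tax at risk in proportion.
    projected_tax_at_risk = tax_at_risk * uplift_factor
    # Yield is adjusted by the change in tax at risk.
    projected_yield = compliance_yield * (projected_tax_at_risk / tax_at_risk)
    return projected_users, projected_tax_at_risk, projected_yield


# Hypothetical figures, for illustration only (money in £ million).
users, risk, yield_ = uplift_for_time_lag(
    identified_users=200, tax_at_risk=80.0, compliance_yield=20.0,
    uplift_factor=1.25,
)
print(users, risk, yield_)
```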

Sources of error

The main source of error in the avoidance tax gap estimates is that HMRC may not identify all avoidance schemes; this may lead to an underestimation of the tax gap which cannot be quantified. The non-detection multiplier will only account for missed yield when conducting compliance checks, not schemes that have not been identified. 

Uncertainty rating 

The uncertainty rating for the avoidance tax gap estimate is ‘high’. The quality of the data is uncertain due to time lags in identifying users of avoidance schemes and the lack of information on any avoidance schemes not captured in the database. These factors mean that estimates for the most recent years are more likely to be revised as users are identified. This uncertainty rating is unchanged from ‘Measuring tax gaps 2025 edition’.

Question 7

Hidden economy

Accepted Answer

Overview

The hidden economy model estimates the hidden economy tax gap for Income Tax, NICs and Capital Gains Tax. It estimates unpaid tax from legal economic activities (such as undeclared casual work) that are entirely hidden from HMRC. It is estimated from 2 groups of individuals, called moonlighters and ghosts.

Moonlighters are individuals who are employees in their legitimate occupation but do not declare earnings from other sources of income. Ghosts are individuals who do not declare any of their income to HMRC, whether earned or unearned.

The moonlighters’ tax gap is split into an earned income and unearned income tax gap. Earned income refers to individuals whose undeclared source of income is from employment. Unearned income refers to individuals whose undeclared source of income is not from employment but from sources such as lettings or interest.  

The earned and unearned income tax gaps have separate methodologies. The earned income tax gap is based on surveys commissioned by HMRC and the unearned income tax gap is based on data matching of administrative HMRC data and third-party information.

The ghosts’ tax gap is based on earned income only because ghosts participate in the hidden economy surveys but do not declare any of their income to HMRC. This means there is no administrative HMRC data on ghosts to use for data matching for the unearned income tax gap.

Step-by-step tax gap calculation

Estimate the moonlighters’ and ghosts’ earned income tax gaps from the survey data.
Estimate the moonlighters’ unearned income tax gap using data matching.
Impute between data points for each estimate because surveys and data matching are not completed every year.
Construct a full time series by forecasting and retrospectively applying the impact of policy changes.
Combine the estimates together to estimate the moonlighters’, ghosts’ and overall hidden economy tax gaps.

Methodology

Earned income tax gap

The moonlighters’ and ghosts’ earned income estimates are based on data from 2 hidden economy surveys. The Hidden Economy Survey (HES) was commissioned by HMRC in 2015 to understand the nature of the hidden economy and the characteristics of those involved. This survey data was used to calculate the earned income tax gap estimates in the 2015 to 2016 tax year. For more information on the 2015 Hidden Economy Survey, go to GOV.UK.

The Hidden Economy Survey Wave 2 (HESW2) was commissioned by HMRC in 2022 to provide updated insights into the hidden economy. This survey data was used to calculate the earned income tax gap estimates in the 2021 to 2022 tax year. For more information on the 2022 Hidden Economy Wave 2 Survey, go to GOV.UK.

In total, 9,640 people were surveyed in 2015 to 2016, while 5,538 people were surveyed in 2021 to 2022. The surveys captured data on prevalence of, and income from, hidden economy activities.

The estimate for unpaid tax on moonlighters’ earned income from the survey samples is calculated by subtracting the tax paid on declared income from the tax that would have been due on their earnings if they had declared all their income. For ghosts, this is calculated by applying the relevant tax rate to the undeclared income estimated from the survey observations.

This covers Income Tax and NICs with allowances made for whether the hidden economy activity in question would be classified as self-employment or employment. An assumption for under-reporting of income is also included.

The sample estimates are then grossed up to the total population using the prevalence rates of moonlighters and ghosts with earned income in the population. These prevalence rates come from the surveys and are weighted to account for people who did not respond to the survey, to ensure they are representative of the overall population.
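
The grossing-up step can be sketched in two parts: a weighted prevalence rate from the survey, and a scaling of the average unpaid tax per participant to the population. The weights, names and figures below are hypothetical and the sketch does not reproduce the surveys' actual weighting scheme.

```python
def weighted_prevalence(is_participant, weights):
    """Weighted share of respondents who take part in the hidden economy."""
    participant_weight = sum(w for flag, w in zip(is_participant, weights) if flag)
    return participant_weight / sum(weights)


def gross_up(mean_unpaid_tax_per_participant, prevalence, population):
    """Scale the sample average to the whole population."""
    return mean_unpaid_tax_per_participant * prevalence * population


# Hypothetical survey responses and non-response weights, for illustration only.
flags = [True, False, False, True, False]
weights = [1.0, 2.0, 1.0, 1.0, 3.0]
prevalence = weighted_prevalence(flags, weights)
population_estimate = gross_up(400.0, prevalence, 1_000_000)
print(prevalence, population_estimate)
```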

Unearned income tax gap

The moonlighters’ unearned income estimate covers individuals who have additional sources of income that are not from employment. Some of these sources of income would require them to submit a Self Assessment return to supplement their normal tax payment through PAYE. 

The sources of income covered by unearned income are lettings, interest, capital gains on property, chargeable events, Individual Savings Accounts (ISAs) and secondary income (for example, activities such as hobbies or online selling that are not regular enough to be considered employment).

Most taxpayers do not need to submit a Self Assessment tax return where all tax liabilities are withheld at source, for example employment income where tax is deducted under PAYE. However, there are risks within this population, for example where taxpayers do not inform HMRC about sources of income, especially where these may exceed tax-free allowances. If a Self Assessment tax return should have been completed, then: lettings, interest, ISA income and chargeable events would be subject to Income Tax; capital gains on property would be subject to Capital Gains Tax; and secondary income would be subject to Income Tax and NICs.

HMRC has used data matching of administrative data and third-party information for tax years 2014 to 2015 and 2019 to 2020 to measure the extent to which taxpayers fail to declare these additional sources of unearned income accounting for relevant tax allowances. An estimate of additional tax due is calculated from the identified undeclared income. Third-party data matched with administrative tax records includes rental deposit schemes, and bank and building society interest declarations.

Because of the large amount of data involved in the data matching exercise, it is only conducted on a representative sample of the population already in PAYE. The results are then grossed up from the sample to the population to estimate the moonlighters’ unearned income tax gap. 

Timing

Surveys and data matching exercises are not completed every year and there are currently only 2 data points for each. A full time series between 2005 to 2006 and 2024 to 2025 is constructed by imputing between these data points, alongside forecasting and retrospectively applying the impact of policy changes.

Imputation

The earned income tax gap estimates for tax years 2015 to 2016 and 2021 to 2022 come from the 2 commissioned surveys. The data between these 2 points is imputed to create a consistent trend; this assumes that any change between the data points is gradual rather than a sudden change in the trend when a new data point is available.

The unearned income tax gap estimates for tax years 2014 to 2015 and 2019 to 2020 come from the data matching exercises. The data between these 2 points is also imputed to create a consistent trend.
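
The imputation between two measured data points can be illustrated with straight-line interpolation. The source says only that change between the data points is assumed to be gradual; the linear form, the use of the starting calendar year to label each tax year, and the figures are assumptions for this sketch.

```python
def impute_linear(start_year, start_value, end_year, end_value):
    """Fill every year between two data points on a straight line."""
    span = end_year - start_year
    step_total = end_value - start_value
    return {
        year: start_value + step_total * (year - start_year) / span
        for year in range(start_year, end_year + 1)
    }


# Hypothetical estimates in £ million; 2015 stands for tax year 2015 to 2016.
series = impute_linear(2015, 100.0, 2021, 160.0)
print(series)
```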

Applying the impact of policy changes

The hidden economy time series is constructed by forecasting and retrospectively applying the impact of policy changes on receipts, using the Office for Budget Responsibility’s certified policy costings estimates. This involves multiplying the measured data points by a time series of tax receipts adjusted for changes in tax policy over time.

The earned income tax gap estimates are calculated retrospectively back to 2005 to 2006 using the 2015 to 2016 survey data and forecast forward up to 2024 to 2025 using the 2021 to 2022 survey data. This is done in line with Income Tax and NICs policy changes.

The unearned income tax gap estimates are calculated retrospectively back to 2005 to 2006 using the 2014 to 2015 data matching exercise and forecast forward up to 2024 to 2025 using the 2019 to 2020 data matching exercise. This is done in line with Income Tax, NICs, and Capital Gains Tax policy changes.

This approach allows the constructed time series to account for changes in both tax rates and the tax base over time. For example, frozen personal allowance thresholds increase the potential tax revenue from hidden economy activities, all else being equal.
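
One way to picture this scaling is as an index: a measured data point is moved to other years in proportion to a policy-adjusted receipts series. The sketch below is an interpretation of the description above, not the published method, and the names and figures are hypothetical.

```python
def extend_with_receipts_index(anchor_year, anchor_value, adjusted_receipts):
    """Scale a measured data point by policy-adjusted receipts in other years."""
    base = adjusted_receipts[anchor_year]
    return {year: anchor_value * receipts / base
            for year, receipts in adjusted_receipts.items()}


# Hypothetical policy-adjusted receipts index; 2021 stands for 2021 to 2022.
receipts_index = {2021: 100.0, 2022: 110.0, 2023: 121.0}
extended = extend_with_receipts_index(2021, 200.0, receipts_index)
print(extended)
```

The same function works in either direction, so an earlier data point can be carried back and a later one carried forward.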

After applying the impact of policy changes, there is now a full time series for the moonlighters’ and ghosts’ earned income tax gaps and for the moonlighters’ unearned income tax gap. The moonlighters’ earned and unearned income are added together to produce a time series for the moonlighters’ tax gap. The time series for the ghosts’ tax gap is entirely made up of the ghosts’ earned income tax gap. The moonlighters’ and ghosts’ tax gaps are added together to produce a full time series for the hidden economy tax gap.

Data issues and limitations

The limitations associated with the results of the data matching exercise relate to the coverage of the third-party data used to establish evidence of additional undeclared income. Coverage varies across different sources of income, being especially reliable for lettings and interest income, whereas it is less reliable for the remaining sources identified. Additionally, there are other sources of income that could not be investigated due to unavailability of data. The resulting estimate should be interpreted broadly as a lower limit for the true scale of the tax gap relating to this group of taxpayers.

The ghosts’ tax gap is likely to be an underestimate because it is based on earned income only and does not have an unearned income estimate due to the lack of HMRC administrative data for this group. This is because ghosts by definition do not declare any of their income to HMRC. The ghosts’ estimate should be interpreted as a lower limit. While the moonlighters’ estimate is more comprehensive, the ghosts’ underestimate does mean that the overall hidden economy estimate is also likely to be an underestimate.

Sources of error

The main source of error in the hidden economy tax gap estimates is that they are based on samples rather than the whole hidden economy population. Sampling variation means that, even though each sample is designed to be representative, its characteristics may differ from those of the population purely by chance.

For the hidden economy surveys, non-response error could introduce bias into the results. While the results are weighted to account for people who do not respond to the surveys, it is possible that people who participate in the hidden economy could be less likely to respond. The surveys are commissioned by HMRC but are conducted by independent research agencies to reduce the likelihood of this bias.

Uncertainty rating

The uncertainty rating for the hidden economy tax gap estimate relating to Income Tax, NICs, and Capital Gains Tax for moonlighters is ‘high’. This means that the tax base may not be fully captured. For the tax base we do cover, the model uses the 2 surveys alongside the 2 data matching exercises and estimates a lower limit. Areas of uncertainty include the low coverage of moonlighter income sources and the lack of independent moonlighter data sources. This uncertainty rating is unchanged from ‘Measuring tax gaps 2025 edition’.

The uncertainty rating for the hidden economy tax gap estimate relating to Income Tax, NICs, and Capital Gains Tax for ghosts is ‘very high’. This means that the tax base may not be fully captured. For the tax base we do cover, the model uses the 2 surveys and estimates a lower limit. Areas of uncertainty include the potential underestimate of the tax gap, due to the lack of data matching exercises and limited coverage of the hidden economy ghost population. This uncertainty rating is unchanged from ‘Measuring tax gaps 2025 edition’.
