Research and analysis

Cervical screening: invasive cervical cancer audit 2016 to 2019: Appendix A: data completeness

Published 30 November 2023

Applies to England

1. Data completeness and limitations

The findings presented in this report should be interpreted in light of the varying completeness of the available information. The difficulties involved in ensuring that essential data fields are complete are described below.

It is rare for data to be reported as missing, but missing data should be distinguished from an incomplete record. Data may be missing because they are unavailable (for example, where a death certificate, which does not provide information about cancer staging, has been used), or because they have not yet been recorded as part of the audit. For this reason, we use the term ‘none recorded’ to describe cases where final stage is still pending, and the term ‘none available’ to describe cases where, after considerable effort, no staging data could be obtained.

Other cases may be subject to reporting delays, having been submitted to the audit before all essential fields could be completed. In these instances, missing fields are updated as and when data become available, with the result that complete information may not be received for some months after the case has been registered. An additional challenge, which can create further delay, is the need to coordinate between the various aspects of the audit process when a case of cervical cancer is diagnosed.

2. Cancers and population controls

Cases of cervical cancer are identified by NHS hospital staff (primarily via gynae-oncology), and confirmed by histology. A small proportion of cancers will be missed by the audit, and a very small proportion will be excluded because the patients are not registered with an NHS GP. Table 1 illustrates the limited extent of this problem, comparing the number of registrations for cervical cancer in a given calendar year with the number of cases picked up by the audit over the same period of time.

Controls are selected randomly (subject to matching) from people registered with an NHS GP. All those selected are included in the audit.

3. Dealing with missing values

Cases reported in the MB1 series (Cancer Registration Statistics in England, Office for National Statistics) between 2016 and 2018 were compared with those recorded in the audit for the same period, by age at diagnosis (Table 1). The aim was to ascertain whether there is a subset of cases for which delayed inclusion in the audit is more likely, and whether this is related to age at diagnosis. Overall, 81% of the cervical cancer cases registered in England between 2016 and 2018 were recorded in the audit for the same period. However, the audit data are more likely to include cases diagnosed in people aged 25 to 64 (the audit includes 85% of all registered cancers in this age group) than cases diagnosed in people aged 65 and over (the audit includes 64% of registered cancers in this age group). The completeness of the audit data, compared with MB1, decreased with increasing age at diagnosis.
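The age-group percentages quoted above can be checked against the counts in Table A-1a. The following is a minimal Python sketch of that arithmetic; the names `BAND_START`, `MB1`, `AUDIT` and `completeness` are illustrative only and do not come from the audit, and the counts are transcribed from the table.

```python
# Counts transcribed from Table A-1a (MB1 series and audit, 2016 to 2018).
# BAND_START holds the lower age limit of each band; 0 stands for "<20"
# and 85 for "85+". All names here are illustrative, not the audit's own.
BAND_START = [0, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85]
MB1 = [5, 156, 966, 1093, 936, 826, 802, 595, 563, 473, 348, 349, 259, 246, 236]
AUDIT = [1, 129, 856, 967, 824, 727, 655, 483, 458, 352, 248, 230, 176, 133, 130]


def completeness(from_age=0, below_age=200):
    """Percentage of MB1 registrations found in the audit, for the age
    bands whose lower limit satisfies from_age <= limit < below_age."""
    idx = [i for i, start in enumerate(BAND_START) if from_age <= start < below_age]
    registered = sum(MB1[i] for i in idx)
    audited = sum(AUDIT[i] for i in idx)
    return round(100 * audited / registered, 1)


print(completeness())        # all ages
print(completeness(25, 65))  # ages 25 to 64
print(completeness(65))      # ages 65 and over
```

Selecting bands by their lower age limit keeps the grouping explicit, so other cut-offs (for example, 50 to 64) can be checked in the same way.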

We assessed the completeness of audit data for FIGO stage by comparing the distribution of staged cancers diagnosed between April 2011 and March 2012 across 5 audit years (Table 2). The table shows that if we were to assign a stage to cases with this information missing, assuming that stage was missing at random, we would overestimate the proportion of cases diagnosed with early stage cancer. For example, based on data received as of October 2011, we would have assumed that 47.6% of cases diagnosed between April 2011 and March 2012 had stage IA cancer. However, by October 2012 that proportion had decreased to about 42%, and it has remained at that level since. This suggests that cases with unknown FIGO stage are more likely to be advanced stage cancer.

In the audit reports published in July 2011 and May 2012, we assumed that data for FIGO staging was missing at random, which would have led to an overestimation of the proportion of stage IA cancers and an underestimation of the proportion of stage II+ cancers. For this report (and the 2 previous reports), we have used a more complicated model that takes into account the differential delays in obtaining stage.[1]
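The ‘missing at random’ percentages in Table A-1b can be reproduced to one decimal place under a simple allocation rule, sketched below in Python. The rule is inferred from the published figures and is not stated in the report: cases with no stage recorded are dropped from the denominator, and cases recorded only as IB+ are shared between IB, II and III+ in proportion to the cases whose exact stage is known. The function name and argument names are illustrative.

```python
def mar_proportions(ia, ib, ii, iii_plus, ib_plus_nos):
    """Return (IA %, IB %, II+ %) among cases with some stage recorded.

    Assumption (inferred from Table A-1b, not stated in the report):
    cases recorded only as 'IB+' are split between IB, II and III+ in
    proportion to the cases whose exact stage is known. Cases with no
    stage recorded are excluded, i.e. treated as missing at random.
    """
    exact_ib_or_worse = ib + ii + iii_plus
    uplift = 1 + ib_plus_nos / exact_ib_or_worse
    ib_adjusted = ib * uplift
    ii_plus_adjusted = (ii + iii_plus) * uplift
    staged = ia + exact_ib_or_worse + ib_plus_nos
    return tuple(
        round(100 * n / staged, 1) for n in (ia, ib_adjusted, ii_plus_adjusted)
    )


# Data as received in October 2011 and October 2019 (Table A-1b).
print(mar_proportions(285, 207, 50, 28, 29))
print(mar_proportions(773, 600, 209, 171, 77))
```

Comparing the two calls shows the shift described in the text: the apparent share of stage IA falls, and the share of stage II+ rises, as late-arriving stage data are added.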

A-1a National Cancer Registration Statistics (MB1 Series), published by the Office for National Statistics, compared with those reported in the audit

Age at diagnosis    MB1 series* 2016/18 (n)   Audit 2016/18 (n)   Difference (n)   Proportion (%)
<20                 5                         1                   4                20.0
20-24               156                       129                 27               82.7
25-29               966                       856                 110              88.6
30-34               1093                      967                 126              88.5
35-39               936                       824                 112              88.0
40-44               826                       727                 99               88.0
45-49               802                       655                 147              81.7
50-54               595                       483                 112              81.2
55-59               563                       458                 105              81.3
60-64               473                       352                 121              74.4
65-69               348                       248                 100              71.3
70-74               349                       230                 119              65.9
75-79               259                       176                 83               68.0
80-84               246                       133                 113              54.1
85+                 236                       130                 106              55.1
Total, all ages     7853                      6369                1484             81.1
Total, ages 25-64   6254                      5322                932              85.1
* MB1 Cancer Statistics are published by calendar year; audit data are normally reported by financial year (1 April to 31 March).

A-1b Cancers in people aged 25 to 64, diagnosed between April 2011 and March 2012

                Observed stage by year of audit data (n)               Proportion assuming missing at random (%)
Received as of  IA    IB    II    III+   IB+   None recorded   Total   IA     IB     II+
Oct-11          285   207   50    28     29    124             723     47.6   38.1   14.3
Oct-12          656   520   169   124    73    161             1703    42.5   36.8   20.7
Oct-13          691   562   194   152    73    135             1807    41.3   36.3   22.4
Oct-17          774   601   214   172    71    117             1949    42.2   35.2   22.6
Oct-19          773   600   209   171    77    115             1945    42.2   35.4   22.4

A-2a Proportion of essential data collected for cases in section A: personal and cancer details

Section A: Essential fields
                                          Date of birth     Date of diagnosis Stage*            Histology
Report                            Cases   n       %         n       %         n       %         n       %
Current report (2016/17-2018/19)  6,369   6,369   100       6,369   100       6,200   97.3      5,617   88.2
Fifth report (2013/14-2015/16)    6,028   6,028   100       6,028   100       5,718   94.9      5,490   91.1
Fourth report (2009/10-2012/13)   8,784   8,784   100       8,784   100       8,014   91.2      8,543   97.3
Third report (2009/10-2011/12)    6,508   6,508   100       6,508   100       5,901   90.7      6,336   97.4
Second report (2007/08-2010/11)   8,566   8,566   100       8,566   100       7,423   86.7      8,197   95.7
First report (2007/08-2009/10)    6,231   6,231   100       6,231   100       5,197   83.4      5,922   95.0

*Cases where data collection is complete and stage is missing are considered to be staged as a reasonable amount of effort has been made to collect the data. Incomplete cases with a stage recorded as X (or missing) are considered not to have stage. Please refer to section 6 for full details regarding missing data.
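The rule in this footnote can be written as a small decision function. The Python sketch below is one reading of the rule, not the audit's own implementation; the function name, the `audit_complete` flag and the representation of a missing stage as `None` or an empty string are all assumptions made for illustration.

```python
def counts_as_staged(stage, audit_complete):
    """Return True if a case counts towards the 'Stage' column.

    Illustrative reading of the footnote, with hypothetical field names:
    - a case with a real stage recorded is staged;
    - a case whose data collection is complete is treated as staged even
      if no stage was found (reasonable effort has been made);
    - an incomplete case with stage 'X' or no stage is not staged.
    """
    has_stage = stage is not None and stage.strip().upper() not in {"", "X"}
    return bool(has_stage or audit_complete)


print(counts_as_staged("IB", audit_complete=False))
print(counts_as_staged("X", audit_complete=False))
print(counts_as_staged(None, audit_complete=True))
```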

A-2b Proportion of data collected for cases in section A: personal and cancer details

                                          Treatment (in those       Treatment (in those       Index of Multiple                 Index of Multiple
                                          with known treatment,     with treatment recorded,  Deprivation (cases)               Deprivation (controls)
                                          excluding those           including those
                                          reported as none*)        recorded as none)
Report                            Cases   n       %                 n       %                 n       %               Controls  n       %
Current report (2016/17-2018/19)  6,369   4,766   74.8              4,926   77.3              5,667   89.0            12,392    8,782   70.9
Fifth report (2013/14-2015/16)    6,028   3,990   66.2              4,254   70.6              5,224   86.7            11,580    8,451   73.0
Fourth report (2009/10-2012/13)   8,784   5,970   68.0              6,183   70.4              6,843   77.9            17,270    7,345   42.5
Third report (2009/10-2011/12)    6,508   4,146   63.7              4,394   67.5              5,104   78.4            12,841    4,423   34.4
Second report (2007/08-2010/11)   8,566   5,199   60.7              5,675   66.3              6,485   75.7            16,920    7,964   47.1
First report (2007/08-2009/10)    6,231   3,086   49.5              3,382   54.3              4,723   75.8            12,335    5,947   48.2

*Where treatment was recorded as “None”, we assume it means “none other than palliative care”. Attempts have been made to clarify this issue and there is now a category for palliative care; however, some misclassification may still remain and these cases are therefore excluded from this column.

A-3 Proportion of cases with FIGO stage reported as none recorded, none available* or IB or worse (IB+), by age and audit year (from April 2016 to March 2019)

                                  None recorded   None available   IB+ (NOS)   Total
                                  %               %                %           %
Age
<25                               0.0             0.0              4.0         4.0
25-49                             1.9             1.0              0.8         3.6
50-64                             3.9             2.6              1.9         8.3
65+                               4.3             3.4              1.7         9.4
Audit year
2016/17                           1.9             1.8              1.3         5.0
2017/18                           2.2             1.6              0.8         4.6
2018/19                           3.9             1.5              1.4         6.8
Previous reports
Current report (2016/17-2018/19)  2.7             1.6              1.2         5.4
Fifth report (2013/14-2015/16)    2.2             5.1              2.2         9.5
Fourth report (2009/10-2012/13)   8.8             1.9              3.8         14.5
Third report (2009/10-2011/12)    9.3             1.1              4.4         14.8
Second report (2007/08-2010/11)   12.0            1.6              4.2         17.8
First report (2007/08-2009/10)    16.6            N/A              4.4         21.0
* Where stage is reported as none available instead of none recorded, a reasonable amount of effort has been made to find the stage, but none has been available. This is derived from cases recorded as “Audit complete”, which means that no further details are being sought for these people. The option to report cases as “none available” has only been available to all SQAs since April 2012.

A-4 Proportion of data collected for cases in section B: cytology

Section B: Cytology
                                                                  Completeness of data among recorded cytology tests
                                                                  Date test was taken  Result of test (b)   Action code (b)
Audit year                        Cases   Tests on all cases (a)  n        %           n        %           n        %
Current report (2016/17-2018/19)  6,369   24,334                  24,334   100         24,210   99.5        24,312   99.9
Fifth report (2013/14-2015/16)    6,028   21,764                  21,764   100         21,707   99.7        21,746   99.9
Fourth report (2009/10-2012/13)   8,784   35,810                  35,810   100         35,803   100         35,781   99.9
Third report (2009/10-2011/12)    6,508   26,619                  26,619   100         26,619   100         26,594   99.9
Second report (2007/08-2010/11)   8,567   34,910                  34,910   100         34,910   100         34,870   99.9
First report (2007/08-2009/10)    6,231   25,972                  25,972   100         25,954   100         25,951   99.9

a. Cytology tests known to the Audit and taken before diagnosis

b. Cytology data obtained directly from Open Exeter should have all three data fields complete. We believe the missing data result from the inclusion in the audit of cytology tests taken before the programme started in 1988, and of a few slides that were found in the laboratory but not recorded on Exeter. These tests will not have an “Action Code”, as this field is generated by Exeter.

A-5 Proportion of data collected for cases in section C: colposcopy

Section C: Colposcopy
                                  Cases with an    Cases with a      Cases with a    No. of         Date of          Satisfactory     Colposcopy
                                  Action Code of   suspend and a     colposcopy but  colposcopy     colposcopy       exam or DNA*     procedure
                                  suspend          colposcopy        no suspend      appointments
Audit report                      n                n        %        n               n              n        %       n        %       n
Current report (2016/17-2018/19)  3,893            2,944    75.6     772             5,987          5,987    100     4,934    82      4,409
Fifth report (2013/14-2015/16)    3,674            2,397    65.2     627             4,378          4,378    100     4,378    100     3,601
Fourth report (2009/10-2012/13)   6,073            3,963    65.3     494             6,823          6,823    100     6,823    100     5,479
Third report (2009/10-2011/12)    4,523            2,843    62.9     430             5,195          5,195    100     4,347    84      4,287
Second report (2007/08-2010/11)   5,884            3,604    61.3     647             7,167          7,167    100     5,942    83      5,620
First report (2007/08-2009/10)    4,308            2,412    56.0     557             4,348          4,348    100     3,445    79      3,249
* DNA: did not attend