Research and analysis

Inter-subject comparability of exam standards in GCSEs and A Levels

Generating new evidence about the impact on performance standards of statistically aligning subjects based on a Rasch analysis of subject difficulty.

Applies to England and Northern Ireland

Documents

Inter-Subject Comparability of Exam Standards in GCSE and A Level: ISC Working Paper 3


Details

Aims

The study presented in this report aims to:

  • gain an improved understanding of the issues of inter-subject comparability in GCSEs and A levels
  • gain an understanding of the impact of aligning statistical standards between subjects, based on Rasch analysis, on exam performance standards for individual subjects
  • generate new evidence regarding the impact on performance standards of statistically aligning subjects based on a Rasch analysis of subject difficulty

Conclusions

Results from Rasch analysis of GCSE and A level data covering a period of four years suggest that exam standards are not consistent across subjects: the level of the latent trait specified in the Rasch model that is required to achieve a given grade differs from one subject to another.
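To illustrate what a difference in the required level of the latent trait means, the following is a minimal sketch using the simplest (dichotomous) form of the Rasch model, in which the probability of reaching a grade depends only on the gap between a candidate's ability and the grade's difficulty. It is a simplification for illustration and is not the model specification used in the report; the subject names and difficulty values are hypothetical.

```python
import math


def rasch_probability(ability: float, difficulty: float) -> float:
    """Dichotomous Rasch model: P(success) = 1 / (1 + exp(-(ability - difficulty)))."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


# Hypothetical difficulties (in logits) of the same grade in two subjects.
# These values are illustrative assumptions, not results from the report.
grade_difficulty = {"subject_a": -0.5, "subject_b": 0.5}

# One candidate with a single ability level on the shared latent trait.
ability = 0.0

# Probability of achieving the grade in each subject for that candidate.
probabilities = {
    subject: rasch_probability(ability, difficulty)
    for subject, difficulty in grade_difficulty.items()
}

for subject, p in probabilities.items():
    print(f"{subject}: {p:.3f}")
```

Under these assumed values, the same candidate is more likely to achieve the grade in the subject with the lower difficulty, which is the sense in which statistical standards can differ between subjects.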

There is considerable variability in statistical standards between subjects at both the individual grade level and the overall subject level. Results from linear and multinomial logistic regression analyses based on prior attainment and concurrent performance also show substantial inter-subject variability in difficulty, as defined by the statistical model specified. Although the difficulties derived using prior attainment are positively correlated with the difficulties derived using the Rasch model, the correlation is moderate for the mid-grades and weak for the bottom and top grades. The difficulties derived using the concurrent performance measure are highly correlated with the Rasch-model-derived difficulties.

Findings from this study are broadly consistent with those from studies reported by other researchers.

The study demonstrates that aligning statistical standards between subjects on the basis of Rasch model comparisons would substantially change grade distributions and would be likely to change performance standards that are based on subject-specific grade criteria.

Published 18 December 2015