Research and analysis

Comparability of standards between subjects: Nuttall, Backhouse and Willmott (1974)

Research from NFER and the Schools Council on comparability of standards in GCE O-levels and CSEs.


Comparability of standards between subjects: Schools Council Examinations Bulletin 29



The investigations reported here stem from the examinations research programme conducted by the ‘Research into CSE’ and the ‘Research into GCE’ projects at the National Foundation for Educational Research under the sponsorship of the Schools Council.

The report describes methodological studies of the comparability of standards between different subjects. Except in the case of two boards, the data were collected in the 1968 CSE Monitoring Experiment. The results must be treated with caution, since they relate to the examinations of summer 1968.

Two main methods were employed: one used external reference tests and the other internal evidence, namely the examination grades achieved in other subjects. The methods were found to lead to essentially the same results. It was concluded that, with the samples used and treating the sexes together in both the GCE O-level sector and the CSE sector, English (language and literature) and possibly art appeared to be consistently leniently graded and that chemistry and French appeared to be consistently severely graded. Further, in the GCE sector, physics appeared to be severely graded and, in the CSE sector, mathematics appeared to be severely graded.
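To illustrate the general idea behind the second method (using grades achieved in other subjects as internal evidence), the sketch below compares each candidate's grade in a subject with the mean of that candidate's grades in their other subjects, then averages the differences by subject. It is a simplified illustration and not the procedure used in the report. The candidate records, the numeric grade scale (higher is better) and the function name `subject_adjustments` are all hypothetical.

```python
from statistics import mean

# Hypothetical candidate records: subject -> numeric grade (higher is better).
# These values are invented for illustration and are not data from the report.
candidates = [
    {"English": 5, "French": 3, "Chemistry": 3},
    {"English": 4, "French": 3, "Mathematics": 4},
    {"English": 6, "Chemistry": 4, "Mathematics": 5},
]


def subject_adjustments(records):
    """Mean of (grade in subject - mean grade in the candidate's other subjects).

    A positive value suggests the subject looks leniently graded relative to
    the candidates' other subjects; a negative value suggests it looks severe.
    """
    diffs = {}
    for grades in records:
        if len(grades) < 2:
            continue  # no other subjects to compare against
        for subject, grade in grades.items():
            others = [g for s, g in grades.items() if s != subject]
            diffs.setdefault(subject, []).append(grade - mean(others))
    return {subject: mean(values) for subject, values in diffs.items()}


adjustments = subject_adjustments(candidates)
for subject, value in sorted(adjustments.items()):
    print(f"{subject}: {value:+.2f}")
```

With these invented records, English comes out positive and French and chemistry negative, which matches the direction of the findings above only because the data were constructed that way. Any such comparison rests on the assumption that a candidate's performance in other subjects is a fair yardstick, which is one of the assumptions the report discusses critically.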

The question of sex differences in examination performance was also investigated. The pattern of differences between subjects was not the same for boys and girls analysed separately. The performance of boys was worse in French and English literature than expected on the basis of their overall performance and better than expected in mathematics and geography, while the performance of girls was worse than expected in chemistry and physics and better than expected in English language and English literature.

The necessary assumptions of the methods are critically discussed. Accepting the validity of these assumptions, the implications of the findings for grading procedures and comparability of standards are explored. The report concludes with a series of questions raised by the results of the research about the nature of examination grading and standards, and it is hoped that they will stimulate public discussion of these controversial topics.

Published 15 June 1974