Research by the University of Cambridge into predicting GCSE grades based on past Key Stage 2 performance.
This report has explored a number of the issues relating to the use of predictions based on Key Stage 2 results to set GCSE grade boundaries. For the majority of GCSE awards, no evidence has emerged to suggest there is anything inappropriate in the current methods used. In general, the evidence in this report supports the way in which Key Stage 2 data is used. However, there are some minor areas where the current process for creating GCSE predictions could be improved:
- The accuracy of these predictions could be very slightly improved by using logistic regression based upon normalised scores.
- The current Key Stage 2 grade inflation adjustment does not take account of differences in the prior attainment distribution of candidates in different subjects and in different exam boards.
- Key Stage 2 based predictions tend to under-predict the likely extent of differences between exam boards.
- Our analysis suggests that the currently recommended tolerances are too low at grade C.
For the reasons listed above, we would recommend that the predictions are no longer based upon Key Stage 2 levels and are instead created using logistic regression. This procedure would involve three steps:
- Normalised Key Stage 2 scores would need to be created based upon national achievement in these tests each year. This task could be completed well in advance of awarding.
- A logistic regression model would need to be run for each subject detailing how the probability of achieving different grades changes according to the past performance of candidates. In our experience this has been a relatively straightforward process that could be applied in any standard statistical package.
- Once exam boards have matched normalised scores to their own GCSE data, this model would then need to be applied to each specification to produce a prediction of the likely percentage of candidates to achieve each grade.
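The three steps above can be sketched in code. The example below is a minimal illustration under stated assumptions, not the procedure used in this research: the data are synthetic, the function names (`normalise_scores`, `fit_logistic`, `predict_percentage`) are hypothetical, normalisation is assumed to mean a rank-based conversion to standard normal scores, and a single binary model for "grade C or above" stands in for the model covering every grade.

```python
import math
from bisect import bisect_left, bisect_right
from statistics import NormalDist, mean


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def normalise_scores(raw_scores):
    """Step 1 (assumed method): map each raw score to a standard normal
    quantile using its mid-rank percentile in the national distribution."""
    n = len(raw_scores)
    ordered = sorted(raw_scores)
    lookup = {}
    for score in set(raw_scores):
        below = bisect_left(ordered, score)
        equal = bisect_right(ordered, score) - below
        lookup[score] = NormalDist().inv_cdf((below + 0.5 * equal) / n)
    return lookup


def fit_logistic(x, y, iterations=25):
    """Step 2: fit P(y = 1) = sigmoid(a + b * x) by Newton-Raphson."""
    a = b = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = sigmoid(a + b * xi)
            w = p * (1.0 - p)
            g0 += yi - p            # gradient with respect to a
            g1 += (yi - p) * xi     # gradient with respect to b
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if det < 1e-12:             # information matrix is near-singular
            break
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


def predict_percentage(a, b, normalised):
    """Step 3: predicted percentage of a cohort reaching the grade."""
    return 100.0 * mean(sigmoid(a + b * z) for z in normalised)


# Synthetic "national" data: ten candidates at each raw score from 40 to 99,
# with the share reaching grade C or above rising with the score.
national_scores, national_outcomes = [], []
for s in range(40, 100):
    passes = round(10 * sigmoid((s - 70) / 8))
    for i in range(10):
        national_scores.append(s)
        national_outcomes.append(1 if i < passes else 0)

lookup = normalise_scores(national_scores)
a, b = fit_logistic([lookup[s] for s in national_scores], national_outcomes)

# Hypothetical cohort entered for one specification (raw Key Stage 2 scores).
spec_cohort = [55, 60, 62, 70, 75, 81, 90]
pct = predict_percentage(a, b, [lookup[s] for s in spec_cohort])
print(f"Predicted percentage at grade C or above: {pct:.1f}")
```

In practice a standard statistical package would replace the hand-written fitting routine, and the model would be run for each subject and grade rather than for a single threshold.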
Whether or not this new approach is adopted, we would recommend that the approach to setting tolerances for predictions is made more specific to each award.
Although we are recommending that the guideline tolerances for GCSEs should be increased, especially at grade C, we have not explored how inter-board comparability would be robustly maintained in this context. Specifically, it is not clear how Ofqual could ensure that exam boards apply consistent decision processes within this context, or how any appearance of a ‘race to the bottom’ could be avoided within the tolerance levels recommended by this report. This issue will require ongoing discussions between Ofqual and the exam boards.