The quality of exam marking

Ian Stockford's speech to the Association of Colleges Annual Conference.

Good afternoon and thank you for the invitation to come and speak with you today.

As you know, I’m here to talk about the quality of GCSE and A level marking. We’ve just heard from one of the exam boards about their take on this subject and I’ll be talking a little more about the regulator’s role in this process.

I’d like to begin by saying that I don’t believe there is a ‘crisis’ in exam marking, as suggested by the title of this session. But that’s not to say that there aren’t concerns which have been expressed by a number of people within the sector about the quality of marking, and – given that you’ve chosen to attend this session above others that are available to you – probably by many in this room.

We know from our annual perceptions survey that, of those who chose to respond, around 60% of headteachers and a third of teachers don’t believe that the marking of GCSEs is accurate. The proportion is a little lower for A levels at 44% of headteachers, but this still reflects concerns amongst those who responded to our YouGov poll following the summer 2014 exams. But what are the facts? And is the apparent concern over quality of marking justified?

I hope to reassure you in the next few minutes that the system is not in crisis and, along with that, I’ll outline what we’re doing to secure delivery of high quality marking by exam boards.

I’d start by saying that we absolutely recognise that the current system is not perfect. Early last year we published the results of a year-long investigation into the quality of GCSE, AS and A level marking in England. That report concluded that, in general, there are good controls around the marking system delivered by exam boards. But, of course, there is always room for improvement, and in our report we identified potential actions in several areas.

These actions included the need for better monitoring of marking quality and for quantification of that quality. The report highlighted the need for better capture of data during live marking and for mechanisms to feed those data back into the marking process. This is to drive improvement during the live marking process but also in post-hoc analyses. The work also identified the need to re-design the enquiries about results and appeals system so that, when there are genuine errors in marking, there is an appropriate system in place to address them. I’ll say more on this in a moment. The report also highlighted the need for improvements to the design of mark schemes and for a better understanding of marking among those within the system.

So what are we doing on this?

One important consideration for any regulator is that its regulations don’t stand in the way of innovation and, indeed, that we incentivise developments that are going to add quality. An important innovation in marking that was discussed at length in our report, and is in the process of contributing to improvements in the system, is the introduction and increased role of on-screen marking.

On-screen marking has a number of direct benefits. Yes, it helps to improve the security and speed of the logistics around marking, but it also facilitates item level marking. That is, the distribution of parts of a single student’s exam paper across a number of examiners. This approach reduces the effect of bias caused by student responses to the rest of the exam paper and allows examiners to focus solely on the quality of response they are considering at that moment. This approach also reduces the influence of a single examiner on an individual student’s script, balancing any slight variations in standard across the question paper.

The immediate and electronic availability of the granular data from the marking process opens up a wealth of opportunities for exam boards to implement some sophisticated examiner monitoring and has the potential to improve marking quality through better identification and removal of anomalous marking.

But there are a number of indirect benefits too for us as regulator. The live monitoring processes that exam boards use provide us with the potential for a valuable data-led insight into the controls exam boards have in place. We’re currently looking at ways that we could use these data from exam boards to better effect, to enable us to hold exam boards to account without this use of the data having any unintended consequences on the exam boards’ practices.

Along with the technical and technological benefits, we do recognise that the move to on-line processes also introduces its own challenges. These include the technological barriers that it presents to some examiners – meaning that they may feel unable to confidently engage in marking – and the potential for the electronic marking systems to represent a critical point of potential failure in the successful delivery of marking. We will be taking a keen interest in the way that the exam boards are addressing these challenges and managing any risks that exist.

As I mentioned, our report also highlighted the importance of mark scheme design and the need for action in this area. In the lead up to the summer 2015 exam series, exam boards produced quality of marking action plans focusing on assessments where they had previously identified particularly large numbers of changes through the enquiries process.

When it comes to the design of mark schemes for extended and complex responses there is no magic solution. Indeed, the aspects that make an effective mark scheme can be somewhat counter-intuitive. What we do require from exam boards, however, is that they have effective processes in place to evaluate the performance of their qualifications and to act on the findings of those evaluations. So rather than a magic solution, we expect exam boards to evaluate and refine their mark schemes on an on-going basis to optimise their designs.

Taken together, this paints a picture of a number of quite significant incremental changes that should have a positive impact on the quality of marking year on year.

So how do we square that with the perceptions of reducing marking quality?

Part of the problem is the data widely used as a proxy for quality of marking. Typically the data used for this purpose come from the enquiries about results (EAR) system. However, it’s not reasonable to assume a direct link between EAR data and marking quality. And there are two primary reasons why the assumption is flawed:

  1. First is the implicit assumption that enquiries about results are always made on the basis of evidence of poor marking.
  2. Second, that the changes currently made to students’ marks through the enquiries process all stem from actual errors in marking.

It’s the second of these points that I’d like to focus on. One of the distinctive things about general qualifications in the UK is the combination of question types that exam boards use to assess students. These include short answer or objective questions which only have one right answer. They also include questions which require longer answers, for which markers need to apply an element of judgement. It is therefore entirely possible that, for this type of question, a student might legitimately be given different marks by two equally capable and experienced markers marking the same answer. That would not necessarily mean that either mark is wrong, as both could represent a reasonable application of the mark scheme.

Earlier this year we carried out some research looking at how potential marking errors are reviewed as part of the enquiries about results process. We found evidence that, in the current system, exam boards rightly correct genuine marking errors but that they can also sometimes change legitimate marks. This can give a misleading impression about the number of marking errors made and means that enquiries are not a good proxy measure of marking quality.

We will report data for this year’s enquiries about results in December, but enquiries have risen year on year, and I probably won’t be giving too much away in anticipating a continuation of that picture this year.

Rather than using the enquiries system as a source of data on marking quality, it is more important to consider the effectiveness of that process in delivering what it is intended to do – address instances of marking errors. As I’ve said, the significant concern that our research uncovered was the potential for students to be awarded additional marks where the original mark they received represented a perfectly reasonable interpretation of the mark scheme. This is problematic from the perspective of fairness across the system, since schools or parents who can afford to pay for a review of their marks may be given an unfair advantage, not because they are getting marking errors corrected that otherwise wouldn’t be, but simply because they are seeking to have a legitimate mark overturned.

To reflect these findings, we will soon be holding a public consultation on proposed changes to be made to our regulation of the enquiries about results system and I hope you will engage with that consultation process. These changes would go hand in hand with other changes that seek to further improve and provide greater transparency on the quality of marking.

For example, we are going to identify metrics which will help us to measure the quality of marking more directly, avoiding the current reliance on enquiries data as a proxy. We are conducting research in this area and, when we have developed some proposals for this suite of metrics, we will invite exam boards and other stakeholders to agree the best approach for measuring and reporting on marking quality.

Finally, we welcome all efforts to encourage good teachers to play their part in marking and awarding. I know that marker capacity and planning are being actively considered by exam boards and we will continue to engage with them to assure us of their ability to get the job done and to the highest possible quality.