Research and analysis

Public perceptions of reliability in examinations: Summary

Published 16 May 2013

Applies to England, Northern Ireland and Wales

1. Overview

Overview by: Andrew Watts, AlphaPlus Consultancy Ltd.

The report describes a study designed to explore what the general public thought about the issue of errors in public examinations.

2. Public reactions to errors in examination results

The causes of error in exam results are varied and complex. However, in the interests of transparency and raising awareness of the examination system, Ofqual feels that it has a responsibility to communicate with the public about this issue. This study was commissioned as the first step in a wider research programme to understand how best to approach discussions on this topic. Ipsos MORI ran groups to discuss participants’ understanding of error in exam results, and to ask participants whether such errors are acceptable and, if not, who should be held responsible for them.

Seventy-two people took part in the discussions, in groups of six. The groups were made up of teachers, students, parents, employers, examiners and members of the general public. The overall sample was balanced according to gender, age, social class and ethnicity, but the groups of six were arranged according to ‘type’ - for example, examiners were grouped together, as were teachers and students. The report notes that the data from such discussions represent different views in society, but that the small numbers involved do not allow conclusions about how many people in society agree or disagree with different points of view. The report illustrates the different points of view by quoting participants’ actual comments to their groups.

3. Perceptions of ‘error’ in assessment results

The report notes Ofqual’s definition of an error: ‘when a student does not receive a grade in keeping with their level of attainment (i.e. when a student doesn’t receive the grade that fairly represents their achievement) at the point at which they are examined’. In discussing this, participants drew a clear distinction between unavoidable errors in the assessment process and preventable errors. They regarded bad luck and simple mistakes - things which cannot be eliminated completely - as inevitable features of the examinations process. Preventable errors, however, included systemic errors, such as the setting of a bad exam paper, which the participants saw as unacceptable. They said that the exam boards should be held responsible for such errors, which should have been picked up before the exam took place.

Another distinction used was that between ‘procedural error’ and ‘measurement error’. The participants believed that the procedural errors - such as a paper not covering the syllabus fully or an examiner failing to mark a page of an answer script - were avoidable. Measurement error, however, was accepted as something which could not be entirely eliminated.

4. Three kinds of error in assessments

The report divides errors into three kinds: student-related, examiner-related and test-related. All three cover factors that could prevent candidates from gaining the grade they deserve. Participants felt that student-related problems, such as being ill on the day, arriving late or revising the wrong things, were the responsibility of the students and were not errors at all. All participants, including parents and students, saw such things as ‘bad luck’, an inevitable part of taking exams - not errors, a word they reserved for things that the boards could put right.

The participants, though aware of the pressure that examiners work under, were more concerned about examiner-related error. Such errors might occur when an examiner misinterprets a candidate’s answer or adds up the marks incorrectly. Even so, there was still some understanding that it is natural for people to make mistakes. The students in the groups expressed particular concern about examiners’ errors in subjects where marking requires a high degree of judgement. Some of the groups were surprised to hear that many examiners were full-time teachers who marked scripts at home; some suggested that examiners should instead work together under supervision in marking centres. The students were also unaware of some aspects of the exam system, for example that double-marking and moderation already take place.

The type of error of most concern to the groups was test-related error - where, for example, there were mistakes in the question papers. The teachers in particular insisted that exam papers should be error-free.

5. How examination error affects students

The participants wanted to add a further category of error to the discussion - the errors that could occur because of the effect of teaching, perhaps where the syllabus had not been fully covered. Ofqual tended to discount these incidents because the teaching takes place before the examination itself.

Participants were most concerned about errors that caused negative outcomes for students: they were not worried about errors in students’ marks, but they were worried about errors in students’ grades, since it is the grade that carries the consequences. Employers noted that candidates who fail to gain a C grade at GCSE may well be automatically excluded from job selection lists.

The participants were not sure how much error there is in the system (teachers suggested a much lower figure than parents), and were generally less concerned about the amount of error than about the causes of errors and how these can be prevented.

6. Public discussion of error

The participants were asked their opinion of the need for wider discussion about error in exams. They were uneasy with the use of the word ‘error’: as noted earlier, they described an illness which affected a student’s result negatively as ‘bad luck’ or ‘just life’. The word ‘error’ has a technical meaning in educational assessment which the general public does not recognise - to the public, it means that someone has made a mistake and should be held responsible.

Some felt that public discussion of this topic would be useful, but the teachers and examiners in particular were wary of a debate that might not be well informed. Generally the participants felt they had little information about how the exam system works and would like to know more. They doubted the need for three exam boards and felt that one board could maintain a common standard, though the teacher participants were more aware of the benefits of having three boards. The examiners felt that the differences between boards indicated the need for them to be checked for consistency. Participants were not clear about Ofqual’s role, but they saw a need for some body to monitor comparability between boards.

Some parents expressed a lack of confidence in the exam system, particularly when they considered the possibility of their own children being given the wrong grade. The report suggests that this tendency to think negatively could be the result of a lack of reliable information about the system. Parents suggested that schools might run examination open days, though some of the teachers knew of situations where these had attracted very little response. Some concern was expressed about the special considerations system (arrangements made for candidates with special access requirements, e.g. extra time for dyslexic candidates) and the appeals system, which were said to be too lenient and on which strong-minded parents could exert pressure. Finally, students said they would like more information about the way their scripts were marked.

7. Next steps

The findings of this report were used to refine the questions put to groups of people in a later study. The report noted particularly that, when they were asked to talk about ‘examination errors’, participants responded differently from when the discussion focused on actions which ‘caused an error in the mark or grade awarded’ (in the sense of the student not getting the grade they deserved). This helped later researchers as they planned the next stage of the Ofqual project.