Speech

GCSE, AS and A level reform: It all adds up

Jeremy Benson presents a speech to the Capita Secondary Maths Conference.

Good morning.

Let me start by apologising on behalf of my colleague, Ian Stockford, who was due to be with you today but is unfortunately unwell and asked me to take his place. I’m Jeremy Benson, and although my main focus now is on vocational qualifications, I was heavily involved in the early stages of the GQ reform programme – and of course I continue to take a close interest in work that is a very important and visible part of what Ofqual does.

I am very pleased to be here today to give you our perspective on the reforms to GCSEs, AS and A levels in England that we are a part of introducing. I say a part of, because we are just one of several stakeholders each with a unique role to play in the process. And of course the reforms themselves sit in the context of the wider objectives for mathematics education; we recognise that context, but we make no apologies, as the qualifications regulator, for focusing on the qualifications and trying to make them as good and valid as they can be.

As Vanessa has just set out, the Department for Education is responsible for developing new content for those qualifications being reformed, and specifically in the case of GCSEs, more demanding content. Our focus has been on working with subject and assessment experts to develop and test thinking on a range of matters relating to assessment, which has led us, at a high level, to move away from modularisation and review the extent of non-exam assessment; and specifically for GCSEs, to reconsider tiering in each subject and determine a new grading structure.

Of course some of these changes don’t apply to maths. Linear specifications in maths already existed before reform and the subject has been exclusively assessed by exam for many years. Maths is also one of the few subjects which will be tiered under the new system.

These assessment designs represent our first involvement in the process. Then there are the exam boards, whose responsibility it is to take the Department’s content, and our regulatory conditions and guidance, and craft subject specifications that meet our rigorous accreditation standards. This process is ongoing for those subjects being first taught in September 2016 and 2017 and there is much work still to do. But for GCSE maths, along with English and English literature, we’ve already found the right formula and pupils started studying these new subjects earlier this month.

I would like to use this speech in particular to reflect on the decisions we have taken on GCSEs, and maths specifically. For those of you who have followed progress, you will know it has not been as easy as two plus two, but I want to explain why we are confident we have got to the right answer as regulator. Before delving into the detail, it is perhaps worth setting out the decisions we have taken for GCSEs at the most general level. In summary:

  1. Assessment will be mainly by exam, with other types of assessment used only where they are needed to test essential skills. Maths was previously 100% exam and this remains the same.
  2. There will be new, more demanding content, which has been developed by the Department in consultation with us and the exam boards. This is particularly relevant for GCSE maths.
  3. Courses will still be designed for two years of study – but they will no longer be divided into different modules and students will take all their exams in one period at the end of their course.
  4. Exams can only be split into ‘foundation tier’ and ‘higher tier’ if one exam paper does not give all students the opportunity to show their knowledge and abilities. Maths is one of the few subjects that remain tiered. This is because manageable assessments cannot be designed that would both allow students at the lower end of the ability range to demonstrate their knowledge, skills and understanding and stretch the most able students. The examinations would inevitably include some questions that would be too simple and others that would be too challenging for significant numbers of students.
    The two tiers are focused on grades 1-5 and 4-9 so there is an overlap. To help make sure that the level of attainment indicated by grades 4 and 5 is consistent, regardless of the tier for which a student is entered, an awarding organisation must ensure that at least 20% of the marks available in the assessments for each tier are made available through questions that are common to both tiers. These questions must be targeted at a level of difficulty consistent with grades 4 and 5. The idea is that the boards use the performance on the overlap marks during awarding to try to ensure that the grade boundaries for grades 4 and 5 on each tier are at the same standard and there is no easier or harder route to the overlap grades.
    Some of you will recall that up until 2006 there was a three-tier system. Within that system the lower (Foundation) tier could only lead to a Grade D or below. As a result, the 30 per cent of the age cohort entered for this tier were pre-destined, in the eyes of many, to “fail” by getting below a C. This was of course changed so that all students were given the opportunity to achieve at least a C.
  5. Returning to my list of general decisions, we are considering allowing resit opportunities each November in English language and maths only. This would be consistent with the government’s aim for every student to be able to achieve a GCSE in maths.
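The tier-overlap rule described under point 4 (at least 20% of the marks on each tier coming from questions common to both tiers) can be illustrated with a short check. This is only a sketch: the `Question` type, the function names and the example papers are hypothetical and are not taken from any exam board's specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Question:
    marks: int
    common: bool  # True if the question also appears on the other tier


def common_mark_share(paper: list[Question]) -> float:
    """Fraction of a tier's available marks carried by common questions."""
    total = sum(q.marks for q in paper)
    if total == 0:
        raise ValueError("paper has no marks")
    return sum(q.marks for q in paper if q.common) / total


def meets_overlap_rule(paper: list[Question], minimum: float = 0.20) -> bool:
    """Check the 'at least 20% common marks' requirement for one tier."""
    return common_mark_share(paper) >= minimum


# Invented example: an 80-mark foundation paper with 20 common marks.
foundation = [Question(4, False)] * 15 + [Question(5, True)] * 4
share = common_mark_share(foundation)
ok = meets_overlap_rule(foundation)
```

The same check would be run separately on the higher tier paper, since the requirement applies to each tier.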

Importantly, we are considering and consulting on assessment arrangements for each new GCSE subject independently. In the case of maths, strengthened mathematical problem solving is a key feature of the reformed qualification, as is also the case for AS and A level maths, which will be introduced for first teaching in 2017. The Department’s subject content makes clear that mathematical problem solving is not just for the highest achieving candidates. It is a core part of mathematics which can and should be accessible to the full range of candidates. In precise terms, this means that students studying the new qualification should be able to:

  • translate problems in mathematical or non-mathematical contexts into a process or a series of mathematical processes;
  • make and use connections between different parts of mathematics;
  • interpret results in the context of the given problem;
  • evaluate methods used and results obtained; and,
  • evaluate solutions to identify how they may have been affected by assumptions made.

This is new territory for maths at this level. There is a great deal of intangibility surrounding maths assessment and there are many views and solutions put forward about how this can and should be done. It was therefore vital that we took the time to engage with the exam boards to reach a common understanding of how problem solving could and should be assessed before any actual assessments were designed. Consistency is always key.

In addition to this broader content, the exam boards had the challenge of raising the expected level of difficulty of the qualification, and doing so in a way that continued to provide appropriate differentiation across the full range of candidate ability.

This has been achieved through increased content which I have already mentioned. But many would also argue that full linearisation in itself can increase demand, particularly for certain groups of students who may work better by learning in bite-sized chunks.

It has been vital, through the process, not to lose sight of the fact that GCSE maths is essentially a whole cohort subject. It needs to accommodate the mathematicians, scientists and engineers of the future, although they’ll inevitably go on to study maths at a higher level; along with those who find mathematics incredibly challenging and are unlikely to further their study beyond GCSE. One of the challenges through this process has been to deliver something that provides the opportunity for students to stretch themselves across this range, culminating in assessments that fairly differentiate between students of different abilities while simultaneously encapsulating the desired changes to content.

Both of these issues – the introduction of new elements such as mathematical problem solving and the need to differentiate between students of different abilities – were addressed as part of our accreditation of each exam board’s specification. This was by no means a simple tick-box exercise. With the number of moving parts here and the multi-faceted nature of assessment standards – essentially the level of difficulty – defining the assessment standard in any meaningful way in advance was very challenging.

To prepare for this process, we undertook a degree of pre-accreditation preparation like never before. This included developing industry standards such as more precise definitions of the assessment objectives, publishing rules about the design of the assessment, and, as part of the development of the Department’s subject content, mapping that content to tiers of assessment. This was all aimed at reducing the potential differences in the assessment standards between specifications, but it was not intended to limit valid differences in how each exam board addressed our requirements.

Even after this initial work, none of the boards’ initial submissions – which include the specifications, sample assessments and, for the first time, an assessment strategy – were accredited by our panel of six independent experts on first submission. Some submissions were considered too demanding, the language used in others too complicated. But eventually each board developed an appropriate assessment strategy that we believe can be implemented in a way that ensures they can continue to meet our regulatory requirements over time.

So, if that is what accreditation can do, and did, what doesn’t accreditation do?

With my vocational qualifications hat on, I often talk about the ‘life-cycle’ of qualifications: the idea, which is central to our regulatory approach, that every stage of the design, delivery and award of a qualification has to be right for the qualification to be sufficiently valid. It’s necessary for a qualification to be well-designed, but it’s not sufficient. A qualification that looks good at accreditation could still be awarded wrongly. So while up-front accreditation is sometimes necessary – as it is with GCSEs and A levels – it’s not the end of the story – though it’s surprising how often others assume that it is, and that accreditation is some sort of guarantee.

In particular, what accreditation cannot do – and wasn’t designed to do – is say for certain that the level of expected difficulty across the qualifications from the different boards was precisely the same. Accreditation is not a comparability exercise. So in order to reassure ourselves, the exam boards, students and other stakeholders on this matter, we undertook a comprehensive research programme into the boards’ sample exam papers earlier this year. The work comprised four strands, to varying extents innovative in nature. The programme involved 35 mathematicians making 35,000 judgements of relative question difficulty in strand 1; 3,865 students from 30 different schools sitting sample papers and 50 markers scoring them in strand 2; 33 maths experts and more students in strand 3; and another five maths experts in strand 4.

Our findings were in some ways surprising but also incredibly useful. There were, indeed, differences between the approaches of the exam boards – we knew that at the point of accreditation – but this research allowed us to evaluate in a more detailed way the consequences of those differences. One key finding was that, even having taken account of the students piloting the sample assessments not having been taught the new curriculum, the assessments might have failed to effectively differentiate between students across the range of abilities. In response to these findings, exam boards submitted fresh sample papers to us, which we evaluated qualitatively and quantitatively. After some final adjustments we judged the boards’ final sample papers to be very similar, in terms of expected difficulty, and also likely to differentiate across the full ability range of students. They meet our requirements.

But of course our work has not ended there.

We are looking ahead to 2017, and thinking about the checks and balances we will put in place beforehand to make sure that the real exams are able to effectively assess students at all levels of ability and differentiate between them. We will provide more details in the period up to summer 2017.

As I have explained, the methodologies we applied to evaluating the different GCSE maths specs were to a great extent innovative. When we released those results earlier this year we said that we would consider the benefits of undertaking similar research for other new GCSEs or A levels prior to exam board specifications being accredited.

Given the amount of change in GCSE science we more recently decided that it would be appropriate to undertake some supplementary research looking at the various sample papers (for biology, chemistry and physics) submitted to us as part of accreditation. This work ran parallel to accreditation and we are looking at how that evidence may inform our accreditation decisions.

In summary, we asked just over 100 science teachers to judge the expected difficulty of more than 6,000 questions taken from the exam boards’ sample papers and the question papers they used in 2014. Each teacher was asked to compare pairs of questions and decide which of the two their students would find more difficult to answer. They were asked to do this for 1,000 pairs of questions, and the questions and teachers were split by subject – biology, chemistry and physics – to make meaningful comparisons.
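To give a sense of how pairwise "which is harder?" judgements can be turned into a difficulty scale, here is a minimal sketch using the Bradley-Terry model, one common way of analysing paired comparisons. The speech does not say which model was used in this research, so the choice of model is an assumption, and the question labels and judgements below are invented.

```python
from collections import defaultdict


def bradley_terry(judgements, items, iterations=200):
    """Estimate a relative difficulty strength per item.

    `judgements` is a list of (harder, easier) pairs of distinct items.
    Uses the standard iterative update for the Bradley-Terry model and
    normalises the strengths so that they sum to 1.
    """
    wins = defaultdict(int)         # times an item was judged the harder one
    pair_counts = defaultdict(int)  # number of comparisons per unordered pair
    for harder, easier in judgements:
        wins[harder] += 1
        pair_counts[frozenset((harder, easier))] += 1

    strength = {item: 1.0 for item in items}
    for _ in range(iterations):
        updated = {}
        for i in items:
            denom = 0.0
            for pair, n in pair_counts.items():
                if i in pair:
                    (j,) = pair - {i}
                    denom += n / (strength[i] + strength[j])
            updated[i] = wins[i] / denom if denom > 0 else strength[i]
        total = sum(updated.values())
        strength = {i: s / total for i, s in updated.items()}
    return strength


# Invented data: three questions, five judgements per pair.
items = ["Q1", "Q2", "Q3"]
judgements = (
    [("Q3", "Q1")] * 4 + [("Q1", "Q3")] * 1
    + [("Q3", "Q2")] * 3 + [("Q2", "Q3")] * 2
    + [("Q2", "Q1")] * 3 + [("Q1", "Q2")] * 2
)
strengths = bradley_terry(judgements, items)
ranking = sorted(strengths, key=strengths.get, reverse=True)
```

In this toy data set `ranking` orders the questions from hardest to easiest. A real analysis would also need to handle questions that are never judged the harder one and to report the uncertainty in each estimate.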

When we announced this research back in August we said we were uncertain about what we would find and how meaningful the data would be. We have analysed data from these judgements to provide evidence to the accreditation panel about the expected difficulty of the sample papers. As it turns out, the evidence has proved not to be as beneficial as it was with GCSE maths. Nevertheless, we will now continue to look at this work from a purely research point of view. We will evaluate it further and consider changes that might make the approach, and subsequent findings, beneficial when looking at other subjects.

Those studying the new GCSE maths from this year will have the opportunity in turn to study the reformed A level from 2017. Those of you who have been following the reform process closely will know that new AS and A levels in maths and further maths were due to be taught from 2016. However, we advised Government that it would be beneficial to have more time to be sure that the new A level content requirements, particularly those relating to problem solving, were sufficiently well specified and commonly understood by each exam board, and that the new A and AS levels would sit sensibly alongside the new GCSE. To that end, we convened an A level Mathematics Working Group in March this year to provide expert advice in a number of areas, specifically mathematical problem solving, modelling and the use of large data sets in statistics.

This working group will report later this autumn and its work will help to achieve three key aims:

  1. Support finalisation of assessment objectives and weightings for AS/A level maths and further maths.
  2. Inform the development of our regulations, including Conditions, Requirements and Statutory Guidance, such as the technical interpretations of the assessment objectives.
  3. Support the development of high quality assessments, particularly in relation to mathematical problem solving.

Alongside the report from the working group we will be launching a consultation on assessment objectives for A level maths to which I would encourage you all to respond.

Returning to GCSEs, perhaps the most obvious change, certainly for those outside the education system, will be the introduction of a new grading scale. Reformed GCSEs will from 2017 be graded from 9 to 1, instead of A* to G.

As a result, students taking GCSEs over the transition period will receive a mixture of 9 to 1 grades, where they are taking reformed subjects, and A* to G grades, for other subjects. The reasons for this new scale are two-fold. Firstly we felt it was important that people could recognise immediately that these are new qualifications. I have already talked to you about the many changes that have been introduced. However, as the qualifications are still called GCSEs there could be confusion over whether students have studied the old or the reformed qualifications. The new grades will show this straight away.

Secondly, you’ll notice that there is now an extra grade on the scale: we’ve gone from 8 possible grades to 9. This has been done to allow greater differentiation at the top end of the scale, which was one of the policy objectives of the reform programme. We want the very highest achieving students to be able to demonstrate their full potential, while also still ensuring that the scale covers the full range of abilities.

I’ve nearly finished but I do want to touch on standard-setting. When there is a change in the design of a qualification, there is always the potential for students sitting new qualifications to be disadvantaged relative to earlier or later cohorts. The material is less familiar to teachers and students and the support materials are less developed than when a specification is mature. This may have a material impact on student performance around this transition. To protect against any adverse effects, we have therefore committed to use a statistical method, known as comparable outcomes, in 2017. We consulted on this principle last year, and received broad support for it.

Assuming a comparable cohort, this means that:

  • broadly the same proportion of students will achieve a grade 4 and above on the new scale as currently achieve a grade C and above;
  • broadly the same proportion of students will achieve a grade 7 and above as currently achieve a grade A and above;
  • the bottom of grade 1 will be aligned with the bottom of grade G; and,
  • roughly the top third of students gaining a current grade C and bottom third of grade B will be awarded a grade 5 on the new scale.
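The basic idea of anchoring new grade boundaries to previous outcomes can be sketched in a few lines. This is a deliberately simplified illustration and not the procedure Ofqual or the exam boards actually follow: the function name, the mark distribution and the target proportions (70% at grade C and above, 20% at grade A and above) are all invented for the example.

```python
def boundary_for_proportion(marks, target):
    """Return the mark whose 'at or above' share is closest to `target`.

    `marks` holds one total mark per student on the new exam; `target`
    is the share of students who reached the anchor grade previously.
    """
    n = len(marks)
    candidates = sorted(set(marks))
    return min(
        candidates,
        key=lambda m: abs(sum(x >= m for x in marks) / n - target),
    )


# Invented cohort: 100 students with marks 1 to 100.
marks = list(range(1, 101))

# Hypothetical previous outcomes used as anchors.
previous_share = {4: 0.70, 7: 0.20}  # new grade -> share at old C+ / A+

boundaries = {
    grade: boundary_for_proportion(marks, share)
    for grade, share in previous_share.items()
}
```

With these made-up figures the grade 4 boundary falls at 31 marks and the grade 7 boundary at 81 marks, so that 70% and 20% of the cohort respectively sit at or above them.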

Although statistical evidence, particularly from the performance of the cohort in Key Stage tests, has been used as a central source of evidence in awarding for many years and as a regulatory tool more recently, this approach places a greater reliance on statistics than would otherwise normally be the case. It is, however, the best approach when there is so much change in the system, to protect the interests of students and to retain as much stability as is possible at a time of change.

In future we will also have the potential to draw on evidence provided by the new National Reference Test in the awarding of GCSEs. We recognise that sole reliance on statistical evidence in the way I’ve described on an ongoing basis would impact on the extent to which genuine changes in the performance of students can be reflected in the grades received by students. We are therefore looking at ways that these changes in performance could be reflected without leaving the system vulnerable to grade inflation. In future, a key part of this evidence will come from the National Reference Test, which we are now rolling out. Each year, around 300 schools in England will take part in the tests. At each school, around 30 students will take an English paper and around another 30 students will take a maths paper. The tests will take around an hour to complete and will be administered and invigilated by the National Foundation for Educational Research.

We will publish national test results towards the end of August each year. Over time, the tests have the potential to provide a valuable additional source of information that may be taken into account when GCSEs are awarded.

Field trials for the test are being conducted next week, while a full-scale trial in March 2016 will allow us to check processes ahead of the first live test in 2017.

Finally, I would like to provide an update on where we are with accreditation. We are currently looking at submissions from exam boards for 20 new GCSEs and 11 A levels for first teaching in September next year. We aim to have these available to schools as soon as possible; however, our priority has to be to ensure that their specifications meet our standards. We have today issued a new postcard explaining our accreditation process, along with a more detailed blog. As with GCSE maths last year, it is not unusual for specifications not to be accredited on first submission, and the exam boards are responding to our feedback in those circumstances. We will provide updates on those specifications that have been accredited on our website.

That is all I wanted to say today, but I hope you will understand that it is only a partial set of the issues we, as the qualifications regulator in England, are tackling. There is certainly a lot of change at present; but as I hope is clear, we are thinking about it very carefully and taking considered, evidence-based decisions. I am confident that the sum total of the work that we and many others are doing will lead to better qualifications. And I am confident that what we’re doing, in every way, all adds up.

Thank you.