Research and analysis

National Reference Test annual statement 2025

Published 21 August 2025

Applies to England

Ofqual has today, Thursday 21 August, published the results of the National Reference Test (NRT) in 2025. The National Foundation for Educational Research (NFER) Results Digest shows the results alongside those from previous years.

Having analysed the results from both the English and maths NRT, and following application of our established principles to support our judgments, we decided not to require an adjustment to the awarding of either GCSE English language or GCSE maths for summer 2025.

Background

In February and March 2025, almost 14,000 year 11 students from more than 350 schools in England took the National Reference Test, which is administered by NFER, in either English or maths. The tests are designed to provide evidence of the performance of 16-year-old students in English language and maths and were introduced to provide additional evidence to support the awarding of GCSEs in these subjects. The first live NRT, taken in 2017, was benchmarked against the first awards of the reformed GCSEs in English language and maths, and subsequent tests compare the performance of students with that of students in previous years.

Results are reported at 3 grade boundaries – grade 7, grade 5 and grade 4. Results are reported as expected percentages of students achieving those grades (and above) based on changes in performance on the NRT. This overview focuses on grades 7 and 4, since these would be the points through which an adjustment would most likely be applied operationally.

Results for 2025

The results for the 2025 tests are outlined below. Because these tests use a sample of students, we report ‘confidence intervals’ around the results. These confidence intervals reflect the fact that, had we taken a different sample of students, we would probably have obtained a slightly different result. The results show the changes in the expected percentage of students at and above the grade 7 and grade 4 boundaries, compared with 2017.
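
The idea behind these confidence intervals can be illustrated with a short simulation. The sketch below is not the NFER methodology. It assumes a simple random sample from a hypothetical population in which 70% of students are at grade 4 or above, with a hypothetical sample size of 1,000, and the names `simulate_estimates`, `TRUE_RATE`, `SAMPLE_SIZE` and `REPLICATES` are illustrative only. The real NRT samples schools and then students, so its intervals are calculated differently.

```python
import random
from statistics import NormalDist, mean, stdev

# Hypothetical values chosen for illustration; not taken from the NRT.
TRUE_RATE = 0.70      # assumed true share of students at grade 4 and above
SAMPLE_SIZE = 1_000   # assumed number of students in each sample
REPLICATES = 500      # number of alternative samples to draw


def simulate_estimates(true_rate, sample_size, replicates, seed=2025):
    """Return the estimated percentage from each simulated sample."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(replicates):
        # Count the simulated students who reach the grade in this sample.
        hits = sum(rng.random() < true_rate for _ in range(sample_size))
        estimates.append(100 * hits / sample_size)
    return estimates


estimates = simulate_estimates(TRUE_RATE, SAMPLE_SIZE, REPLICATES)

# Spread of the estimates across samples (an empirical standard error).
standard_error = stdev(estimates)

# A 95% interval spans roughly 1.96 standard errors either side.
z_95 = NormalDist().inv_cdf(0.975)
half_width = z_95 * standard_error

print(f"mean estimate: {mean(estimates):.1f}%")
print(f"approximate 95% interval half-width: {half_width:.1f} percentage points")
```

Each simulated sample gives a slightly different percentage even though the underlying population never changes, which is the variation a confidence interval is intended to express.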

The NRT results are compared with 2017 because this is the baseline year of the NRT and, with the exception of 2021 and 2022, the year with which we have previously compared results. The current versions of GCSEs in English language and maths were first awarded in 2017, and we know that when new assessments are introduced, performance typically dips in the first year and then improves over subsequent years. This is known as the sawtooth effect. When considering any changes in performance compared with 2017, we take this effect into account, as we did when making decisions about using the NRT in 2019 and in 2020. For example, were we to see a significant increase in NRT performance compared with 2017, we would need to consider whether this reflected a genuine change in attainment rather than the sawtooth effect.

The results for 2025 show that, in English, performance is statistically significantly lower at grade 4 (at the 0.05 level of significance) when compared with 2017.[footnote 1] There is no statistically significant difference at grade 7 when compared with 2017.

In maths, performance is statistically significantly higher at grade 7 (at the 0.05 level of significance) when compared with 2017. There is no statistically significant difference at grade 4.

Expected percentage of students at each grade (with associated confidence intervals) based on NRT 2025

Subject | Year | Grade 4 and above (%) | Grade 7 and above (%)
English language | 2017 | 69.9 (68.3-71.5) | 16.8 (15.6-18.0)
English language | 2025 | 67.0 (65.3-68.6) | 18.4 (17.1-19.6)
Maths | 2017 | 70.7 (69.3-72.1) | 19.9 (18.6-21.2)
Maths | 2025 | 70.8 (69.5-72.2) | 22.2 (21.0-23.4)
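
As a rough cross-check, the comparison with 2017 can be approximated from the figures in the table alone. The sketch below assumes, without confirmation from the source, that the published intervals are 95% confidence intervals, that the 2017 and 2025 samples are independent, and that a normal approximation is adequate. The names `RESULTS`, `standard_error` and `compare_years` are illustrative. The NFER analysis is more sophisticated, so the outputs here are indicative only.

```python
from math import sqrt
from statistics import NormalDist

NORMAL = NormalDist()
Z_95 = NORMAL.inv_cdf(0.975)  # about 1.96; assumes the intervals are 95% CIs

# (estimate, lower, upper) in percent, copied from the table above.
RESULTS = {
    ("English language", "grade 4"): {2017: (69.9, 68.3, 71.5), 2025: (67.0, 65.3, 68.6)},
    ("English language", "grade 7"): {2017: (16.8, 15.6, 18.0), 2025: (18.4, 17.1, 19.6)},
    ("Maths", "grade 4"): {2017: (70.7, 69.3, 72.1), 2025: (70.8, 69.5, 72.2)},
    ("Maths", "grade 7"): {2017: (19.9, 18.6, 21.2), 2025: (22.2, 21.0, 23.4)},
}


def standard_error(lower, upper):
    """Back out a standard error from the width of an assumed 95% interval."""
    return (upper - lower) / (2 * Z_95)


def compare_years(baseline, current):
    """Return (difference, two-sided p-value) for current minus baseline."""
    diff = current[0] - baseline[0]
    # Assumes independent samples, so the variances add.
    se_diff = sqrt(standard_error(*baseline[1:]) ** 2
                   + standard_error(*current[1:]) ** 2)
    z = diff / se_diff
    p_value = 2 * (1 - NORMAL.cdf(abs(z)))
    return diff, p_value


for (subject, grade), years in RESULTS.items():
    diff, p = compare_years(years[2017], years[2025])
    print(f"{subject}, {grade}: change {diff:+.1f} points, p = {p:.3f}")
```

Under these assumptions the approximation agrees with the pattern reported in this statement: the English grade 4 and maths grade 7 differences fall below the 0.05 threshold but not the 0.01 threshold, and the other two comparisons do not reach the 0.05 level.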

Using NRT evidence in awarding 

The NRT provides an additional source of evidence to support the awarding of GCSEs in English language and maths. Where there is a statistically significant difference in performance, Ofqual can require exam boards to adjust the grade standards when setting GCSE grade boundaries.

To help us interpret the NRT results, we carry out additional analyses of a number of factors that could affect the results of the NRT, beyond genuine, generalisable changes in the performance of the cohort. The analyses differed slightly this year due to the absence of key stage 2 (KS2) data, which is typically used as a measure of prior attainment when considering any potential bias in the sample. In its place this year, we considered the school-level recent prior achievement of those participating in the NRT compared with the national profile. We also consider the findings from the student survey in relation to student motivation and students’ views of the importance of the NRT and GCSEs in English language or maths.

In considering the evidence from the NRT, we aim to make sure that: 

  • our decisions are consistent over time and between subjects, regardless of the direction of any change 
  • we take account of contextual evidence from the student survey and other sources, and that we act cautiously in making any adjustments to grade standards 
  • we document and publish the reasons for our decisions

School sample

In both English and maths, the mean GCSE performance of participating schools showed a slight under-representation of schools with the very lowest and the very highest recent prior achievement, compared with the national profile.

This is not, in itself, problematic, and similar variations are seen in the samples in previous years. In both NRT subjects, analysis shows that any bias relative to the national profile of schools is relatively stable across the iterations of the NRT, and modelling these effects shows that the subtle differences in profile compared with 2017 are not sufficient to explain any differences in performance on the tests.

Student survey

Immediately after taking the NRT, students also take a short survey to capture, among other things, their NRT-specific test motivation, preparation for GCSEs, and motivation, feelings and attitudes about learning the relevant GCSE subject. The aim of the survey is to provide context for any changes in NRT results. The survey was introduced in 2017, and 2025 saw the ninth administration of the survey. 

Compared with their counterparts in 2017, students taking the NRT in both subjects in 2025 reported lower perceived importance of the NRT, greater indifference to their own NRT performance and less preparation for the NRT. Those in maths, but not those in English, also reported less test-taking effort.  

Modelling that draws on the historical relationship between self-reported test motivation and test performance suggests that the decrease in test motivation was not a major contributor to the statistically significantly lower performance in English at grade 4 this year compared with 2017.

Interpreting the results

English

The statistically significant difference at grade 4 compared with 2017 could be interpreted as suggesting a small downwards adjustment to grade standards this summer would have been appropriate. Further, our analysis of the sample suggests that the small differences in the profile of the sample do not account for the lower results in 2025, and neither does the slightly lower test motivation. 

We have always been clear, however, that we would be cautious in using evidence from the NRT to inform awarding. For us to make an adjustment this summer we would need to be confident that any lower performance indicated by the results of the NRT reflected a genuine change in the attainment of the GCSE cohort. This year’s English NRT results also show a potential reversal in the downward changes recently observed, particularly when considering the evidence at grade 5 and above, as reported in the NFER Results Digest. In addition, given the high-stakes nature of GCSEs, we would expect differences to be significant at the 0.01 level before making an adjustment.

As noted above, our decisions seek to be consistent over time, taking into account relevant context. The results of the NRT this year are comparable with previous years where no adjustment has been applied and there is no contextual evidence to suggest this year’s results should be interpreted differently.

On balance, we decided that there was not sufficient evidence of a genuine decline in performance, relative to 2017, such that we should make a downwards adjustment to GCSE English language this summer. In subsequent years, we will continue to make decisions informed by the principles outlined above and any trends over time.

Maths

The statistically significantly higher result at grade 7 compared with 2017 could be interpreted as suggesting a small upwards adjustment to grade standards would have been appropriate at this grade this summer. Further, our contextual analyses suggest that the small differences in the NRT sample, and the student survey results, are unlikely to fully account for the change in performance.

We have considered, however, that the 2025 NRT outcomes in maths are similar to those seen in 2024 and 2019, and that the change compared with 2017 is only significant at the 0.05 level. In 2019, we decided not to make an adjustment because we considered that the increase relative to 2017 was likely due to the sawtooth effect and the improvements that we might expect in the first few years that a qualification is available. To make an adjustment now would appear both to reward a level of performance that we have previously decided not to reward (because we were not confident that it reflected a genuine improvement in attainment) and to be inconsistent with our decision last year. Furthermore, as noted above, given the high-stakes nature of GCSEs, we would expect differences to be significant at the 0.01 level before making an adjustment.

Taking into account our principles of consistency (that our decisions should be consistent between years) and acting judiciously (that we will be cautious in applying any adjustment), we therefore decided not to make an adjustment in maths at grade 7 this summer. In subsequent years, we will consider any trends and apply the principles outlined above, when deciding whether any change in the NRT reflects a genuine improvement in attainment.

  1. There are different levels of statistical significance. At the 0.05 significance level, a difference this large would be expected to arise by chance alone no more than 1 time in 20 if there had been no real change in performance; at the 0.01 level of significance, that reduces to 1 time in 100.
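
For a two-sided comparison under a normal approximation (an assumption made here for illustration, not a description of the NFER method), these significance levels correspond to thresholds on the standardised difference, that is, the difference divided by its standard error. The name `critical_value` is illustrative.

```python
from statistics import NormalDist


def critical_value(significance_level):
    """Two-sided critical z-value for a given significance level."""
    return NormalDist().inv_cdf(1 - significance_level / 2)


for level in (0.05, 0.01):
    print(f"significance level {level}: |z| must exceed {critical_value(level):.2f}")
```

The thresholds are about 1.96 at the 0.05 level and about 2.58 at the 0.01 level, so a difference needs to be larger relative to its standard error to meet the stricter level.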