I’ve spent most of the last twenty years working in and around English schools. I suspect that most children learn more now than they did in 2005. But can we be sure?

Few sources of data cover the whole period reliably. National exams suffered grade inflation for many years. SATs (at age 11) and GCSEs (at 16) were both redesigned during this period.

So I’ve been looking at international test results. They have three strengths:

  • They allow comparison to other countries
  • They cover the whole period with few changes (caveats follow!)
  • Teachers aren’t incentivised to teach to the test – so they are a more consistent measure of learning than national exams. (Daniel Koretz has shown that when a high-stakes test changes, students get better at the new test and worse at the old test (2008, 243-7).)

This post uses international tests to answer three questions:

  1. Are students in English schools learning more?
  2. If so, when did things improve?
  3. How robust are these findings?

I want to answer clearly and succinctly – and do justice to the limits and complexity of the data. I’ve erred towards clarity in answering Questions 1 and 2, saving the many caveats for Question 3.

Question 1: Are students in English schools learning more than they were in 2005?

Yes. We see substantial improvement in maths, some improvement in reading, and a mixed picture in science.

The chart below shows change in international test results over the last twenty years, colour-coded by subject.*


We see:

  • Substantial improvement in maths:
    • PISA (the Programme for International Student Assessment) found significant improvements between 2006 and 2018. In 2022, scores dropped to around where they were in 2006 (perhaps related to Covid – we discuss this below).
    • TIMSS (the Trends in International Mathematics and Science Study) found significant improvements between 2003 and 2023 for Year 5 and Year 9 pupils.
    • PIAAC (the Programme for the International Assessment of Adult Competencies – not shown) found a big increase in numeracy for 16-24 year olds between 2011 and 2021.
  • Some improvement in reading/literacy:
    • PISA found gradual improvement between 2006 and 2018; in 2022, scores dropped to where they were in 2006 (again, perhaps related to Covid).
    • PIRLS (the Progress in International Reading Literacy Study) found significant improvements between 2006 and 2021 for Year 5 pupils.
    • PIAAC found a big increase in literacy for 16-24 year olds between 2011 and 2021.
  • Mostly decline in science:
    • PISA found a decline between 2006 and 2022.
    • TIMSS found that performance fell somewhat for Year 9 and rose for Year 5 between 2003 and 2023.

The overall impression is of a general upward trend. But the number of tests involved, the noisiness of the data, and the impact of Covid muddy the picture.

When we compare England’s results to those of other countries, however, the achievement seems clearer. So many countries are involved that exercises in comparison rapidly become exhausting for the reader. I have limited myself to four telling examples:

  1. England is:
    • Top ten (of 81 countries/systems) for maths, reading and science (PISA)
    • Sixth for Year 9 maths (the highest outside East Asia) and ninth for Year 5 maths (of 44 countries in Year 9 and 58 in Year 5; TIMSS).
    • Third for reading (of 42 countries/systems; PIRLS).
  2. England’s reading, maths and science scores now significantly exceed those of Wales, Scotland and Northern Ireland (PISA).** (In 2012 Scotland exceeded England for maths and reading; Northern Ireland came close.)
  3. England exceeds Finland in reading (PISA & PIRLS) and maths (PISA, TIMSS Year 5 & Year 9).*** I single this out because, for much of this period, “Finland” was the answer, no matter what the question.
  4. Numeracy and literacy improved more for English 16-24 year olds between 2011 and 2021 than in any other participating country (PIAAC).

Here’s an illustration – the PISA Maths scores for each of the countries mentioned above.


In 2006, England lay amidst the home nations, and far below Finland. It now comfortably surpasses all of them.

(East Asian countries – notably Singapore, Japan, Taiwan and South Korea – consistently top these tables. They are very different from England. I wrote about how Singapore succeeds here.)

We can haggle over the details (and we will, in answering Question 3). But at some point in the last twenty years, English students have, on average, come to learn more than earlier cohorts – certainly in maths, probably in reading, perhaps in primary science – and more than children in most other countries.

Question 2: When did English schools improve?

Politically, 2005-2025 saw a degree of turbulence: England was ruled by three political parties, seven prime ministers and fourteen education ministers. Ministers’ priorities and skills varied. Pinpointing when these improvements took place is crucial if we are to understand why they happened.

First, let’s look at English results against OECD averages.


There is a sharp divergence between English scores and OECD averages in reading and maths in the late 2010s. Even where all scores fall post-Covid, the gap between English scores and international averages continues to grow. (We can see the same effect in the chart showing PISA maths scores above.)

Next, here’s the chart shown earlier of international test results, with a couple of tweaks. I’ve dropped science, given the limited improvement we see there. I’ve then highlighted (with a thicker line) each time the scores increase by ten or more points. (This is a jump large enough to be statistically significant, and one which John Jerrim suggested was worth investigating when I interviewed him.)


Again, it’s noisy. But it’s now even clearer that the improvements in reading and literacy are solid but not dramatic. We do see substantial improvements in all four international maths tests between 2011 and 2023. (We also see apparent improvements in maths in the early 2000s, but these are less certain: Year 9 TIMSS results rise and then fall, and we don’t see an equivalent improvement in PISA.)
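(For the curious, the highlighting rule is mechanical enough to express in a few lines. Here’s a minimal sketch in Python – the scores below are placeholder values for illustration, not the actual published results.)

```python
# Flag cycle-on-cycle rises of ten or more points in a test series.
# Placeholder values for illustration only – not actual England scores.
scores = {2003: 498, 2007: 513, 2011: 507, 2015: 518, 2019: 515, 2023: 525}

years = sorted(scores)
for prev, curr in zip(years, years[1:]):
    rise = scores[curr] - scores[prev]
    if rise >= 10:
        print(f"{prev} -> {curr}: +{rise} points (highlight this segment)")
```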

We can also see a fall in many scores in the first tests after Covid. As we saw in the comparison to international averages, however, Covid seems to have done less damage to English students than to students in many other countries. For example, in PIRLS, the international average fell 19 points between 2016 and 2021; England’s score fell by just 1 point. PISA finds something similar (we will discuss Covid-related caveats below).

Straight answers are elusive, but it looks like the biggest improvements came in the 2010s.

Question 3: How robust are these findings?

No dataset is ever as robust as it first appears, and international, inter-temporal comparisons are especially tricky. Here are four particular reasons to be wary of any claim made using international test data – including mine:

1. England suffers from very low response rates.

Schools and students are randomly selected to participate in international tests. Many avoid this honour. John Jerrim (2021) enumerates many contributing factors; cumulatively, they mean that around 40% of the 15-year-olds selected to complete PISA tests don’t. Many of these students would score poorly (they miss the test because they are absent from school, or aren’t in a mainstream setting, for example). This means that PISA results reflect the actual student population imperfectly. Jerrim’s model suggests scores in England, Scotland and Wales are inflated by 11-14 points.
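To see how this kind of non-response inflates an average, here’s a toy simulation – my own illustration with made-up numbers, not Jerrim’s model. Scores are drawn around a true mean of 500, and weaker students are made more likely to miss the test.

```python
import random

random.seed(1)

# Toy population on a roughly PISA-like scale (true mean 500, SD 100).
population = [random.gauss(500, 100) for _ in range(100_000)]

# Illustrative assumption: the weaker the student, the likelier they are
# to miss the test; calibrated so roughly 40% don't sit it.
def sits_test(score):
    p_absent = min(0.9, max(0.1, (600 - score) / 250))
    return random.random() > p_absent

respondents = [s for s in population if sits_test(s)]

true_mean = sum(population) / len(population)
observed_mean = sum(respondents) / len(respondents)
print(f"Response rate:  {len(respondents) / len(population):.0%}")
print(f"True mean:      {true_mean:.0f}")
print(f"Observed mean:  {observed_mean:.0f}")  # higher, as absentees skew low
```

The observed mean comes out noticeably above the true mean, because the missing students are disproportionately low scorers – the same mechanism Jerrim models far more carefully.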

2. Covid caused chaos.

It’s unsurprising – given the impact of Covid on trips, GCSEs, and the idea of going to school at all – that international tests were also disrupted. The 2021 PISA cycle was delayed until 2022. Even then, schools in many countries were experiencing serious disruption when the tests took place, and the data was even less complete than normal. This makes comparing post-Covid results particularly tough.

3. PISA switched from paper to digital testing.

The indefatigable John Jerrim (2016) also examined the pilot digital tests to show that this affected results – often in non-obvious ways. For example, can you guess who does better on digital tests (as opposed to paper)?
  • Girls or boys
  • Brazilian, American, or Chinese students
  • Middle- or working-class students
  • Students with more or less access to IT
Maybe you guessed boys. But it’s harder to predict that students from Brazil and the US would do better while those from Shanghai would do worse; that the gap in results attributable to social class would narrow; and that these variations aren’t linked to students’ access to IT. This doesn’t bear directly on how well English students did – but it does make comparisons across the period less robust than they might otherwise be.

4. Motivation matters.

I claimed above that a low-stakes test is helpful because teachers don’t teach to it. But what if students don’t try at it either? Researchers tested the effect of an incentive – $1 per correct answer – for American and Chinese students taking PISA questions. In America, students offered money tried harder and got more answers right. In Shanghai, it made no difference (Gneezy et al., 2019). Experience suggests that the motivation and reactions of English students are nearer to those of American than of Chinese students. This would imply that English scores are slightly suppressed.

I add these caveats not to persuade you of anything, but to show that every confident statement about England’s performance on international tests should be qualified. Where do they leave us? As John Jerrim put it when I interviewed him – with admirable frankness for an academic – “It probably comes out in the wash.” For example, we’ve said that PISA results may be inflated UK-wide. Since this inflation is similar in each nation (11-14 points in each), the finding that England is doing better than the other three countries stands. In Jerrim’s words, “If you held a gun to my head, there probably has been a degree of improvement over time in maths.” He sees literacy results as more stable, but Tim Oates has argued that, since literacy and reading are declining internationally, maintaining a stable score is itself an achievement.

Conclusion

Student learning has improved in English schools over the last twenty years: significantly in maths; somewhat in reading. We see both absolute increases in scores and increasing divergence from peer countries.

This is not to say that there aren’t plenty of problems in English schools. We will address them in due course.

But it’s worth recognising what has improved and trying to explain why it has happened. This is the first in a series of posts and interviews tackling this question.


* England’s PISA results for 2003 were disallowed by the OECD due to low response rates; there’s a case that the 2000 results should also have been disallowed. Who took the test, and when it was taken, also changed between 2003 and 2006 (see Jerrim, 2013). For these reasons, I have started the PISA trend line with the 2006 results.

** With one exception: Scotland’s reading score is below England’s, but the difference is not statistically significant.

*** But not science (PISA).

References

Gneezy, U., List, J. A., Livingston, J. A., Qin, X., Sadoff, S., & Xu, Y. (2019). Measuring success in education: The role of effort on the test itself. American Economic Review: Insights, 1(3), 291-308.

Jerrim, J. (2013). The reliability of trends over time in international education test scores: Is the performance of England’s secondary school pupils really in relative decline? Journal of Social Policy, 42(2), 259-279.

Jerrim, J. (2016). PISA 2012: How do results for the paper and computer tests compare? Assessment in Education: Principles, Policy & Practice, 23(4), 495-518.

Jerrim, J. (2021). PISA 2018 in England, Northern Ireland, Scotland and Wales: Is the data really representative of all four corners of the UK? Review of Education, 9(3), e3270.

Koretz, D. (2008). Measuring Up: What Educational Testing Really Tells Us. Cambridge, MA: Harvard University Press.

You may also want to view the most recent national reports for PISA, PIRLS and TIMSS. Most of these reports only include the last few cycles of data, so to check, for example, PISA results for 2006, you’d need that particular cycle’s report.