The non-examination grade fiasco explained

Roger Porkess 23/08/2020

The examinations results season is always an anxious time. However, when they open their envelopes, students can usually be confident that their grades reflect how much they have learnt on their courses.

But not so this year. With the closure of schools in March and the consequent cancellation of examinations, the evidence on which grades are given was no longer available and the whole procedure fell apart.

Grades serve several purposes for students.

They show how much students have learnt and so are rewards for their hard work.
They are stepping stones for their futures, in many cases determining what they will do next in their lives.
They are recorded on certificates, which are often cherished and kept for many years.

Clearly the government had to make some arrangements so that those affected could continue their careers. However, ministers failed to think outside the box. Certainly students needed some information to present to universities, employers, sixth forms and so on, but this did not have to take the form of imitations of conventional grades. Schools could, for example, have been asked to describe potential university students as, say, strong, medium or marginal, and to provide further information if requested.

Instead the government decided to require schools (colleges included) and the examination boards to award all students estimated grades. This was so obviously fraught with difficulty that the idea should have been subject to detailed scrutiny before becoming policy. Instead it was announced and the regulatory authorities, Ofqual and the Scottish Qualifications Authority, were given the impossible task of setting up a viable procedure.

What they came up with is now described as “the algorithm”. The grading is based on information provided by schools and then moderated by the examination boards. Both parties had two essential tasks to perform.

The first task for schools was to rank their students in each subject. Imagine a school with 73 students for A level English in 4 sets with different teachers; some of the teachers were new and had not yet got to know their students very well. The teachers’ judgements had to be melded to give each student a unique rank between 1 and 73. Student number 40 had to be better than number 41 but not as good as number 39. No two students could be declared the same and so share a rank. Most schools found this reasonably easy for the extreme students, the very good and very weak, but problematic for those in the middle. Many students are very similar in performance but vary from day to day and according to the topic within the subject. So the student ranked 40 might have been better described as somewhere between, say, 35 and 45. Students’ ranks provide the essential basis for the whole process but there was no scope for incorporating the inherent uncertainty and variability. Instead there was the myth that the numbers assigned were spot on.

Schools then had to draw lines separating their students into different grades. So in our example, it could be that they declared student 40 to be grade C and student 39 grade B. This task presented something of a conflict of interests. A school’s league table position depends on the grades its students achieve and so there was subliminal pressure on teachers to be optimistic in their predictions. That may explain, at least in part, why so many grades were initially reduced. In addition, teachers naturally wanted to be fair to their students, considering how they would perform on a good day.
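
The schools' two tasks can be pictured with a minimal sketch. It is an illustration only, not the procedure any school or regulator actually used: the function name `assign_grades` and the boundary values are assumptions, chosen so that student 39 falls on the B side of a line and student 40 on the C side, as in the example above.

```python
def assign_grades(num_students, boundaries):
    """Map each rank (1 = strongest) to a grade.

    `boundaries` lists (grade, last_rank) pairs from the highest grade
    down; `last_rank` is the lowest-placed rank still given that grade.
    All names and values here are hypothetical.
    """
    grades = {}
    for rank in range(1, num_students + 1):
        for grade, last_rank in boundaries:
            if rank <= last_rank:
                grades[rank] = grade
                break
        else:
            raise ValueError(f"no grade boundary covers rank {rank}")
    return grades


# Hypothetical boundaries for the 73-student English cohort.
school_grades = assign_grades(73, [("A", 12), ("B", 39), ("C", 60), ("D", 73)])
print(school_grades[39], school_grades[40])  # B C
```

The sketch makes the weakness visible: a single place in the ranking separates a B from a C, even though the two students may be indistinguishable in practice.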

The ranks and grade boundaries were then sent to the examination boards who had to ensure two things.

Students of the same ability should be given the same grades independent of their schools. This required possible adjustment of the grade boundaries schools had submitted. Comparability between schools was judged largely on the basis of results in previous years. It depended on the assumption that there had been no change in any school’s standard: their teaching had neither improved nor got worse, and there was no such thing as a strong or weak year group. (Neither of these was true for all schools.) Consequently, in any subject, in order to match the performance of different students in earlier years, all the students in a particular school may have been moved up or down.

The overall standard in any subject had to be comparable with that in previous years. Nationally the same proportions should be awarded the various grades. This may have resulted in every school being moved up or down.
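
A toy version of the first adjustment shows how it can move students. This is a sketch under stated assumptions, not the regulators' actual model: the function `moderate_by_history`, the cohort of 10 and the historical counts are all invented for illustration. It simply hands out grades down the ranking in the proportions the school achieved in earlier years.

```python
def moderate_by_history(num_students, historical_counts):
    """Allocate grades by rank to match a school's past grade proportions.

    `historical_counts` lists (grade, count) pairs from the highest grade
    down. Returns the grades in rank order (index 0 = rank 1).
    Hypothetical helper, for illustration only.
    """
    total = sum(count for _, count in historical_counts)
    grades = []
    cumulative = 0
    for grade, count in historical_counts:
        cumulative += count
        # Number of students who should hold this grade or better.
        upto = cumulative * num_students // total
        grades.extend([grade] * (upto - len(grades)))
    return grades


# The school submits 4 As, 4 Bs and 2 Cs for this year's 10 students.
submitted = ["A"] * 4 + ["B"] * 4 + ["C"] * 2
# Earlier cohorts (invented figures): 10 As, 20 Bs and 20 Cs.
moderated = moderate_by_history(10, [("A", 10), ("B", 20), ("C", 20)])
changed = [rank for rank, (s, m) in enumerate(zip(submitted, moderated), start=1)
           if s != m]
print(changed)  # [3, 4, 7, 8]
```

In this invented case four of the ten students lose a grade purely because earlier cohorts did less well, whatever their own performance.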

Both of these examination board adjustments were essentially statistical. They guaranteed that overall the outcomes were comparable with those of previous years. However, that was no consolation to those students who were unhappy with their grades. What this system did not do was ensure that individual students were treated fairly. It started with the original ranking; if this was not right, the error persisted through the three subsequent steps. Then, those in better than usual year groups would also have suffered from the examination boards’ method for ensuring comparability between schools, as would those who were better taught than earlier students in their schools.

Given these built-in sources of error, it is no surprise that a considerable number of students were let down by the system.

All awards have now been changed so that the schools’ original grading stands, but that will produce its own problems. In the absence of external moderation the integrity of the award cannot be guaranteed and this is shown by the inflated percentages of students being awarded the higher grades. In addition it is inevitable that some schools were more generous than others. The outcome is that a grade A does not mean what it did in previous years; nor does it necessarily mean the same thing for students from different schools. Sadly, for those students involved, their Covid-year results will be regarded with suspicion for the rest of their lives.

It is now common to blame the algorithm, and so by implication those who designed it: civil servants and those advising them. However, nobody has found a satisfactory procedure for awarding grades given the constraints of maintaining national standards and comparability between schools. Government ministers should have known that what they were asking for was almost certainly impossible before committing to a policy of awarding pseudo-grades.

The question of how ministers could make such a disastrous decision raises a deeper issue. It is not just that Gavin Williamson and his colleagues were inept; the system should have protected them from their own folly. Civil servants should have warned them of the iceberg ahead so forcefully that they changed course. It seems, however, that our traditional checks and balances no longer apply. Ministers now expect to bully civil servants and to remove them if they resist. This is a really worrying change of culture.

The door is now wide open for other bad decisions. Be they on Covid, the EU, climate change or whatever, we are all likely to suffer the consequences.