This report reviews research on reliability and presents a range of statistical methods for measuring it.
The aim of this report is to provide, as far as possible, a framework for describing, interpreting and assessing reliability estimates from different sources. It discusses what is meant by measurement and by the reliability of a measurement, and outlines approaches to estimating that reliability.
It describes, in a relatively nontechnical format, a range of statistics currently used or proposed for measuring reliability, under three headings:
- classical test theory (CTT)
- item response theory (IRT)
- grading into a relatively small number of categories.
The report also describes a 2007 case study of the reliability of key stage 2.