Guidance

Evaluating Policy in Government

This is a guide for civil servants to develop better policies by incorporating evaluation into the policymaking process.

This page collates useful guidance and resources for evaluating policy in government.

It’s aimed at anyone working on policy in government, including:

  • policymakers
  • evaluators
  • social researchers
  • commissioners
  • statisticians, economists and other analysts

Good policy design requires evaluation

To ensure that government policies are delivered effectively, it is important to understand whether and how a policy works for the public.

Evaluation is key to generating robust evidence on government interventions (an intervention can be a policy, programme or new service delivery model). Evaluation helps ensure that government money is spent wisely and, ultimately, that the public receives effective and efficient services. The evidence that evaluations generate can be especially valuable if:

  • the intervention you are designing has a high level of uncertainty over its effects (as will often be the case if it has not been tried before)
  • there are risks associated with the intervention
  • it is high-cost

HM Treasury’s Green Book states that “monitoring and evaluation of all proposals should be […] an integral part of all proposed interventions” (2022, p. 2). If we want to ensure public money is used to maximum effect, we need to test whether interventions are working, or whether adaptations to interventions could deliver improvements.

To kickstart an evaluation for any intervention, you can ask yourself or your colleagues the following questions. This will help make sure your intervention is set up for effective evaluation.

1. What does success or failure look like?

Before you start thinking about evaluation methods, you should consider what your intervention is aiming to achieve, and what outcomes you want to see. You also need to consider what context your intervention will be operating in.

A really good starting point is to develop a Theory of Change – a map of your aims and objectives and how your policy should help achieve these. This process encourages you to think about all the other things that might influence the success of your policy. It can also help you to identify what you need to measure to understand whether your policy works, why it works (or why not), for whom it works, and under what conditions.
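
For illustration only, the elements of a Theory of Change can also be written down as structured data. The sketch below (in Python) uses entirely hypothetical content and field names; in practice a Theory of Change is usually developed as a diagram with stakeholders.

    # Hypothetical sketch of a Theory of Change as structured data; every
    # field and value here is illustrative, not a prescribed format.
    theory_of_change = {
        "aim": "reduce long-term unemployment among young people",
        "inputs": ["funding", "trained job coaches"],
        "activities": ["weekly coaching sessions", "employer matching"],
        "outputs": ["sessions delivered", "referrals made"],
        "outcomes": ["sustained job starts within 6 months"],
        "assumptions": ["employers have suitable vacancies"],
        "external_factors": ["local labour market conditions"],
    }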

2. Can you test the intervention before you begin delivering at scale?

Piloting an intervention can be a cost-effective way of testing its feasibility. A pilot is a rehearsal of an intervention with a sample population before it is rolled out on a larger scale.

One way of doing a pilot is to use a Randomised Controlled Trial (RCT). This involves randomly assigning participants to either a treatment group that receives an intervention or a control group that does not. The difference in the outcomes of both groups is then measured and analysed after a set period of time to determine the effectiveness of the intervention. Well-designed RCTs are usually the most accurate way of determining whether an intervention is working.
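
For illustration, the sketch below (Python, simulated data) shows the core comparison behind an RCT. The sample size, effect size and use of a t-test are assumptions for demonstration, not a prescribed analysis.

    # Minimal sketch of an RCT analysis (illustrative data): randomly
    # assign participants, then compare mean outcomes between groups.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(42)
    n = 500
    treated = rng.integers(0, 2, size=n)  # random assignment: 0 = control
    # Simulated outcomes with a true effect of +2 for the treatment group.
    outcome = 10 + 2.0 * treated + rng.normal(0, 5, size=n)

    effect = outcome[treated == 1].mean() - outcome[treated == 0].mean()
    t_stat, p_value = stats.ttest_ind(outcome[treated == 1], outcome[treated == 0])
    print(f"Estimated effect: {effect:.2f} (p = {p_value:.3f})")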

3. Could you test different variations of the intervention to understand the most effective approach?

Rather than put all your eggs in one basket, you can design interventions with inbuilt variation to test which version is most effective.

Multi-armed RCTs simultaneously test multiple variations of an intervention. In this approach, participants are randomly assigned to receive one variant of the intervention, enabling a comparison of results between the different arms of the trial.

One of the biggest benefits of a multi-armed RCT is that it does not need to include a separate control group, which is useful where it is not possible or desirable to deny or delay the rollout of the intervention to eligible participants. This approach allows decision-makers to understand which version of the intervention works best.
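
Here is a minimal sketch of a multi-armed comparison, again in Python with simulated data; the arm names, response rates and response-rate outcome are hypothetical.

    # Minimal sketch of a multi-armed trial (illustrative data): each
    # participant is randomly assigned to one variant; outcomes are then
    # compared across arms rather than against an untreated control.
    import numpy as np

    rng = np.random.default_rng(0)
    arms = ["variant_a", "variant_b", "variant_c"]
    true_rates = {"variant_a": 0.10, "variant_b": 0.15, "variant_c": 0.12}

    n = 900
    assignment = rng.choice(arms, size=n)  # random assignment to an arm
    responded = np.array([rng.random() < true_rates[a] for a in assignment])

    for arm in arms:
        print(f"{arm}: response rate {responded[assignment == arm].mean():.1%}")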

4. Can you phase the rollout of the intervention to help you understand its impact?

Limited resources or simple logistics might mean that the delivery of an intervention has to be staggered using a waiting list or phased rollout. This offers an opportunity to use a stepped-wedge or waitlist design that allows those who have not yet received an intervention to be used as a temporary control group.

It is possible to estimate impact by:

  1. comparing outcomes for treatment and control groups for the time period prior to the control group receiving the intervention (this comparison is sketched in the example after this list), or
  2. comparing outcomes between groups who have been exposed to the intervention for varying amounts of time. Where possible, this method can be combined with randomisation, for example, by randomising the order in which people get the intervention.
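
The first of these comparisons can be sketched as follows (Python, simulated data; the group sizes and effect size are illustrative assumptions).

    # Minimal sketch of a waitlist comparison (illustrative data): during
    # phase 1, the wait-listed group has not yet been treated and serves
    # as a temporary control group.
    import numpy as np

    rng = np.random.default_rng(1)
    n = 400
    # Randomise the rollout order: half get the intervention in phase 1,
    # half are wait-listed until phase 2.
    early = rng.permutation(np.array([1] * (n // 2) + [0] * (n // 2)))
    # Outcomes measured at the end of phase 1 (+3 is the simulated effect).
    outcome = 50 + 3.0 * early + rng.normal(0, 8, size=n)

    impact = outcome[early == 1].mean() - outcome[early == 0].mean()
    print(f"Estimated impact during phase 1: {impact:.2f}")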

5. Can you mine existing data, or exploit natural variation, to help you understand the intervention’s impact?

If good-quality administrative data is already available, it is possible to construct comparison groups based on observable characteristics (e.g. gender, age, occupation). The aim is to build a comparison (‘matched’) group that looks as similar as possible to the intervention group before the start of an intervention. The outcomes of both groups are then tracked and compared to estimate the effectiveness of the intervention.
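
For illustration, here is a minimal sketch of matching in Python, simplified to a single observable characteristic (age); the data and effect size are simulated assumptions.

    # Minimal sketch of a matched comparison (illustrative data): each
    # treated unit is paired with the untreated unit closest in age,
    # with replacement.
    import numpy as np

    rng = np.random.default_rng(2)
    age_treated = rng.normal(40, 5, size=100)  # treated group
    age_pool = rng.normal(45, 8, size=300)     # pool of untreated comparators
    # Simulated outcomes: depend on age, plus a true effect of +2 if treated.
    out_treated = 20 + 0.1 * age_treated + 2.0 + rng.normal(0, 1, size=100)
    out_pool = 20 + 0.1 * age_pool + rng.normal(0, 1, size=300)

    matches = np.array([np.argmin(np.abs(age_pool - a)) for a in age_treated])
    effect = (out_treated - out_pool[matches]).mean()
    print(f"Matched estimate of the effect: {effect:.2f}")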

If historical trend data exists, it might be possible to use a difference-in-differences (DiD) approach. This involves comparing the outcomes between a treatment group and a control group that have historically followed the same trend in the outcome(s) of interest (e.g. two countries whose exam results have remained parallel over time). If the outcomes for the two groups differ following an intervention (e.g. the abolition of school league tables), the change in the size of the difference can be used to estimate the effect of the intervention (hence “difference in differences”).
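
As a worked illustration of the arithmetic (the numbers below are invented):

    # Worked example of the difference-in-differences calculation
    # (invented numbers, e.g. mean exam scores before and after a change).
    treated_before, treated_after = 62.0, 68.0  # group exposed to the change
    control_before, control_after = 61.0, 63.0  # comparison group

    change_treated = treated_after - treated_before  # 6.0
    change_control = control_after - control_before  # 2.0
    did_estimate = change_treated - change_control   # 4.0
    print(f"Difference-in-differences estimate: {did_estimate:.1f}")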

If interventions have a quantifiable eligibility threshold (e.g. being aged 18, or having an income below £50,000), then regression discontinuity design can be a good evaluation option. In many cases, the group that falls just outside the cut-off is very similar to the group that just qualifies. Any difference between the two groups is likely to be down to chance, so differences in their outcomes can be attributed to the intervention.
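
For illustration, here is a minimal sketch of the idea in Python, reusing the £50,000 threshold from the example above; the data, effect size and simple linear fits either side of the cut-off are assumptions for demonstration.

    # Minimal sketch of a regression discontinuity design (illustrative
    # data): fit a linear trend either side of the eligibility cut-off and
    # measure the jump in outcomes at the threshold.
    import numpy as np

    rng = np.random.default_rng(3)
    income = rng.uniform(30_000, 70_000, size=2_000)
    eligible = income < 50_000  # eligibility threshold
    # Simulated outcomes: a gentle trend in income plus a true effect of
    # +1.5 for those who received the intervention.
    outcome = 5 + 0.0001 * income + 1.5 * eligible + rng.normal(0, 1, size=2_000)

    fit_below = np.polyfit(income[eligible], outcome[eligible], 1)
    fit_above = np.polyfit(income[~eligible], outcome[~eligible], 1)
    jump = np.polyval(fit_below, 50_000) - np.polyval(fit_above, 50_000)
    print(f"Estimated effect at the threshold: {jump:.2f}")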

Note that the above is not an exhaustive list of evaluation methods.

Take a look at some of the other guides and resources we have collated for more information on evaluation methods.

Developing an evaluation plan

The best place to start with planning an evaluation is to make sure there is a Theory of Change in place for the intervention you will be evaluating. A Theory of Change will set out what the intervention aims to achieve and how.

Once you have a Theory of Change for your intervention, you can start to plan your evaluation. Here’s an overview of the key elements you should include in your evaluation plan (a structured sketch follows the list):

  1. Evaluation aims/objectives
    • What is the evaluation trying to achieve?
    • What key questions are you trying to answer?
    • Are you looking to understand how the intervention is being implemented?
    • Do you want to understand what impact the intervention is having and work out if it is delivering what it set out to?
    • Are you interested in whether the intervention is worth its investment? Is it good value for money?
  2. Key outcomes and the ways of measuring them
    • What are the key/primary outcomes for the evaluation?
    • What key things do you need to measure to meet your evaluation aims?
    • Are you able to identify ways of directly measuring the outcomes of interest?
    • If not, can you identify proxy measures (things that do not directly measure an outcome of interest, but are expected to move in step with it)?
  3. Evaluation approach, design, methods and rationale
    • Will this evaluation include impact evaluation, seeking to assess the effect of the intervention on one or more outcomes?
    • Will the evaluation include process evaluation, seeking to understand issues around how the intervention is implemented?
    • Will the evaluation include economic evaluation, seeking to assess questions around the value-for-money of the intervention?
    • Do you plan to use quantitative, qualitative or mixed methods?
    • Can you use experimental or quasi-experimental approaches?
    • What specific methods will you use (for example, observations, interviews, surveys or analysis of existing data)?
  4. What are your data requirements?
    • What data do you need access to? Where from? How will you get access to it?
  5. Key stakeholders
    • Who should you involve in the evaluation planning, delivery and reporting?
  6. Key milestones
    • What are the key decision points for confirming the evaluation approach and selecting your methods?
    • What are the expected timings for delivering your evaluation? For the interim and final reporting of results?
    • When do you expect to publish your plans and findings?
    • What are the key decision points for your intervention? How will the evaluation findings feed into these?
  7. Resources
    • What is your evaluation budget?
    • What other resources will you be using? (e.g. evaluation staff, independent contractors)
  8. Use of findings
    • How will the evaluation results be used?
    • How will the results inform whether the intervention continues, is modified or stops?
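
For illustration only, the eight elements above could be captured as a structured outline, as in the hypothetical sketch below; every name and value is a placeholder, not prescribed content.

    # Hypothetical skeleton of an evaluation plan as structured data; the
    # keys mirror the eight elements above and all values are placeholders.
    evaluation_plan = {
        "aims": ["assess impact on the primary outcome", "assess value for money"],
        "key_outcomes": {"primary": "outcome of interest",
                         "proxy": "measure expected to move in step with it"},
        "approach": {"impact": "two-arm RCT", "process": "interviews",
                     "economic": "cost-benefit analysis"},
        "data_requirements": ["administrative data (access to be confirmed)"],
        "stakeholders": ["policy team", "delivery partners", "analysts"],
        "milestones": {"design sign-off": "Q1", "interim report": "Q3",
                       "final report": "Q4"},
        "resources": {"budget": "to be confirmed", "staff": "evaluation team"},
        "use_of_findings": "inform the decision to continue, modify or stop",
    }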

Once an outline of your evaluation plan is established, it can be used as a basis for more detailed evaluation planning. It is also an essential component of any business case or other funding application you wish to make for the intervention.

More guidance and support

Find more resources for evaluating policy in government.

For general guidance on policy evaluation, or if you have good evaluation guidance and resources that you would like to list here, contact the Evaluation Task Force at etf@cabinetoffice.gov.uk.

To gain feedback and further develop your evaluation ideas, contact the Evaluation and Trial Advice Panel at trialadvicepanel@cabinetoffice.gov.uk.

Published 31 January 2022
Last updated 17 January 2023