Measuring the legacy & impact of major events: case studies

Question 1

Executive Summary

Accepted Answer

As part of the development of guidance for measuring the legacy and impact of major events, this report uses four case studies to test the available methods and so support and advance the guidance: the London 2012 Olympic and Paralympic Games (referred to as the London 2012 Games for reporting purposes); the Women’s and Men’s Cricket World Cups (2017 and 2019 respectively); the City of Culture Programme (Derry/Londonderry 2013 and Hull 2017); and the Manchester International Festival. The case studies use the available evidence to:

Trial methods which could be used in evaluations to measure legacy impacts
Consider which methods are best used for different contexts
Identify when it may not be advised to use certain methods

These case studies should not be considered evaluations or be used to assess the benefits of the event. At times we have chosen not to use the best available data or approaches, in order to test whether inferences can still be made without optimal data or with less complex statistical designs.

We use publicly available data, which captures, to a degree, the breadth of impact which events have, but not necessarily the depth of impact, as events do not always ripple into noticeable impact in national data sets. Likewise, local authorities may have local data and intelligence, which may be used to demonstrate impact at a local level, but would not be available in national data. With these caveats in mind, our choice of data and methodologies should help provide guidance which suits a range of evaluations where data agreements do not exist, or econometric expertise is limited.

While there are many studies which claim to show significant legacy benefits, they do not always follow HMT Green Book principles. In particular, existing reports often do not assess benefits above a counterfactual (what would have happened without the event). We should expect that some event visitors would otherwise have gone to alternative events, and that some of those who found employment because of the regeneration from an event would have found a job regardless. The methods assessed through the case studies therefore aim to account for these factors to show the additional benefit of an event. It is challenging to identify methods that are robust, follow HMT Green Book guidance for each type of major event, and account for the practical challenges of evaluating a major event; a range of methods has therefore been tested in this report.

Although the methods assessed in the case studies could be used for any length of evaluation, the case studies focus on showing legacy impacts, which are often considered a key reason for hosting a major event. However, showing impacts a number of years after an event is challenging. A range of methods is available to event evaluators, varying in sophistication from basic trend analysis to fixed effects Difference-in-Differences (DiD). Although simpler methods are easier to employ and can be used with less data, and with lower quality data, they cannot be used to make causal inferences about the impact of major events above any confounding impacts (such as other sporting or cultural events, political changes or local investments). This is because they lack a counterfactual, whereby the target area (the area of interest) is compared to a control area (an area similar to the target area but unaffected by the event). Given that the focus of this research is to explore the use of evaluation techniques to show legacy impacts, but also to provide guidance for all types of major event evaluation, including where available evidence and budget are scarce, this research explores a variety of approaches.
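The comparison of a target area with a control area can be illustrated with a minimal two-group, two-period difference-in-differences calculation. This is a sketch only: the function name and all figures are hypothetical and are not drawn from the case study data, and it omits the fixed effects and significance testing a real evaluation would need.

```python
from statistics import mean


def did_estimate(target_pre, target_post, control_pre, control_post):
    """Simple 2x2 difference-in-differences on group means.

    The change in the control area stands in for what would have
    happened in the target area without the event (the counterfactual).
    """
    target_change = mean(target_post) - mean(target_pre)
    control_change = mean(control_post) - mean(control_pre)
    return target_change - control_change


# Hypothetical annual outcome values (for example, tourism jobs in thousands).
target_pre = [100, 102, 104]    # target area, years before the event
target_post = [115, 117]        # target area, years after the event
control_pre = [90, 92, 94]      # control area, years before the event
control_post = [97, 99]         # control area, years after the event

effect = did_estimate(target_pre, target_post, control_pre, control_post)
print(effect)  # target rose by 14, control by 6, so the estimate is 8
```

A before-and-after comparison on the target area alone would report the full increase of 14; subtracting the control area's change removes the part of the increase that also occurred where the event did not take place.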

Trend analysis has been used both as a first step to support more sophisticated methods and to provide insight. For several indicators, we have identified post-event increases which could suggest positive legacy impact. For example, there was a sustained increase in overnight stays following the year Derry/Londonderry held the City of Culture title. To add sophistication to the trend analysis, we have compared trends against counterfactual areas, such as similar areas or cities shortlisted as hosts. This allows us to understand whether those trends have also occurred in other places and may therefore be due to confounding effects rather than the event itself.
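One simple way to compare a host's trend with a comparator area is to index both series to a common base year and look at the gap between them. The sketch below assumes hypothetical overnight-stay counts and a hypothetical event year of 2013; the names `index_to_base`, `host` and `comparator` are illustrative and do not come from the report.

```python
def index_to_base(series, base_year):
    """Rescale a {year: value} series so that the base year equals 100."""
    base = series[base_year]
    return {year: 100.0 * value / base for year, value in series.items()}


# Hypothetical annual counts; 2012 is the last year before the event.
host = {2011: 200, 2012: 210, 2013: 260, 2014: 270}
comparator = {2011: 400, 2012: 420, 2013: 440, 2014: 460}

host_idx = index_to_base(host, base_year=2012)
comp_idx = index_to_base(comparator, base_year=2012)

# Gap in index points: positive values mean the host grew faster
# than the comparator relative to the base year.
gap = {year: round(host_idx[year] - comp_idx[year], 1) for year in host}
print(gap)
```

A gap close to zero before the event and positive afterwards is consistent with an event effect, but this remains descriptive: it does not test statistical significance or rule out other local changes.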

In our use of more sophisticated methods, we have been successful in isolating causal impacts during and immediately following events; however, our ability to identify long-term impacts above a counterfactual has been limited. For example, employment in the tourism sector around the London 2012 Games was significantly higher during 2012 and 2013 when compared to similar areas further away from the event, but this impact quickly disappeared. A full overview of the findings is included in the tables below. Where no statistically significant results are recorded, this does not imply that those impacts did not occur, but rather that it was not possible to show them with sufficient statistical certainty using the experimental methods and selected data. The Maryland Scientific Methods Scale (SMS) is a 5-point scale, referred to throughout the report and in the tables below, which is used to evaluate the methodological rigor of program evaluations. It assesses the quality of research studies based on the strength of their design. The scale consists of five levels:

Level 1: Correlation between a program and a measure of effect at one point in time

Level 2: Temporal sequence observed between an intervention and an outcome

Level 3: A comparison between two or more comparable units, one with and one without the intervention

Level 4: Comparison between multiple units with and without the intervention, controlling for other variables that influence outcomes

Level 5: Random assignment and analysis of a large number of units

Table 1: Findings from London 2012 Games case study

Outcomes	SMS level achieved	Significant results?	Positive improvement in outcome
Employment in the tourism sector	SMS Level 3 - using DiD (fixed effects) and synthetic control	Yes	Impact found a year after event and weak evidence of sustained impact beyond this
Turnover from the tourism sector	SMS Level 3 - using DiD (fixed effects) and synthetic control	Yes	Impact found a year after event and evidence this was sustained
Wellbeing	SMS Level 3 - using DiD (fixed effects)	No	No impact found however gap between event and data collection could miss some impact
Volunteering	SMS Level 1 – data not available pre-Games, and data only reported every other year	No	No impact found however issues with data frequency are likely to impact findings
Medals won at Olympic Games	SMS Level 1 - Trend analysis	N/A	Sustained improvement in medals

Table 2: Findings from Cricket World Cup case study

Outcomes	SMS level achieved	Significant results?	Positive improvement in outcome
Increased cricket participation	SMS Level 1 – before and after comparison within treatment and control groups	Yes – in some cases (p<0.05)	Achieved for some population segments but not sustained or directly attributable
Enhanced subjective wellbeing	SMS Level 2 – before and after comparison within treatment and control groups	No (p>0.05)	No impact found however gap between event and data collection could miss some impact
Improved financial performance	SMS Level 3 – using DiD (fixed effects)	Yes – in some cases and at varying confidence levels (p<0.10)	Some evidence there was impact up to five years after the event

Table 3: Findings from City of Culture case study

Outcomes	SMS level achieved	Significant results?	Positive improvement in outcome
Tourism jobs	SMS Level 2 - Trends analysis and DiD	No	DiD for Hull does demonstrate a positive increase but is not statistically significant. While the results are not statistically significant, the trends analysis does show positive increases in tourism related jobs which in part could be related to the UK CoC title.
Creative and Cultural sector jobs (Hull only)	SMS Level 2 - Trends analysis and DiD	No	The UK CoC generates short-term jobs, however many jobs are temporary for the delivery of the year.
Economic impacts (turnover and employment)	SMS Level 2 - DiD (PSM)	Derry – Yes; Hull – No	While Derry shows positive impact, the influence of COVID-19 has impacted the data for Hull.
Increase in public funding for arts and culture (Hull only)	SMS Level 2 - Trends analysis and DiD	No	Evidence demonstrates that the UK CoC title does increase public funding into titleholders, however the funding is not sustained, but anecdotal evidence suggests organisational funding increases in the long-term.
Wellbeing (Hull only)	SMS Level 2 - Trends analysis and DiD	No	Short-term impact but not sustained after event.

Table 4: Findings from Manchester International Festival case study

Outcomes	SMS level achieved	Significant results?	Positive improvement in outcome
Creative and Culture jobs	SMS Level 2 - Trend compared to control cities	No	Trends analysis shows increase in jobs during festival years however increase is not sustained.
Wellbeing	SMS Level 2 - Trend compared to control cities and DiD	No	Short-term impact but changes are not sustained.
Business counts and GVA	SMS Level 2 - Trend compared to control cities and DiD	No	Analysis shows MIF has contributed to Manchester and the wider area’s economy and has supported growth.

The lack of concrete evidence of legacy impacts reflects the complexities of evaluating the legacy and impact of major events particularly as this report has been limited to secondary data. Many of the datasets have been captured inconsistently over time and do not go back far enough before the event to remove any pre-event impacts such as announcements. This research has therefore further exemplified the need to prepare correctly for major event evaluation and led us to the following recommendations:

Early and robust evaluation planning: Evaluations should be integrated into the early planning stages of major events, aligning with clearly articulated objectives and theories of change. This allows for the identification of relevant indicators, data sources, and methodologies before the event takes place, ensuring that evaluations are fit for purpose and minimise compromises.

Development of a robust theory of change: A clearly defined theory of change that articulates intended impacts, mechanisms, and measures is essential. This clarifies the logic of the intervention, guiding data collection and analysis towards meaningful outcomes.

Pre-event baseline data collection: Collecting comprehensive baseline data on key indicators before the event is crucial for measuring true change and isolating the event’s specific impact from other confounding factors. Baseline data should be aligned with the theory of change and reflect the geographic scope of the evaluation.

Consideration of confounding factors: Anticipating and accounting for potential confounding factors, such as concurrent policy changes, economic trends, or external shocks, is essential. This requires careful consideration of the context in which the event takes place and the use of robust methodologies that control for these influences. Evaluations should employ robust methodologies, such as difference-in-differences, propensity score matching, synthetic controls, or spatial analysis, to control for confounding factors and isolate the specific impact of the event.

Prioritizing longitudinal and panel data: Cross-sectional data provide limited insight into changes over time. Evaluations should strive to collect longitudinal or panel data that track outcomes for individuals, businesses, or communities before, during, and after the event.

Combining quantitative and qualitative approaches: Mixed-methods evaluations that combine quantitative data with qualitative insights from interviews, focus groups, or case studies offer a richer understanding of impact and explore the mechanisms behind observed changes. More qualitative methods also provide an alternative where quantitative methods are inconclusive or not possible because of data limitations.

Question 2

1. Introduction

Accepted Answer

Context

The UK government, through a collaborative Research & Development Science and Analysis Programme between the Department for Culture, Media and Sport (DCMS) and the Department for Science, Innovation and Technology (DSIT), is exploring innovative and experimental research methods. This research aims to improve upon traditional evidence development within the departments. A key focus of this program is measuring the legacy and impact of major events, recognising the significant social, economic, and cultural benefits often attributed to hosting them. These benefits can include increased local investment, expanded opportunities for participation in cultural and sporting activities, job creation, improved community wellbeing, tourism growth, enhanced national and civic pride, and even a boost to national soft power.

However, current legacy and impact evaluations often lack methodological robustness, particularly in establishing counterfactuals for determining causality and comparability. The absence of standardized measures for social impacts, such as civic pride and wellbeing, further complicates efforts to directly attribute these outcomes to major events. This challenge is compounded by the difficulty in documenting and measuring long-term impacts across the cultural and sports sectors after an event concludes. Inconsistent methodologies also hinder comparisons across different events. Given the increasing focus on place-based approaches to hosting major events, robust monitoring and evaluation of long-term legacies and impacts are crucial for program planning and delivery. This rigorous approach allows host locations to refine their offerings and fully understand the return on investment.

Purpose of the document

This document utilises four previous events as case studies to test how the processes and methods set out in the Major Event Impact and Legacy Evaluation Toolkit can be used to assess the impact of major events. The case studies therefore provide a useful example for future evaluations of major events but are not themselves a comprehensive evaluation. The case studies provide an opportunity to put theory into practice and identify some of the more practical challenges of long-term evaluations. The four case studies are:

London 2012 Olympic and Paralympic Games
Cricket World Cups (Women’s Cricket World Cup 2017 and Men’s Cricket World Cup 2019)
City of Culture Events
Manchester International Festival

The case studies were identified based on a range of factors. Firstly, as the characteristics of events are a key determinant of the appropriate methods for evaluating legacy impacts, a diverse set of case studies is needed to test a range of potential approaches. The case study selection therefore includes both cultural and sporting events, events with significant infrastructure investment and those which used existing infrastructure, and events held nationally and regionally. Secondly, to be able to test legacy impacts, case studies needed to be far enough in the past to provide enough subsequent years in which to test post-event impacts.

To keep case studies focused, not all the objectives and outcomes have been explored in the analysis. We therefore aimed to use the different case studies to explore particular types of outputs, outcomes and impacts. For example, a key objective of the Cricket World Cups was to increase participation in cricket, which is a focus for that case study.

Analytical process

For each case study event, we follow the process from the Major Event Impact and Legacy Evaluation Toolkit, which is set out below:

Defining the major events

The first step is defining major events. This is an important step as the choice of methodology will depend heavily on the specifics of the event. For example, whether the event is a local, regional or national event will impact on the choice of control group. The categories by which the events are categorised are:

Type: cultural; sport; or a mix of both
Scale: national; regional; or local
Importance: International; national; regional; local
Competitive process
Duration: days; weeks; months
Frequency: One off; recurring in same geography; recurring in different geography
Construction: cultural/sporting facilities; transformative construction of other civic or built capital
Geography
Catalysts: catalyst for future events; events and projects following a catalyst
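The categories above can be recorded as a simple structured record so that every event is described consistently before a methodology is chosen. The sketch below is illustrative only: the class and field names are hypothetical, and the values entered for the London 2012 Games are one reading of the categorisation given later in this report (the frequency value in particular is an assumption, as the report does not state it).

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class EventCategorisation:
    """Hypothetical record mirroring the categorisation list above."""
    event_type: str            # cultural; sport; or a mix of both
    scale: str                 # national; regional; or local
    importance: str            # international; national; regional; local
    competitive_process: bool  # was the host chosen through a bidding process?
    duration: str              # days; weeks; months
    frequency: str             # one off; recurring in same or different geography
    construction: Tuple[str, ...]
    geography: str
    catalysts: str


london_2012 = EventCategorisation(
    event_type="sport (with substantial cultural elements)",
    scale="national",
    importance="international",
    competitive_process=True,
    duration="weeks",
    frequency="one off",  # assumption: not stated explicitly in the report
    construction=("cultural/sporting facilities",
                  "transformative construction of other civic or built capital"),
    geography="primarily London, with some events elsewhere in the UK",
    catalysts="catalyst for future events",
)

print(london_2012.scale, london_2012.importance)
```

Recording the categorisation in this way makes the later methodological choices easier to justify, for example the choice of control group for a national versus a local event.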

Theory of Change

Theories of change are used to model how an event is expected to achieve desired outcomes and benefits. This not only helps in understanding how an event will create impact, but also how each event should be measured and evaluated.

When considering the outcomes in the theory of change, it is important to remember that outcomes will arise, build up, and potentially decay over different time periods. Additionally, outcomes will be felt by different groups and spatial areas.

To support evaluations, we created example theories of change across five outcome areas:

Cultural and Social Impact
Economic and Employment Impact
Reputation
Health and Well-Being
Environmental Responsibility and Accessibility

The five theories of change were developed in collaboration with DCMS. These are available in Appendix 1 of the toolkit.

Choosing indicators

Indicators provide a means of measuring outputs, outcomes and impacts identified in the Theory of Change – and therefore provide a basis for evaluating the extent to which an event has achieved its objectives.

Scoping data sources

Once a set of indicators is identified, evaluators must find data sources which can be used to evidence each indicator. Both primary and secondary data can be used for this purpose. For the case studies in this report, only secondary data sources were available (since timing would not allow for primary data collection), which limited the ability of the case studies to measure a comprehensive list of indicators.

Methodology

Event evaluators must compare different methodologies and select the ones that are most appropriate for their event. The methodology sections in this report provide an overview of the approach, the selection of data sources, the econometric approach including how the target area and counterfactual were defined, and the model specification. There is a focus on why the methods were chosen over other options and the consequences of these decisions, referencing the particular challenges involved with major event evaluation.
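As an illustration of what a model specification can look like, a generic two-way fixed effects DiD model is sketched below. It is a textbook form given under stated assumptions and is not necessarily the specification used in any of the case studies; all symbols are introduced here for illustration.

```latex
% Static specification: a single average post-event effect \beta
Y_{it} = \alpha_i + \lambda_t + \beta \,(\mathrm{Target}_i \times \mathrm{Post}_t) + \varepsilon_{it}

% Dynamic (event-study) specification: one effect \beta_k per year k,
% with the year before the event (k = -1) omitted as the reference
Y_{it} = \alpha_i + \lambda_t + \sum_{k \neq -1} \beta_k \,(\mathrm{Target}_i \times \mathbf{1}[t - t_0 = k]) + \varepsilon_{it}
```

Here $Y_{it}$ is the outcome for area $i$ in year $t$, $\alpha_i$ and $\lambda_t$ are area and year fixed effects, $\mathrm{Target}_i$ equals 1 for areas in the target area, $\mathrm{Post}_t$ equals 1 for years from the event year $t_0$ onwards, and $\varepsilon_{it}$ is the error term. The dynamic form is the one most relevant to legacy questions, because it shows whether an effect found in the event year persists, fades, or reverses in later years.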

Findings

The findings section explores whether the analysis identifies any significant effects from the event and explores some of the potential reasons for the results, both in terms of event delivery and the choice of methodology. A particular focus is given to legacy impacts following the event. For some indicators we have explored the use of different methodologies, and the findings section will help explore how the choice of methodology can impact on the findings. Also included in this section is an assessment of the robustness of the findings which includes consideration of how well we can trust the results.

Learnings

Each case study includes a learnings section which consolidates what has been learned during the case study analysis about how major events evaluation can best tackle the key challenges associated with major events.

Question 3

2. London 2012 Olympic & Paralympic Games

Accepted Answer

This case study seeks to test a set of methodologies that could be used to evaluate a mega-event such as the London 2012 Olympic and Paralympic Games. This contributes to the overall framework for how the legacy impacts of major events should be measured. This case study does not seek to evaluate the overall impact of the London 2012 Games but rather experiments with several techniques on a small number of metrics (e.g. tourism impact, wellbeing impact and medal count), which were chosen because they are i) closest to the original outcomes in the legacy plan and ii) the outcomes with clear and logical routes to being realised. The analysis presented does not capture all possible outcomes, and it trials multiple methods, some of which are not the optimal choice. As such, this case study should not be used to judge the success of the London 2012 Games.

2.1 Categorisation

2.1.1 Type of Major Event

The Olympic and Paralympic Games are among the most prestigious global sporting gatherings, akin to the FIFA World Cup, requiring years of planning and substantial financial outlay. The London 2012 Games were not only a showcase of elite athletic talent but also a platform for cultural exchange, symbolising a modern and inclusive spirit. They necessitated the construction of new venues, transportation enhancements, and urban regeneration, transforming parts of the city.

2.1.2 Focus of the event

The London 2012 Games were primarily a sporting event, encompassing a diverse range of disciplines. Beyond the competitive sporting events, the London 2012 Games incorporated substantial cultural elements, highlighted by the opening and closing ceremonies that celebrated British culture, history, and arts. These ceremonies were globally broadcast, offering a cultural narrative that complemented the athletic achievements on display. The London 2012 Games also included the Cultural Olympiad, a series of events and festivals showcasing the UK’s artistic talent. This blend of sport and culture underscored the dual identity of the London 2012 Games, appealing to both sports enthusiasts and those interested in cultural spectacles. This mix attracted a broad audience, enhancing the event’s appeal and reach. The cultural components also provided opportunities for local artists and performers to gain exposure on an international stage.

Beyond the cultural and sporting aspects, the London 2012 Games served as a catalyst for the regeneration of a previously underdeveloped area in East London. The substantial investment in infrastructure, including transportation, sporting venues, and the Olympic Village, aimed to create a lasting positive impact on the community. This involved transforming a former industrial area into a vibrant mixed-use space, with the Olympic Village later converted into the East Village, providing thousands of new homes, a portion of which were designated as affordable housing. The project also sought to create jobs, boost the local economy, and improve the environment.

2.1.3 Scale and Geography

The London 2012 Games were a national event with a profound international scope, primarily concentrated in London but extending across the UK for certain events, such as sailing in Weymouth. The London 2012 Games can be categorised as a ‘Mega Event’, characterised by significant investment and infrastructure development, with numerous temporary and permanent venues established to accommodate different sports and events. London was the focal point, with key venues like the Olympic Stadium and Aquatics Centre located within the city. However, the London 2012 Games’ reach also extended to other regions, with some sports taking place in other areas of the UK including:

Weymouth and Portland: Sailing events
Eton Dorney (near Windsor): Rowing and canoe sprint events
Glasgow, Manchester, Cardiff: Some football matches
Hadleigh Farm, Essex: Mountain biking events
Lee Valley White Water Centre (Hertfordshire): Canoe slalom events

2.1.4 Importance

The London 2012 Games were internationally significant, attracting millions of visitors and engaging billions of viewers worldwide through television and online platforms. The event was seen as a pivotal moment for the UK, enhancing its global reputation as a destination for major events and boosting national pride.

2.1.5 Competitive process

The selection of London as the host city followed a rigorous competitive bidding process overseen by the International Olympic Committee (IOC). The process began several years prior to the London 2012 Games, requiring potential host cities to submit detailed proposals highlighting their capabilities and vision for the event. London was shortlisted alongside cities like Paris and Madrid, with the final decision announced in 2005, seven years before the London 2012 Games. The process was as follows:

Application Phase: Cities interested in hosting the Games were required to submit applications to the IOC. This phase involved demonstrating their ability and readiness to host such a large-scale international event.

Candidature Acceptance: The IOC evaluated the initial applications and accepted those that met the basic requirements to become official candidate cities.

Shortlisting: The IOC then shortlisted cities based on detailed evaluations of their proposals, focusing on aspects like infrastructure, logistics, and legacy plans. For the London 2012 Games, the shortlisted cities were London, Paris, Madrid, New York, and Moscow.

Candidature File Submission: Shortlisted cities were required to submit comprehensive candidature files, outlining detailed plans for hosting the London 2012 Games, including venue construction, transportation, accommodation, and security.

IOC Evaluation Commission Visits: The IOC sent evaluation commissions to the candidate cities to conduct site visits. These visits were crucial for assessing the feasibility and readiness of each city’s plans.

Final Presentations: Candidate cities presented their final bids to the IOC members, highlighting their unique strengths and addressing any concerns raised during the evaluation visits.

Selection and Announcement: The final decision was made during the 117th IOC Session in Singapore on July 6, 2005. London was announced as the host city for the 2012 Olympic Games, winning over Paris in the final voting round.

2.1.6 Duration

The London 2012 Games spanned several weeks, with the Olympics taking place from July 27 to August 12, 2012, followed by the Paralympics from August 29 to September 9, 2012. These events were preceded by extensive preparations, including test events and rehearsals, to ensure seamless execution. The timeline included various phases, from the buildup and opening ceremonies to the main competitions and closing events. The duration also encompassed post-event activities, focusing on the transition from event mode to legacy utilization, ensuring that the infrastructure and investments continued to benefit the community.

2.1.7 Construction of infrastructure

The London 2012 Games prompted the construction of state-of-the-art sporting facilities, including the Olympic Stadium, Aquatics Centre, and Velodrome, each designed with sustainability and legacy in mind. These venues were part of a broader urban regeneration initiative in East London, transforming the area into the Queen Elizabeth Olympic Park. The construction efforts extended beyond sporting facilities to include transportation upgrades, such as the expansion of the London Underground and improvements to road networks.

The UK’s bid to host the London 2012 Games heavily emphasised the concept of legacy, aiming to create lasting positive impacts beyond the event itself. One of the key legacy aims was the regeneration and transformation of East London, one of the city’s most deprived areas at the time. The London 2012 Games aimed to bring investment, jobs, and improved infrastructure to the area, creating a lasting economic and social uplift.

2.1.8 Catalysts for future events

The Olympic Park continues to host competitions in a variety of sports, including swimming, field hockey, athletics, football, triathlon, and rugby.

The most successful 2012 Olympic legacy event, RideLondon, was developed by the Mayor of London and his agencies and first held in August 2013. It is now the world’s biggest festival of cycling. Since its first edition, more than 500,000 riders have taken part in the event, collectively raising in excess of £80 million for more than 1,000 charities.

Launched in 2016, Swim Serpentine is an open water swimming event staged in the Serpentine lake in Hyde Park, the venue for both the marathon swimming and triathlon events at the London 2012 Games.

2.2 Event objectives and Legacy Strategy

To identify the data and indicators required for the evaluation, we must decide on the objectives of the major event. The objectives of a major event may change through the event lifecycle and can be updated accordingly. However, where primary data collection is necessary, it is important to collect a baseline, and agreeing the initial objectives is therefore particularly important. Best practice in preparation for an evaluation will include a Theory of Change (ToC).

Before the London 2012 Games, DCMS published a legacy strategy “Plans For The Legacy From the 2012 Olympic and Paralympic Games”.^{[footnote 1]} In this document, Government set out their ambitions for the London 2012 Games. The emphasis on legacy was a distinctive feature of the original UK bid and responded to concerns about the underutilisation of the infrastructure created to stage previous Games.

The aims covered four themes and included:

Sport: Harnessing the UK’s passion for sport to increase grass roots participation and competitive sport and to encourage physical activity;

Economic: Exploiting the opportunities for economic growth offered by hosting the London 2012 Games;

Community Engagement: Promoting community engagement and achieving participation across all groups in society through the London 2012 Games;

East London regeneration: Ensuring that the Olympic Park can be developed after the London 2012 Games as one of the principal drivers of regeneration in East London.

Following the conclusion of the Games, further legacy objectives were included, such as bringing communities together, with a particular focus on driving social change and encouraging individuals to volunteer their time and skills.

2.2.1 Theory of change

The original evaluation of the London 2012 Games, the “London 2012 meta-evaluation”, produced detailed logic models and ToCs for each of the four key themes above.^{[footnote 2]} These detailed ToCs provided the basis for developing the key evaluation questions and identifying indicators to measure outcomes.

The logic model for the economic legacy mission spans far beyond just tourism impacts, covering an array of outcomes such as UK-level growth and sustainability, inward investment, tourism impacts, sustained improvements in accessibility standards, improvements in workforce skills and employability, and carbon footprints/emissions. For the purposes of this case study, we have produced a simplified ToC that is used to guide the analysis of legacy effects. Readers are reminded that this case study is not a full evaluation and does not attempt to supersede the London 2012 Meta-Evaluation or its conclusions.

Figure 2.1 Theory of Change for London 2012 Games

2.3 Choosing Indicators

The indicators to test the legacy impact methodologies on were chosen based on three key considerations:

The outcomes were closely related to the outcomes in the original legacy plan
The outcomes offer clear and logical routes to being realised (i.e. the ones which are most likely to demonstrate impact)
Data exists to support the application of econometric methods

While this case study focuses on tourism employment as an indicator of the London 2012 Games’ impact, a comprehensive evaluation should consider a broader range of outcomes. As detailed in the DCMS meta-evaluation of the Games^{[footnote 3]}, these outcomes span multiple themes, including economic, social, sporting, and regeneration impacts.

Economically, the London 2012 Games’ effects extended beyond tourism employment to encompass factors like GDP/GVA changes, overall job creation (including in construction, retail, and leisure), inward investment, exports, and supplier contracts. Socially, the evaluation should consider the impact on volunteering and social action (including changes in participation rates and the capacity of the volunteering sector), cultural participation (across different demographics and art forms), engagement of children and young people (through educational and social inclusion programs), and changes in public attitudes towards disability. Within the sporting theme, a full evaluation would assess changes in participation rates (for adults and children, in competitive and non-competitive sport), the development and utilization of sporting infrastructure, elite sport performance, and the UK’s international sporting influence. Finally, the regeneration of East London should be assessed through factors like land use change, housing development, transport infrastructure improvements, community cohesion, and the overall liveability of the area. The DCMS meta-evaluation provides a framework for considering these diverse outcomes and the methodologies for measuring them.

2.4 Methodology

2.4.1 Overview of the evaluation

This case study explored the localised impacts of the London 2012 Games, using a propensity score matching (PSM) and difference-in-differences (DiD) design. Specifically, small geographical areas (formally Lower Super Output Areas, or LSOA) within 10km of where the London 2012 Games were held were compared to a matched comparison group to explore the following outcomes:

Increased jobs and revenue in the tourism industry
Improved wellbeing and life satisfaction of the local community
Increased levels of volunteering

This case study specifically experimented with different econometric methodologies that could be used to explore the local impacts of the London 2012 Games. It sought to draw methodological learnings that could be applied to evaluations of future events held in the UK. The London 2012 Games were used to contextualise the methodological findings. As stressed, this case study did not seek to understand the full impact of the London 2012 Games.

Table 2.1: Data sources for London 2012 Games Case Study

Data Source	Description of data source	Time frame	Indicators
Business Structure Database (BSD)	The BSD is an annual extract of the Inter-Departmental Business Register (IDBR), a database of business organisations used throughout Government.	1997 – 2018 available in the ONS SRS	Revenue in the tourism sector; Employment in the tourism sector; Share of tourism revenue; Share of tourism employment; Growth in tourism revenue and employment
Annual Survey of Hours and Earnings (ASHE)	ASHE provides information about the levels, distribution and make-up of earnings and hours paid for employees by sex, and full-time and part-time working.	1997 – 2021 (noting LSOA codes only available from 2005)	Median income
Understanding Society	A multi-topic household survey, the purpose of Understanding Society is to understand social and economic change in Britain at the household and individual levels.	1991 – 2023 available in the ONS SRS	Life satisfaction; Volunteering – yes/no in the last 12 months; Volunteering – hours spent volunteering

2.4.2. Defining a target area

There were two feasible approaches to defining the treatment areas: i) the areas which received funding to host the London 2012 Games; or ii) the areas around where the London 2012 Games were held, i.e. the host Boroughs and surrounding areas, as benefits are anticipated to spill over into neighbouring areas. This sentiment is noted in the London 2012 Meta-Evaluation, where many of the benefits, including economic benefits, were expected to accrue to the host boroughs, the surrounding areas and the United Kingdom as a whole.^{[footnote 4]}

Within both approaches it was considered reasonable to expect that the impacts of the London 2012 Games were concentrated around East London, where the Games and the associated regeneration took place. Based on this assumption, the areas around the London 2012 Olympic Park were considered the ‘treatment’ or ‘intervention’ areas. In the context of the analysis this meant:

Unit of analysis: The unit of analysis is the LSOA, a small geographical area defined by the ONS which contains between 400 and 1,200 households (or 1,000 to 3,000 people).

Treatment areas: As discussed above, two approaches were adopted to define the treatment areas:

Radius around where the London 2012 Games were held: One approach to defining the treatment areas was by taking the LSOAs within a given radius of the centroid of the LSOA of the Olympic Park as the treatment areas (treated LSOAs).

The analysis tested radii of 5km and 10km from the Olympic Park to understand the spatial variation of estimated impacts. This in principle aided understanding of how the London 2012 Games impacted local areas of East London, and also more broadly London as a city, i.e. the interpretation of the analysis changes as the treatment area changes.^{[footnote 5]}

The London Boroughs in which the London 2012 Games took place: The LSOAs within the London Boroughs of Barking and Dagenham, Greenwich, Hackney, Newham, Tower Hamlets and Waltham Forest were considered the treatment areas. These were the host Boroughs of the London 2012 Games.

Potential comparison areas: The absence of a spatially defined target area meant it was difficult to identify a point beyond which we would reasonably expect the economic impacts of the London 2012 Games to be less likely to reach. To somewhat mitigate this, we created an exclusion zone. Different radii were experimented with (from within-London comparisons to 40km). A larger exclusion zone carried the benefit of a decreased likelihood of benefits contaminating comparison areas. However, it also increased the risk that comparison areas did not resemble the treated areas, which is particularly important when considering the uniqueness of London. A larger exclusion zone may also pose a threat to the sample size; however, for this case study, at low levels of geography (i.e. LSOA), this was not anticipated to affect the analysis.

Given the uniqueness of London, we also experimented with restricting the potential comparison areas to the LSOAs in non-host boroughs. This ‘within-London’ comparison tested for differential impacts in terms of changes in outcomes between East London, and the rest of London.

An illustrative example of the selection of treatment and counterfactual areas is presented in Figure 2.2 below. Figure 2.2 adopts a 10km treatment radius, and a 20km exclusion radius. Treated LSOAs would lie in the green shaded ring and excluded LSOAs would lie in the red shaded ring. All other LSOAs beyond this point could potentially be utilised to construct a counterfactual. No other exclusion criteria were applied to potential comparison areas.

Figure 2.2: Mapping Treatment and Potential Comparison Areas, using a 10km radius and 20km exclusion zone
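The zoning logic illustrated in Figure 2.2 can be sketched in a few lines of Python. This is a minimal illustration only: the coordinates, area names and the use of a great-circle (haversine) distance are assumptions made for the example, not the implementation used in the case study.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def classify_area(distance_km, treatment_km=10.0, exclusion_km=20.0):
    """Assign an area to a zone based on its distance from the event site."""
    if distance_km <= treatment_km:
        return "treated"
    if distance_km <= exclusion_km:
        return "excluded"
    return "potential comparison"


# Hypothetical event site and area centroids (illustrative coordinates only).
site = (51.54, -0.01)
centroids = {
    "area_a": (51.55, -0.02),  # close to the site
    "area_b": (51.54, 0.20),   # roughly 15km east of the site
    "area_c": (51.54, 1.00),   # roughly 70km east of the site
}

zones = {
    name: classify_area(haversine_km(site[0], site[1], lat, lon))
    for name, (lat, lon) in centroids.items()
}
```

Changing `treatment_km` and `exclusion_km` reproduces the other radius combinations tested in this case study (for example 5km and 30km).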

2.4.3 Defining a counterfactual

Given the exploratory nature of this work, we adopted two possible approaches to identify LSOAs that could be used to construct a counterfactual:

Propensity Score Matching:

Propensity score matching algorithms (see toolkit for a summary of matching algorithms) were used to identify the LSOAs (outside the exclusion zone) which shared similar characteristics with those close to the event, prior to the London 2012 Games taking place (i.e. between 2005 and 2011). The matching variables included:

Table 2.2: Matching variables used for London 2012 Games

Indicator	Years	Reason for inclusion
Share of tourism employment (% of total jobs)^{[footnote 6]}	2005 - 2011	Identifies the importance of the tourism sector within the local areas
Share of manufacturing employment (% of total jobs)	2005 - 2011	A simple measure of the industrial structure of each area
Total employment	2005 - 2011	A proxy measure of the economic output in the area
Share of tourism revenue (% of total revenue)	2005 - 2011	Identifies the financial significance of tourism in the local area
Share of manufacturing revenue (% of total revenue)	2005 - 2011	Identifies the financial significance of the manufacturing industry in the local area
Total revenue	2005 - 2011	A proxy measure of the economic output in the area
Total number of firms	2005 – 2011	Provides an indication of economic activity in the area and potential employment opportunities.
Gross (median) weekly earnings	2005 - 2011	A measure of the underlying productivity of the local economy

Areas that were successfully matched were considered the counterfactual group of areas against which the treatment areas were compared.

This analysis uses a ‘nearest neighbour without replacement’ matching algorithm. This meant that each LSOA near the Olympic Park was paired with a similar LSOA further away (beyond 30km), ensuring a direct comparison without needing to adjust for other factors. This approach allowed us to use the existing longitudinal weights from the Understanding Society dataset. Other matching algorithms might be considered depending on the specific econometric analysis and outcomes being explored.
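As a rough illustration of 'nearest neighbour without replacement' matching, the sketch below pairs each treated area with the closest unused comparison area on an already-estimated propensity score. The scores and identifiers are hypothetical, and the greedy pairing order is an assumption of this example; statistical packages offer more refined variants (for example calipers or optimal matching).

```python
def match_without_replacement(treated_scores, comparison_scores):
    """Greedy 1:1 nearest-neighbour matching on propensity scores.

    Each comparison area can be used at most once. Returns a dict mapping
    each treated id to the id of its matched comparison area.
    """
    available = dict(comparison_scores)
    matches = {}
    for t_id, t_score in treated_scores.items():
        if not available:
            break  # more treated areas than comparison areas
        best = min(available, key=lambda c_id: abs(available[c_id] - t_score))
        matches[t_id] = best
        del available[best]  # without replacement: remove from the pool
    return matches


# Hypothetical propensity scores (probability of being a treated area).
treated = {"T1": 0.80, "T2": 0.60}
comparison = {"C1": 0.79, "C2": 0.62, "C3": 0.30}

pairs = match_without_replacement(treated, comparison)
```

Because every treated area ends up with exactly one comparison area, the matched sample can be analysed without additional matching weights, which is the property noted above.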

Synthetic Control

Synthetic controls are best utilised when there is a single treated unit. For the purposes of the synthetic control analysis, the London Borough of Newham (using 2011 boundaries), in which the London 2012 Games were largely held, was considered the ‘treated unit’. The analysis was therefore undertaken at a higher geographical level, the Local Authority (LA), but looked at more localised impacts of the London 2012 Games. Following the PSM-DiD methodology, LAs beyond 30km from the centroid where the Games took place were used within the donor pool of LAs. The same variables used in the matching model were used to estimate the synthetic control weights.
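The idea behind synthetic control weights can be sketched as follows: find non-negative donor weights that sum to one and make the weighted donor average track the treated unit as closely as possible before the event. The example below is a simplified sketch under stated assumptions: it matches on a single pre-event outcome series rather than the full set of matching variables, the data are hypothetical, and the solver (Frank-Wolfe with exact line search) is chosen here for brevity and is not the routine used in this case study.

```python
def synthetic_control_weights(donors, treated, iterations=500):
    """Non-negative donor weights summing to one that reduce the squared gap
    between the weighted donor average and the treated unit over the
    pre-event periods (Frank-Wolfe with exact line search).

    donors: list of per-donor lists of pre-event outcomes
    treated: list of pre-event outcomes for the treated unit
    """
    n = len(donors)
    periods = range(len(treated))
    weights = [1.0 / n] * n  # start from equal weights

    for _ in range(iterations):
        synthetic = [sum(w * d[t] for w, d in zip(weights, donors)) for t in periods]
        residual = [s - y for s, y in zip(synthetic, treated)]
        # Gradient (up to a constant factor) of the squared gap for each donor weight.
        gradient = [sum(d[t] * residual[t] for t in periods) for d in donors]
        j = min(range(n), key=gradient.__getitem__)
        # Direction that shifts weight towards donor j.
        direction = [donors[j][t] - synthetic[t] for t in periods]
        denom = sum(v * v for v in direction)
        if denom == 0:
            break
        gamma = -sum(r * v for r, v in zip(residual, direction)) / denom
        gamma = max(0.0, min(1.0, gamma))
        if gamma == 0.0:
            break  # no further improvement is possible in this direction
        weights = [(1 - gamma) * w for w in weights]
        weights[j] += gamma
    return weights


# Hypothetical pre-event outcomes for three donor LAs and one treated LA.
donor_pre = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0], [10.0, 10.0, 10.0]]
treated_pre = [2.5, 3.5, 4.5]

weights = synthetic_control_weights(donor_pre, treated_pre)
```

The post-event impact estimate is then the gap between the treated unit and the same weighted average of donors after the event.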

2.4.4 Model Specification

Once the counterfactual was identified, the treatment and comparison areas were used within a difference-in-difference (DiD) framework. This sought to make comparisons between treatment and control areas, before and after the London 2012 Games.

The DiD design was estimated using fixed effect regressions.^{[footnote 7]} We used both a binary treatment variable and an event study. The binary treatment specification estimated an aggregate measure of impact after the London 2012 Games. The event study interacts the treatment variable and a binary year variable to estimate year specific effects. The benefit of the event study is that it shows how the magnitude and significance of the impacts change over time, and accounts for temporal variability in the impacts.
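The difference between the two specifications can be illustrated with group-level averages. The sketch below uses hypothetical figures and computes point estimates only. With one averaged series per group, a balanced panel and a single event date, these differences in means correspond to the coefficients that the equivalent fixed effects regressions would return, but the sketch does not produce standard errors.

```python
from statistics import mean

# Hypothetical average outcome (e.g. a growth rate) by group and year.
panel = {
    "treated": {2010: 0.02, 2011: 0.03, 2012: 0.16, 2013: 0.08},
    "comparison": {2010: 0.01, 2011: 0.02, 2012: 0.07, 2013: 0.05},
}
EVENT_YEAR = 2012
REFERENCE_YEAR = 2011  # final year before the event


def did_binary(panel, event_year):
    """Single aggregate effect: post-minus-pre change in treated areas
    less the post-minus-pre change in comparison areas."""
    def change(series):
        pre = mean(v for y, v in series.items() if y < event_year)
        post = mean(v for y, v in series.items() if y >= event_year)
        return post - pre
    return change(panel["treated"]) - change(panel["comparison"])


def did_event_study(panel, reference_year):
    """Year-specific effects: the treated-comparison gap in each year,
    relative to the gap in the reference year."""
    gap = {y: panel["treated"][y] - panel["comparison"][y] for y in panel["treated"]}
    return {y: gap[y] - gap[reference_year] for y in gap if y != reference_year}


aggregate_effect = did_binary(panel, EVENT_YEAR)
yearly_effects = did_event_study(panel, REFERENCE_YEAR)
```

In this example the pre-event entry of `yearly_effects` is close to zero, which is the pattern one would look for when assessing parallel trends, while the post-event entries show how the estimated effect varies by year.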

Validity of the Design: There are diagnostic practices that were used to assess the quality of the match and validity of the DiD (i.e. the parallel trends assumption). These checks should be undertaken for each means of constructing a counterfactual (i.e. PSM or synthetic control), prior to estimating the DiD:

Assessing the quality of the match: The quality of the match generated through the matching algorithm was assessed using standardised mean differences (SMDs). SMDs are a measure of the size of the difference between two groups, and are calculated by dividing the difference in means of the two groups by the pooled standard deviation of the groups. Following best practice, an SMD greater than 0.1 denotes meaningful imbalance in the baseline covariates.^{[footnote 8]} Balance tests undertaken for this case study indicate that a good quality match was achieved, suggesting that treatment and comparison areas share similar levels of pre-intervention matching variables (see Figure 2.13).
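A minimal sketch of the SMD calculation described above is given below. The covariate values are hypothetical, and the pooled standard deviation is taken as the square root of the average of the two sample variances, which is one common convention and an assumption of this example.

```python
import math
from statistics import mean, variance


def standardised_mean_difference(treated, comparison):
    """Difference in group means divided by the pooled standard deviation."""
    pooled_sd = math.sqrt((variance(treated) + variance(comparison)) / 2)
    return (mean(treated) - mean(comparison)) / pooled_sd


def is_balanced(treated, comparison, threshold=0.1):
    """Flag a covariate as balanced when the absolute SMD does not exceed the threshold."""
    return abs(standardised_mean_difference(treated, comparison)) <= threshold


# Hypothetical pre-event share of tourism employment in matched areas.
treated_share = [0.10, 0.12, 0.14, 0.16]
matched_share = [0.10, 0.12, 0.14, 0.17]
unmatched_share = [0.20, 0.22, 0.24, 0.26]

balanced_after_matching = is_balanced(treated_share, matched_share)
balanced_before_matching = is_balanced(treated_share, unmatched_share)
```

In practice the check would be repeated for every matching variable in Table 2.2, before and after matching.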

Assessing parallel trends: The event study plots (Figures 2.4, 2.7, and 2.9) demonstrate the extent to which the treatment and comparison areas differed prior to the London 2012 Games. Across all the event study regressions, the vast majority of pre-intervention periods are statistically insignificant, indicating that prior to the London 2012 Games, the outcome indicators in treatment and comparison areas shared similar trends. This cannot directly confirm that in the absence of the London 2012 Games the treatment areas would continue to trend in the same way as the matched control areas.^{[footnote 9]} However, it does provide confidence that the treatment areas would have likely trended in this way in the absence of the London 2012 Games.

2.5 Findings

It should be noted that the below results are used to contextualise experimental methodologies, as opposed to representing a full evaluation of the London 2012 Games. Indicators (and therefore outcomes) were chosen to broadly align with the legacy plan, and also to ensure that the data was available to test the methodologies. Broadly speaking, the same (or similar) methods can be applied to other indicators and data sources.

2.5.1 Overview of findings

The table below presents an overview of the results of the econometric analysis undertaken to explore the legacy impacts of the London 2012 Games.

Table 2.3 Overview of findings for London 2012 Games

Outcomes	SMS level achieved	Significant results?	Positive improvement in outcome
Employment in the tourism sector	SMS Level 3 - using DiD (fixed effects) and synthetic control	Yes	Impact found a year after event and weak evidence of sustained impact beyond this
Turnover from the tourism sector	SMS Level 3 - using DiD (fixed effects) and synthetic control	Yes	Impact found a year after event and evidence this was sustained
Wellbeing	SMS Level 3 - using DiD (fixed effects)	No	No impact found however gap between event and data collection could miss some impact
Volunteering	SMS Level 1 – data not available pre-Games, and data only reported every other year	No	No impact found however issues with data frequency are likely to impact findings
Medals won at London 2012 Games	SMS Level 1 - Trend analysis	N/A	Sustained improvement in medals

The below sections present the result of the econometric analysis. Unless otherwise specified, figures present a 10km treatment area and a 30km exclusion area. This distance is used to visually illustrate the methodology and contextualise the finding. This can be updated to show the desired distance (or alternative counterfactual design) that is most relevant to the research questions.

2.5.2 Tourism (Employment)

Employment in tourism sectors was measured as the annual growth rate in the number of employees. The below figure 2.3 presents trends in the annual growth rate in the number of employees between 2006 and 2018.

Figure 2.3: Trends in the annual growth rate in the number of employees in the tourism sector, in treatment areas (LSOAs within 10km of the London 2012 Games) and control areas (matched LSOAs beyond 30km from the London 2012 Games).

Descriptive pre-London 2012 Games trends: The global financial crisis in 2007 resulted in sharp decreases in employment in the tourism sector, with the number of employees decreasing by approximately 25% and 32% in treatment and comparison areas respectively. The sector’s recovery began from 2008. The figure shows that treatment and control areas were similarly affected by the 2007 financial crisis, with similar recovery trends up until the start of the London 2012 Games.

Descriptive post-London 2012 Games trends: During the year of the London 2012 Games, there was an increase in employment growth in the tourism sector across the nation; however, this appeared to be more pronounced in areas within 10km of the London 2012 Games. In treatment areas, employment in the tourism sector grew by 16%, compared to 7% in comparison areas. For the remainder of the data points, employment growth in the tourism sector is higher within treated areas.

Impact: To explore the causal impact of the London 2012 Games on the growth in the number of employees several methods were tested:

Fixed Effects (event study): Figure 2.4 presents an event study chart, showing the impact of the London 2012 Games for each year, relative to the final year before the Games (i.e. 2011). A statistically significant increase was seen in the year of the London 2012 Games, suggesting that employment grew by 10 percentage points more in treatment areas compared to control areas (statistically significant at the 95% confidence level).

There is weak evidence to suggest that these effects were persistent, with statistically significant effects identified at the 90% confidence level in 2014, 2016 and 2018.

Figure 2.4: Event study chart showing impact of the London 2012 Games on growth in the number of employees in the tourism sector (using a 10km treatment radius and 30km exclusion radius)

Sensitivity to changes in the radii: The above analysis was repeated, varying the radii for both treatment and control areas (see Table 2.4 below). The results indicate that as the treatment area radius was reduced (i.e. from 10km to 5km or the LSOAs in the host boroughs) the significance of the impact dissipated both in the year of the Games and in the years following, irrespective of the definition of the control area. This broadly suggested that the tourism effects of the London 2012 Games were further reaching than just East London, and in fact may have benefitted London as a whole more than the East of London.

This is further evident when comparing East London to the rest of London, where we found that all post-London 2012 Games impacts were statistically insignificant.

Table 2.4: Dynamic impact of the London 2012 Games on growth in the number of employees in the tourism sector, varying the radii for treatment and control areas. Statistically significant effects have been noted with asterisks.

	10km treatment 30km exclusion	10km treatment 20km exclusion	10km treatment 40km exclusion	5km treatment 30km exclusion	5km treatment 20km exclusion	5km treatment 10km exclusion	Host boroughs vs matched non-host boroughs
2006	0.0211 (0.0372)	0.0552 (0.0365)	0.0567 (0.0384)	0.0049 (0.0698)	-0.0405 (0.0625)	-0.0070 (0.0703)	0.0272 (0.0621)
2007	0.0803* (0.0436)	0.0318 (0.0406)	0.0575 (0.0429)	-0.0364 (0.0810)	-0.0390 (0.0735)	-0.0843 (0.0820)	-0.0004 (0.0744)
2008	-0.0139 (0.0433)	0.0548 (0.0419)	0.0478 (0.0428)	-0.0562 (0.0810)	-0.0503 (0.0782)	-0.0644 (0.0791)	-0.0012 (0.0737)
2009	0.0113 (0.0404)	0.0456 (0.0385)	0.0321 (0.0421)	0.0197 (0.0754)	-0.0212 (0.0676)	-0.1036 (0.0717)	-0.0279 (0.0689)
2010	-0.0016 (0.0398)	0.0700* (0.0387)	0.0438 (0.0399)	-0.0685 (0.0752)	-0.0307 (0.0736)	-0.0424 (0.0791)	0.0090 (0.0715)
2012	0.0986** (0.0414)	0.1506*** (0.0403)	0.1587*** (0.0414)	0.0767 (0.0822)	0.0587 (0.0743)	0.0467 (0.0749)	0.0424 (0.0689)
2013	0.0204 (0.0366)	0.0726** (0.0355)	0.0375 (0.0355)	0.0264 (0.0639)	0.0312 (0.0636)	-0.0210 (0.0641)	0.0481 (0.0686)
2014	0.0582* (0.0350)	0.0546 (0.0342)	0.0631* (0.0347)	0.0060 (0.0714)	0.0097 (0.0587)	-0.1053* (0.0612)	-0.0363 (0.0580)
2015	0.0264 (0.0357)	0.0762** (0.0345)	0.0667* (0.0361)	0.0118 (0.0614)	-0.0218 (0.0593)	-0.0425 (0.0630)	0.0891 (0.0627)
2016	0.0710* (0.0379)	0.0615* (0.0357)	0.0853** (0.0377)	0.0584 (0.0747)	0.0544 (0.0692)	-0.0088 (0.0688)	0.0480 (0.0565)
2017	0.0309 (0.0347)	0.0590* (0.0337)	0.0529 (0.0348)	-0.1126 (0.0705)	-0.0497 (0.0609)	-0.1288** (0.0639)	0.0446 (0.0592)
2018	0.0673* (0.0345)	0.1044*** (0.0337)	0.0922*** (0.0355)	0.0647 (0.0645)	0.0242 (0.0595)	0.0213 (0.0643)	0.0285 (0.0576)

Note: *** denotes statistical significance at the 99% confidence level; ** denotes statistical significance at the 95% confidence level; * denotes statistical significance at the 90% confidence level.

Fixed Effects (single binary variable): We also tested for the presence of statistically significant effects using a single binary variable, which encapsulated all post-treatment periods into a single coefficient. In this instance, the coefficient was statistically significant at the 95% confidence level and suggested that growth in tourism employment was on average 4 percentage points higher in treatment areas compared to comparison areas after the London 2012 Games (using the 10km treatment radius and 30km control radius).

Sensitivity to changes in the radii: As above, the radii were changed to understand how the impacts vary across different specifications of the treatment and control areas. The results are shown in Table 2.5 below. In this instance, impacts were statistically significant until a 5km treatment radius and 10km exclusion radius was used. Statistically significant effects were also not observed for the within-London comparison. This implied that London as a whole benefited from the London 2012 Games (compared to the rest of the country), as opposed to specifically the East of London (i.e. highlighting the spatial spillover of benefits).

Table 2.5: Impact of the London 2012 Games on growth in the number of employees in the tourism sector, varying the radii for treatment and control areas. Statistically significant effects shaded blue

	Coefficient	Standard Error	95% Confidence interval
10km treatment 30km exclusion	0.0374***	0.0113	[0.015 - 0.06]
10km treatment 20km exclusion	0.0398***	0.0113	[0.018 - 0.062]
10km treatment 40km exclusion	0.0401***	0.0117	[0.017 - 0.063]
5km treatment 30km exclusion	0.0409**	0.0223	[-0.003 - 0.085]
5km treatment 20km exclusion	0.0452**	0.0212	[0.004 - 0.087]
5km treatment 10km exclusion	0.0158	0.0209	[-0.025 - 0.057]
Host boroughs vs matched non-host boroughs	0.0081	0.0167	[-0.025 - 0.041]

Note: *** denotes statistical significance at the 99% confidence level; ** denotes statistical significance at the 95% confidence level; * denotes statistical significance at the 90% confidence level.
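For readers who want to check how a coefficient and its standard error map to a confidence interval and significance stars, the sketch below uses a normal approximation. This is an assumption of the example: the case study's own software may use a different reference distribution or clustered standard errors. The function name is illustrative. Applied to the first row of Table 2.5 (coefficient 0.0374, standard error 0.0113), it reproduces the reported interval to three decimal places.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()


def summarise_estimate(coef, se):
    """Two-sided p-value, 95% confidence interval and significance stars
    under a normal approximation."""
    z = coef / se
    p_value = 2 * (1 - STANDARD_NORMAL.cdf(abs(z)))
    half_width = STANDARD_NORMAL.inv_cdf(0.975) * se
    if p_value < 0.01:
        stars = "***"
    elif p_value < 0.05:
        stars = "**"
    elif p_value < 0.10:
        stars = "*"
    else:
        stars = ""
    return {
        "ci_95": (coef - half_width, coef + half_width),
        "p_value": p_value,
        "stars": stars,
    }


# First row of Table 2.5: 10km treatment radius, 30km exclusion radius.
row = summarise_estimate(0.0374, 0.0113)
lower, upper = (round(bound, 3) for bound in row["ci_95"])
```

An interval that excludes zero corresponds to significance at the 95% confidence level or higher, so the stars and the interval in each table row can be cross-checked against each other.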

Synthetic control: A synthetic control was used as an alternative econometric approach to estimate the impact of the London 2012 Games. The synthetic control is best suited when there is a single treated unit. As such, the level of the analysis was changed from LSOA, to Local Authority (LA). This analysis compared the outcomes of the LA in which the London 2012 Games were held (the London Borough of Newham) to a ‘synthetic Newham’, informed by LAs whose centroids were more than 30km away from where the London 2012 Games were held.

A fixed effects regression was also used (comparing the LSOAs that make up Newham and a matched sample) to compare the results of both the different geographic level of the analysis and the different econometric approaches.

Figure 2.5: A plot of the synthetic control (a) and synthetic difference-in-differences (b)


Table 2.6: Estimated Impacts of the London 2012 Games on growth in the number of employees in the tourism sector, using different methods of estimating impacts. Statistically significant effects shaded blue

Method	Coefficient	Standard Error	95% Confidence interval
Synthetic Control	0.076***	0.029	[0.020 - 0.132]
Synthetic Difference-in-Differences	0.011	0.032	[-0.052 - 0.075]
Fixed effects (binary treatment variable)	0.022	0.038	[-0.053 - 0.097]

Note: *** denotes statistical significance at the 99% confidence level; ** denotes statistical significance at the 95% confidence level; * denotes statistical significance at the 90% confidence level.

Figure 2.5 shows that the choice of econometric method can heavily influence the conclusions drawn. The fixed effects and synthetic DiD yielded similar estimates in terms of magnitude, sign and significance – failing to find statistically significant effects in terms of employment growth in the tourism sector. The synthetic control, on the other hand, estimated statistically significant effects which were higher in magnitude. There are a couple of reasons which could explain this difference:

Estimated counterfactual: Visually, there are obvious differences in how the SC and SDiD have estimated the counterfactual (dotted blue line in Figure 2.5). Significantly less variability was seen in the SDiD counterfactual compared to the SC counterfactual, which almost matches the treated area trends (albeit at lower levels).

Underlying sample size of counterfactual: The SDiD typically uses more control units compared to a SC. There is also a risk that the SC captured more noise as it matched the outcome level of the treatment unit, as opposed to the trend of the treated unit.

The results were broadly consistent, both in terms of coefficient magnitude and significance, with the fixed effects regression that uses a 5km treatment and 10km exclusion zone. Combined with the above evidence, it suggests that the effects of the London 2012 Games were not localised to East London, and benefits were felt in the broader London area.

2.5.2 Tourism (Turnover)

Turnover in the tourism sectors was measured as the growth rate of firm turnover. The below figure 2.6 presents trends in the annual growth rate in the turnover of firms in the tourism sector between 2006 and 2018.

Figure 2.6: Trends in the annual growth rate in turnover in the tourism sector, in treatment areas (LSOAs within 10km of the London 2012 Games) and control areas (30km beyond where the London 2012 Games were held).

Descriptive pre-London 2012 Games trends: The global financial crisis in 2007 resulted in a sharp decrease in turnover growth in the tourism sector. Turnover growth decreased by 32% and 33% in the treatment and control areas respectively. This suggests that treatment and control areas were similarly affected by the 2007 financial crisis, with very similar recovery trends up until the start of the London 2012 Games.

Descriptive post-London 2012 Games trends: During the year of the London 2012 Games, there was a notable increase in tourism turnover growth in treatment areas (of approximately 13%); however, this was not observed in control areas (with a 1% decrease). For the remainder of the data points, turnover growth in the tourism sector is higher within treated areas. A notable increase in turnover growth among treatment areas is observed in 2015, where there was a 17% increase in turnover growth in the tourism sector, which was greater in magnitude than the year of the London 2012 Games. A corresponding increase was not observed in control areas.

Impact: To explore the causal impact of the London 2012 Games on the levels of turnover growth in the tourism sector several methods were tested:

Fixed Effects (event study): Figure 2.7 presents an event study chart, showing the impact of the London 2012 Games for each year, relative to the final year before the Games (i.e. 2011). Large increases (statistically significant at the 99% confidence level) were seen in the year of the London 2012 Games and in 2015. In both instances, it was estimated that turnover grew by 17 percentage points more in treatment areas compared to control areas.

In all post-intervention periods, statistically significant effects (at the 90% confidence level or higher) were observed, providing evidence that the impact of the London 2012 Games persisted.

Sensitivity to changes in the radii: As the treatment radius was reduced (i.e. impacts are more localised around the Olympic Park), the statistical significance of the effects began to dissipate. A likely explanation for this is that turnover data is collected at the enterprise level – i.e. a head office, as opposed to revenue for each branch of a store. This may mean that head offices of companies that are captured in the 10km radius are not captured at the 5km radius – and so it appears that there is no change.

The results largely seemed robust to changes in the exclusion radius (which influences the choice of control LSOAs); however, there were some inconsistencies in the estimates. When looking at the 10km treatment radius, most post-London 2012 Games periods remained statistically significant as the exclusion radius increased. In most instances, the magnitude of the impact was highest when the 30km buffer was selected and dropped when the 20km or 40km buffer was selected.

Figure 2.7: Event study chart showing impact of the London 2012 Games on growth in turnover for firms in the tourism sector (using a 10km treatment radius and 30km exclusion radius)

Table 2.7: Dynamic impact of the London 2012 Games on turnover growth for firms in the tourism sector, varying the radii for treatment and control areas. Statistically significant effects have been noted with asterisks.

	10km treatment 30km exclusion	10km treatment 20km exclusion	10km treatment 40km exclusion	5km treatment 30km exclusion	5km treatment 20km exclusion	5km treatment 10km exclusion	Host boroughs vs matched non-host boroughs
2006	-0.0165 (0.0405)	-0.0068 (0.0391)	-0.0153 (0.0413)	0.0233 (0.0772)	-0.0912 (0.0733)	-0.0543 (0.0797)	-0.0166 (0.0525)
2007	0.0400 (0.0496)	-0.0275 (0.0512)	-0.0049 (0.0507)	0.0281 (0.0993)	-0.0520 (0.0998)	-0.0931 (0.1098)	-0.0111 (0.0606)
2008	0.0669 (0.0488)	0.0667 (0.0465)	0.0701 (0.0480)	0.0112 (0.0939)	-0.1169 (0.0968)	-0.1242 (0.1056)	-0.0198 (0.0627)
2009	0.0604 (0.0466)	0.0070 (0.0449)	0.0318 (0.0473)	-0.0380 (0.0866)	-0.0942 (0.0833)	-0.1627* (0.0917)	-0.0036 (0.0574)
2010	0.0362 (0.0466)	0.0515 (0.0445)	0.0184 (0.0474)	-0.0132 (0.0902)	-0.0865 (0.0887)	-0.1434 (0.0933)	-0.0199 (0.0570)
2012	0.1710*** (0.0453)	0.1087** (0.0434)	0.1409*** (0.0459)	0.1763** (0.0853)	0.0710 (0.0861)	0.0151 (0.0910)	0.0273 (0.0602)
2013	0.0872** (0.0426)	0.0425 (0.0425)	0.0234 (0.0436)	0.0819 (0.0795)	0.0703 (0.0800)	-0.0863 (0.0819)	-0.0268 (0.0491)
2014	0.0667* (0.0391)	0.0480 (0.0392)	0.0081 (0.0397)	0.0644 (0.0769)	-0.0990 (0.0713)	-0.1116 (0.0799)	-0.0335 (0.0494)
2015	0.1688*** (0.0420)	0.1175*** (0.0399)	0.1154*** (0.0422)	0.1306 (0.0831)	-0.0276 (0.0731)	0.0209 (0.0828)	-0.0089 (0.0489)
2016	0.0795** (0.0401)	0.0632 (0.0387)	0.0687* (0.0415)	0.0588 (0.0770)	0.1112 (0.0728)	-0.0792 (0.0757)	0.0448 (0.0522)
2017	0.0694* (0.0383)	0.0761** (0.0377)	0.0715* (0.0394)	-0.0127 (0.0739)	-0.0810 (0.0701)	-0.0878 (0.0781)	-0.0635 (0.0501)
2018	0.0806** (0.0390)	0.0530 (0.0376)	0.0323 (0.0400)	0.0806 (0.0758)	-0.0748 (0.0703)	-0.0701 (0.0786)	0.0086 (0.0487)

Note: *** denotes statistical significance at the 99% confidence level; ** denotes statistical significance at the 95% confidence level; * denotes statistical significance at the 90% confidence level.

Fixed Effects (single binary variable): We also tested for the presence of statistically significant effects using a single binary variable, which encapsulated all post-treatment periods into a single aggregate coefficient. When using the 10km treatment radius and 30km exclusion radius, the coefficient was statistically significant at the 99% confidence level and suggested that growth in turnover of firms in the tourism sector was on average 7 percentage points higher in treatment areas compared to comparison areas after the London 2012 Games.

Sensitivity to changes in the radii: As with employment in the tourism sector, when the treatment radius was set to 5km and the exclusion zone set to 10km, effects were not statistically significant. However, within-London comparisons yielded statistically significant effects at the 90% confidence level, with a smaller magnitude of impact.

Table 2.8: Impact of the London 2012 Games on turnover growth for firms in the tourism sector, varying the radii for treatment and control areas. Statistically significant effects shaded blue

	Coefficient	Standard Error	95% Confidence interval
10km treatment 30km exclusion	0.0725***	0.0134	[0.046 - 0.099]
10km treatment 20km exclusion	0.0576***	0.0131	[0.032 - 0.083]
10km treatment 40km exclusion	0.0494***	0.0134	[0.023 - 0.076]
5km treatment 30km exclusion	0.0811***	0.0262	[0.03 - 0.132]
5km treatment 20km exclusion	0.0686***	0.0251	[0.019 - 0.118]
5km treatment 10km exclusion	0.0391	0.027	[-0.014 - 0.092]
Host boroughs vs matched non-host boroughs	0.0401*	0.021	[-0.001 - 0.081]

Note: *** denotes statistical significance at the 99% confidence level; ** denotes statistical significance at the 95% confidence level; * denotes statistical significance at the 90% confidence level.

Synthetic Control: As above, a synthetic control was used to explore the impact of the London 2012 Games within the London Boroughs in which the Games were held.

Table 2.9: Estimated Impacts of the London 2012 Games on growth in turnover in the tourism sector, using different methods of estimating impacts

Method	Coefficient	Standard Error	Lower bound 95% Confidence interval	Upper bound 95% Confidence interval
Synthetic Control	0.082	0.070	-0.056	0.220
Synthetic Difference-in-Differences	0.071	0.110	-0.019	0.161
Fixed effects (binary treatment variable)	0.005	0.046	-0.211	0.222

Note: *** denotes statistical significance at the 99% confidence level; ** denotes statistical significance at the 95% confidence level; * denotes statistical significance at the 90% confidence level.

Across all three model specifications, no statistically significant effects were identified, suggesting that there were no statistically significant differences in turnover growth in the tourism sector between treatment and control areas. Interestingly, the magnitudes of the coefficients differed between the synthetic control methods and the fixed effects regression, with the synthetic control and synthetic DiD estimates being over ten times higher. The synthetic control and synthetic DiD this time shared closer estimates.

Compared with the within-London comparison in Table 2.8 (host boroughs vs matched non-host boroughs), the estimated impacts from the synthetic control and synthetic DiD were about twice as high, albeit within the same order of magnitude.

2.5.3 Wellbeing

Understanding Society and the British Household Panel survey were used to explore changes in wellbeing between treatment and control areas. Respondents were asked to provide a rating of their current level of life satisfaction, on a 1-7 scale. For the wellbeing analysis, two different counterfactual options were explored:

Spatial counterfactuals: As in the above analysis, individuals in LSOAs within 10km of the London 2012 Games were compared to individuals from matched LSOAs that are beyond 30km of the Games. These distances were selected to illustrate the methods below. As in the above analysis, the specific distances for the treatment and exclusion radii can be altered to best align with the overarching research questions. There may, however, be sample size implications for more tightly defined treatment radii.

Engagement mode counterfactual: Changes in life satisfaction for those who actively participated in the London 2012 Games (e.g. going to watch an event live, employment related to the Games, volunteering during the Games) were compared to those who passively participated in the Games (e.g. watching on TV, listening on the radio, etc.). Comparisons against those who did not participate at all were not considered to be meaningful as there were likely to be significant differences (observable and unobservable) between those who chose to participate (either passively or actively) and those who chose not to participate (i.e. biases may arise due to self-selection).
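The spatial counterfactual described above can be sketched as a simple distance rule. The example below is illustrative only: the venue coordinates are a rough, assumed location, the LSOA names and centroids are hypothetical, and treating the boundaries as inclusive (10km or less is treated) is an assumption rather than something the case study specifies.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two latitude/longitude points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def classify(distance_km, treatment_km=10.0, exclusion_km=30.0):
    """Assign an area to the treatment group, exclusion zone or control pool."""
    if distance_km <= treatment_km:
        return "treatment"
    if distance_km <= exclusion_km:
        return "excluded"
    return "control pool"


VENUE = (51.54, -0.01)  # rough, assumed venue location (not from the report)

# Hypothetical LSOA centroids, for illustration only.
centroids = {
    "LSOA_A": (51.55, -0.02),
    "LSOA_B": (51.74, -0.01),
    "LSOA_C": (52.54, -0.01),
}

labels = {
    name: classify(haversine_km(*VENUE, *xy)) for name, xy in centroids.items()
}
print(labels)
```

Areas in the control pool are only candidates: in the case study they are then matched to treatment areas before any comparison is made. Changing `treatment_km` and `exclusion_km` corresponds to the sensitivity checks on radii reported in Table 2.8.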

Descriptive pre-Games trends: Across the pre-London 2012 Games waves, average levels of self-reported life satisfaction were typically lower in areas within 10km of the Games compared to matched comparison areas. In treatment areas, sharp increases in life satisfaction were recorded in waves 12, 13 and 16 (corresponding to 2002, 2003 and 2006 respectively). Increases of the same magnitude were not seen in the matched comparison areas. After the increase in wave 16 (2006), the average level of life satisfaction in treatment areas fell below that of the control areas until the start of the London 2012 Games.

Descriptive post-Games trends: In the year of the London 2012 Games a small increase in life satisfaction was seen in treatment areas with no corresponding increase in matched comparison areas. However, in the year after the London 2012 Games both treatment and matched comparison areas exhibited a decline in life satisfaction. Life satisfaction in both areas increased from wave 23, beginning to converge towards the end of the panel in wave 28.

Figure 2.8: Trends in self-reported life satisfaction on a 1-7 scale, in treatment and control areas, over time. Note: Wave 11 of the British Household Panel Survey did not contain a life satisfaction question.

Causal Impact: A fixed effects regression (including individual and time effects) was used to estimate the impact of the London 2012 Games on life satisfaction. The event study plot below presents year-by-year differences in individuals’ life satisfaction between treatment and matched comparison areas. No statistically significant impacts were identified in any of the post-London 2012 Games periods.

Under this specification, the parallel trends assumption was less convincing. In several pre-treatment periods there were statistically significant differences between treatment and control areas. This reduced the confidence with which we could claim that the treatment group would have trended in the same way as the control group in the absence of the London 2012 Games.

A second regression specification was run to control for factors known to drive wellbeing (e.g. household income, marital status, number of children, gender). Using this regression specification, no statistically significant post-London 2012 Games effects were identified.

Figure 2.9: Event study chart showing impact of the London 2012 Games on life satisfaction. Note: Wave 11 of the British Household Panel Survey did not contain a life satisfaction question. The event study chart does not include controls for other drivers of life satisfaction.

[Figure:Fig_2.9.svg]

Fixed effects regressions were also run using a single post-Games variable. These regressions were run both with and without controls. In both instances no statistically significant effects were identified.

The above analysis therefore suggests that the London 2012 Games did not lead to sustained increases in wellbeing.

Alternative counterfactual approaches: An alternative approach to identifying a counterfactual was tested – comparing changes in life satisfaction between those who actively participated in the London 2012 Games (e.g. going to watch an event live, employment related to the Games, volunteering during the Games) and those who passively participated in the London 2012 Games (e.g. watching on TV, listening on the radio, etc.). Fixed effects regressions were used to test for differences in self-reported life satisfaction between the two groups. In all instances effects were statistically insignificant suggesting that the mode of engagement did not lead to differential wellbeing impacts.

Departure from previous studies: The above results deviate from previous research. Dolan et al. (2016) used a DiD design, comparing life satisfaction (as well as other measures of subjective wellbeing) among those living in London to life satisfaction among those living in Paris and Berlin.^{[footnote 10]} Dolan et al. (2016) found that in the year of the London 2012 Games life satisfaction was 0.7 points higher (on a 0 – 10 scale) in London compared to Paris and Berlin, statistically significant at the 95% confidence level. Dolan et al. (2016) did not find statistically significant effects beyond the year of the London 2012 Games. There are two potential reasons why these results may deviate from prior research:

Spillover effects: the counterfactual differs between Dolan et al. (2016) and this case study. Dolan et al. (2016) utilise international comparisons, whereas this case study uses spatially defined areas within the UK. It may be that there were UK-wide impacts on life satisfaction, which limit the extent to which statistically significant impacts can be detected.
Timing of data collection: Dolan et al. (2016) target their data collection in 2012 around the timing of the London 2012 Games. Understanding Society (used to inform the wellbeing analysis in this case study) did not target their data collection specifically around the London 2012 Games. It may be that momentary uplifts in wellbeing had diminished by the time individuals were surveyed in Understanding Society, limiting the extent to which these momentary increases in wellbeing were captured in the data.

2.5.4 Volunteering

Understanding Society was also used to explore changes in volunteering behaviour between treatment and comparison areas. Individuals are classified as treated if they live within 10km of the London 2012 Games, and as comparison individuals if they live in a matched LSOA beyond 30km from the London 2012 Games (identified through the analysis of employment and turnover). However, this analysis was not able to compare pre-London 2012 Games trends, as volunteering questions were only introduced in the year of the London 2012 Games (and then asked every other wave from that point). The implication is that any observed change in behaviour cannot be attributed to the London 2012 Games, because we are unable to distinguish between changes that are due to the Games and outcomes that were already trending in the same way prior to the Games.

Figure 2.10: Trends in volunteering behaviour over time, in treatment and control areas, over the last 12 months (a) and 4 weeks (b)

Descriptive trends in the proportion of volunteers: The proportion of respondents who volunteered stayed broadly at the same level in the control areas over time. In treatment areas, there was a step-change in wave 24, from 17% to 24%, with no corresponding increase observed in control areas. The proportion of respondents who volunteered then gradually began to fall in the remaining periods.

Descriptive trends in volunteering hours: In all waves, respondents in control areas on average spent more time volunteering, although it should be noted that the trends were broadly similar across both areas until the final wave, where a divergence in trend is observed. A spike in volunteering hours is seen in wave 24 (between January 2014 and May 2016). This affected both treatment and control areas similarly, and volunteering returned to ‘normal’ levels in the following wave. Given that this spike occurs several years after the London 2012 Games, it is very difficult to identify any association.

Findings from regression analysis: The regression analysis did not identify statistically significant differences in volunteering behaviour between treatment and control areas. The analysis suggested that there were not statistically significant differences between the proportion of people that volunteer, and that there were not statistically significant differences in the number of volunteering hours.

Table 2.10: Differences in outcomes between treatment and control areas

	Coefficient	Standard Error	95% Confidence interval
Differences in the proportion of respondents who have volunteered at least once in the last twelve months	-0.039	0.086	[-0.207 – 0.129]
Differences in the hours of volunteering in the past four weeks	0.541	0.376	[-0.196 – 1.279]

Note: *** denotes statistical significance at the 99% confidence level; ** denotes statistical significance at the 95% confidence level; * denotes statistical significance at the 90% confidence level.

2.5.5 Olympic medals

In addition to the difference-in-difference analysis above we also explored whether improved elite sporting success was achieved during and following the London 2012 Games by analysing the trend in Olympic medals won by the Great Britain and Northern Ireland team.

Figure 2.11 shows an analysis of the last ten hosts of the Summer Olympic Games and their performance, measured in ‘market share’, in the three editions before (t-3) and after (t+3) hosting (t-0). Market share is calculated as the sum of the medal points won by a nation in any given edition (where a gold medal equals 3 points, silver equals 2, and bronze equals 1), expressed as a proportion of the overall medal points awarded in that edition (using the same 3-2-1 scoring system). For example, Team GB won 115 medal points at Paris 2024: 14 gold (42 points), 22 silver (44 points) and 29 bronze (29 points). The medals awarded in that edition were collectively worth 2,032 points. Therefore, Team GB’s 2024 market share was 5.7% (i.e. 115 divided by 2,032). If the market share scores are indexed, whereby t-0 equals 100 and all other scores are expressed relative to t-0, the unique performance of Team GB (represented by the broken purple line) can be seen from a different perspective.
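The market share and index calculations described above can be reproduced with a few lines of code. The sketch below uses the Paris 2024 figures quoted in the text and the Team GB market shares reported in this section; the function names are illustrative rather than taken from the analysis itself.

```python
# 3-2-1 scoring system described in the text.
POINTS = {"gold": 3, "silver": 2, "bronze": 1}


def medal_points(gold, silver, bronze):
    """Total medal points for a nation in one edition."""
    return (
        POINTS["gold"] * gold
        + POINTS["silver"] * silver
        + POINTS["bronze"] * bronze
    )


def market_share(nation_points, total_points):
    """Nation's points as a percentage of all points awarded in the edition."""
    return 100 * nation_points / total_points


def index_scores(shares_by_edition, base="t-0"):
    """Re-express market shares so that the host edition (t-0) equals 100."""
    base_share = shares_by_edition[base]
    return {edition: 100 * share / base_share
            for edition, share in shares_by_edition.items()}


# Worked example from the text: Team GB at Paris 2024.
gb_points_2024 = medal_points(gold=14, silver=22, bronze=29)
print(gb_points_2024, round(market_share(gb_points_2024, 2032), 1))

# Team GB market shares (%) as reported in this section, keyed by edition
# relative to hosting in 2012 (t-2 is 2004, t+3 is 2024).
team_gb = {"t-2": 3.1, "t-1": 5.3, "t-0": 7.5, "t+1": 7.6, "t+2": 6.5, "t+3": 5.7}
print({edition: round(score, 1) for edition, score in index_scores(team_gb).items()})
```

An index score above 100 after hosting (as at t+1 here) indicates a market share higher than in the host edition itself.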

Figure 2.11: Performance in the Olympic Games by previous hosts; market share (left) and index scores (right)

Like most other nations, Team GB demonstrated an improvement in its performance in the edition prior to hosting (t-1). Team GB’s market share in 2004 was 3.1%, increasing to 5.3% in 2008.

Host nations typically performed better at t-0 compared with t-1 and Team GB was no exception, with an increase in its market share from 5.3% in Beijing 2008 to 7.5% in London 2012.

While the market share achieved by other nations drops below t-0 at t+1, Team GB is a clear outlier. The two extra medals won by Team GB in Rio 2016 relative to London 2012 contributed to a marginal increase in its market share from 7.5% to 7.6%.

These findings are reinforced when looking at the changes in the number of total medals and gold medals won by nations before hosting (t-2 to t-1), when hosting (t-1 to t-0) and after hosting (t-0 to t+1) – see Figure 2.12 below. Team GB’s performance improved on both measures between t-2 and t-1 (+10 gold, +21 total) and again between t-1 and t-0 (+10 gold, +14 total). Between t-0 and t+1, Team GB won two more medals in total but won two fewer gold medals.

There is evidence of a decay in Team GB’s market share at t+2 (6.5%) and t+3 (5.7%), but Team GB’s performance has uniquely remained above its scores during the pre-hosting period (which ranged from a low of 3.1% to a high of 5.3%).

Figure 2.12: Changes in the number of medals won by nations before, during and after hosting

In summary, the evidence points to a measurable elite sport legacy associated with London 2012, but one that needs to be appreciated in the context of a commitment to fund elite sport from the change in National Lottery rules in 1997, the immediate success achieved by Team GB, the award of London 2012, and the Prime Minister’s direct intervention to seek to win more medals in Rio 2016 than in London 2012. Therefore, hosting the Games can be viewed as a contributory factor to how Team GB performed in 2012 and the level of success achieved in subsequent editions.

This type of analysis can be reproduced for any multi-sport event (e.g. the Paralympic Games). The historical context and policy decisions aimed at supporting elite athletes and improving elite success (including but not limited to event hosting) are likely to be important determinants of the extent to which an elite sport legacy manifests and is sustained over time.

2.6 Robustness of findings

There were several factors which may have threatened the robustness of the analysis:

Outcome measurement:

Over time, there were changes in the way the data was recorded and measured:

Changes to administrative boundaries over time: The analysis of turnover and employment impacts was undertaken at the LSOA level. The LSOA boundaries change with each wave of the UK Census, to ensure that the areas comprise between 400 and 1,200 households. The analysis undertaken (2005 – 2018) spans two iterations of administrative boundaries. This is not anticipated to have impacted the analysis substantially as the boundary changes can be accounted for within the ONS Secure Research Service (SRS).^{[footnote 11]} However, publicly accessible versions of the datasets used are unlikely to have the necessary geographic granularity to accurately map the boundary changes.

That is not to say that this type of analysis cannot be done outside the SRS.^{[footnote 12]} But it will be significantly more difficult to accurately take into account the boundary changes over time and will increase the risk of measurement error.

Changes to Standard Industrial Classification (SIC) codes over time: In 2003, the SIC codes were updated (from their 1992 definitions). These codes were subsequently updated in 2007 and have not been updated since. Whilst a look-up exists^{[footnote 13]}, these codes do not map exactly. Because the ‘tourism’ industry comprises codes from several sections, it is possible that there was some measurement error in years where code changes occurred.

Sample selection: The impact of sample selection is specific to longitudinal panels (such as Understanding Society) and becomes more pronounced the further past the event we try to measure impact. There are two types of sample selection that were likely to threaten the robustness of the results:

Within sample selection occurs if individuals in the treatment location move to the counterfactual location, and vice versa. This becomes increasingly likely the further events are in the past. There are two possible approaches that could be utilised to mitigate within sample selection:

Restrict the sample to only non-movers. For the purposes of this case study, we ran separate regressions including both movers and non-movers. The inclusion/ exclusion of movers did not have a material impact on the results.
Regress a binary variable that is equal to one if an individual has moved on a treatment dummy for each time period. If the treatment dummies are insignificant then we may be somewhat less concerned about residential sorting. This was tested as part of the case study analysis where it was found that those in the treatment area were less likely to move compared to comparison areas.

Differences in mobility between treatment and control areas may suggest that estimates could potentially be biased. The direction of the bias could run in either direction:

Overestimation could occur if, for example, people who were likely to benefit most from the London 2012 Games (e.g. those seeking new job opportunities) were less likely to move into the area. In that case, the observed positive effects might partly reflect the characteristics of the residents who were already there rather than the London 2012 Games themselves.

Underestimation could occur if, for example, people who were likely to experience negative impacts (e.g. those vulnerable to displacement or rising living costs) were less likely to move out. In that case, the observed effects might not fully capture the negative consequences for this group.

Out of sample selection occurs if individuals in the treatment or control group drop out of the sample. Where possible, restricting the sample to a balanced panel would mitigate selection issues, however, the underlying sample sizes may not always enable this option. It may be possible to test whether the treatment periods predict subsequent attrition by regressing a dummy that equals one if an individual has dropped out of the sample on treatment dummies, one for each time period in the past. If these treatment dummies turn out insignificant, we may be somewhat less concerned about attrition.
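The attrition check described above (and the analogous check for movers) can be sketched without a regression package. With a full set of period effects and one treatment dummy per period, each treatment coefficient in a linear probability model reduces to the treated-minus-control gap in drop-out rates for that period, which is what the function below computes. The records are hypothetical, and the unpooled two-sample standard error is an assumption; a regression package would report its own, possibly clustered, standard errors.

```python
from math import sqrt


def dropout_gap(records, period):
    """Treated-minus-control difference in drop-out rates for one period.

    Returns (gap, standard error), using an unpooled two-sample standard error.
    """
    treated = [r["dropped"] for r in records
               if r["period"] == period and r["treated"]]
    control = [r["dropped"] for r in records
               if r["period"] == period and not r["treated"]]
    p_t = sum(treated) / len(treated)
    p_c = sum(control) / len(control)
    se = sqrt(p_t * (1 - p_t) / len(treated) + p_c * (1 - p_c) / len(control))
    return p_t - p_c, se


def make_records(period, treated, n, n_dropped):
    """Build n hypothetical respondents, of whom n_dropped left the sample."""
    return [{"period": period, "treated": treated, "dropped": int(i < n_dropped)}
            for i in range(n)]


# Hypothetical data: similar attrition in period 1, diverging in period 2.
records = (
    make_records(1, True, 10, 2) + make_records(1, False, 10, 2)
    + make_records(2, True, 10, 1) + make_records(2, False, 10, 4)
)

for period in (1, 2):
    gap, se = dropout_gap(records, period)
    print(f"period {period}: gap={gap:+.2f}, se={se:.3f}, z={gap / se:+.2f}")
```

Replacing the `dropped` indicator with a `moved` indicator gives the corresponding check for within sample selection. Gaps that are small relative to their standard errors in every period would make differential attrition (or residential sorting) somewhat less of a concern.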

Threats to the assumptions of difference-in-differences:

For the DiD design to yield causal impact estimates, the underlying assumptions must be satisfied:

Parallel Trends: The analysis provided evidence that turnover and employment in the tourism sector trended in the same way prior to the London 2012 Games, and therefore we could be somewhat confident that they would continue trending the same way in the absence of the London 2012 Games. There are however nuances to this:

Lack of pre-intervention periods: The matching exercise to identify the comparison groups was undertaken on data from 2005 to 2011. Whilst this is prior to the London 2012 Games themselves, the Games were announced in 2005, with London 2012 Games spending beginning from the announcement (see bullet 3 below). The first year for which LSOA codes are included in the ASHE data is 2005. This means that years prior to this cannot be used to create an LSOA panel dataset over time. It is therefore likely that matching was undertaken based on years that were impacted by the London 2012 Games.

Defining the treatment start date: The lack of data before 2005 meant that an additional analysis could not be undertaken to see how impacts varied depending on whether the start of spending led to benefits, or if it was the start of the London 2012 Games themselves that led to benefits.

Local context:

There were several concurrent interventions around East London which would likely have ‘contaminated’ the impact analysis. This means it is hard to specifically attribute impact to the London 2012 Games, as opposed to other interventions which happened around the London 2012 Games (directly before/ after). The London 2012 Meta-Evaluation discusses the below challenges and sets out how the London 2012 Games accelerated growth in East London, but simultaneously the pre-existing growth and re-development in East London was a clear driver in bidding for the London 2012 Games. It is therefore likely that there are two-way causal links that make specific attribution of impact difficult:

Regeneration programmes dating back to the 1980s and 1990s, including in particular the London Docklands Development Corporation (and Greenwich Riverside - including what is now the O2) and associated major investments in infrastructure and other enabling investments, including major public transport projects (e.g. the DLR and Jubilee Line extension) and road projects (e.g. the Limehouse Link Tunnel), designed in particular to support the development of Canary Wharf and, associated with this, to overcome the traditional physical isolation of the area (historically fostered as a means of underpinning the security of the docks).

Planning policies - in particular the Thames Gateway policy framework - intended to (re)focus development on East London in support of objectives to ease the physical constraints on the expansion of the City of London itself and the wider constraints on the housing market created by the Green Belt.

These interventions fed into the development of major housing schemes, will have been a major enabler and driver of retail investments (particularly Westfield Stratford City) and no doubt helped to bring about substantial changes in local labour market flows.

2.7 Learning

2.7.1 What did we learn about the methods for quantifying legacy and impact?

To evaluate the legacy impact of the London 2012 Games, two primary econometric methods were used, i) fixed effects regressions (in conjunction with PSM to estimate the DiD design); and ii) synthetic controls:

Fixed effects regressions: The primary workhorse for causal inference when using panel data. As such, these marked the starting point of the analysis. Their flexibility (i.e. extending the DiD into an event study design) makes them an appealing econometric method to base an analysis upon.

A variety of matching algorithms can be used to identify a control group. We used a nearest neighbour without replacement algorithm, primarily to avoid complications in combining PSM weights and Understanding Society longitudinal weights. A good quality match was achieved, but there may be scope to achieve a tighter match by experimenting with different algorithms. The feasibility of utilising alternative algorithms would likely be dictated by the data and the econometric methods.
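To illustrate what nearest neighbour matching without replacement involves, the sketch below pairs each treated unit with the closest remaining control unit on a propensity score. It is a minimal greedy version under stated assumptions: the unit identifiers and scores are hypothetical, propensity scores are taken as already estimated, and no caliper or particular matching order is implied by the case study.

```python
def match_without_replacement(treated, controls):
    """Greedy 1:1 nearest neighbour matching on a propensity score.

    treated, controls: dicts mapping unit id -> propensity score.
    Each control can be used at most once. Returns {treated id: control id}.
    """
    available = dict(controls)
    pairs = {}
    for t_id, t_score in treated.items():  # processed in insertion order
        if not available:
            break  # no controls left to match against
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        pairs[t_id] = c_id
        del available[c_id]  # without replacement: remove the matched control
    return pairs


# Hypothetical propensity scores, for illustration only.
treated = {"T1": 0.62, "T2": 0.60, "T3": 0.35}
controls = {"C1": 0.61, "C2": 0.58, "C3": 0.30, "C4": 0.90}

print(match_without_replacement(treated, controls))
```

In this toy example T2 is closest to C1, but C1 has already been assigned to T1, so T2 takes the next-closest control. This shows both the appeal of the approach (each comparison unit appears once, so no matching weights are needed) and its cost (later matches can be looser than they would be with replacement).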

Binary vs event study specification: There were two approaches to presenting the results of a fixed effects regression: i) a single binary treatment variable, presenting an average impact over all post-treatment periods; or ii) an event study, presenting year-by-year impacts across all pre- and post-treatment periods. Whilst the single treatment variable presented an easy-to-interpret aggregate measure of impact, it can sometimes mask temporal dynamics. The event study offered the benefit of demonstrating how the impact varied over time and, crucially, for how long impacts were estimated to persist. In most cases, there is little marginal cost in estimating both specifications, allowing researchers to fully understand the impacts of a major event.
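The difference between the two specifications can be illustrated with group means. On a balanced panel with a single treatment date and no covariates (assumptions made here for simplicity), the binary fixed effects estimate equals the change in the average treated-minus-control gap from the pre-period to the post-period, and each event study coefficient equals the gap in that year minus the gap in a reference year. All figures below are hypothetical.

```python
def gaps(treated_means, control_means):
    """Treated-minus-control gap in the outcome for each period."""
    return {t: treated_means[t] - control_means[t] for t in treated_means}


def binary_did(treated_means, control_means, first_post):
    """Average post-period gap minus average pre-period gap."""
    g = gaps(treated_means, control_means)
    pre = [v for t, v in g.items() if t < first_post]
    post = [v for t, v in g.items() if t >= first_post]
    return sum(post) / len(post) - sum(pre) / len(pre)


def event_study(treated_means, control_means, reference):
    """Gap in each period relative to the gap in the reference period."""
    g = gaps(treated_means, control_means)
    return {t: g[t] - g[reference] for t in g}


# Hypothetical yearly group means of an outcome, for illustration only.
treated = {2010: 1.0, 2011: 1.1, 2012: 1.6, 2013: 1.4, 2014: 1.3, 2015: 1.3}
control = {2010: 0.8, 2011: 0.9, 2012: 1.0, 2013: 1.1, 2014: 1.2, 2015: 1.3}

print(round(binary_did(treated, control, first_post=2012), 2))
print({t: round(v, 2) for t, v in event_study(treated, control, reference=2011).items()})
```

In this toy example the single binary estimate is small, while the event study shows a large effect in the event year that fades and then reverses, which is the kind of temporal dynamic a single post-treatment variable can mask. The pre-period event study coefficients (here zero in 2010) also provide the visual check on parallel trends.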

Synthetic controls: Whilst it was possible to explore the use of alternative methodologies (e.g. synthetic controls), the existence of geographically granular data meant that this was not strictly essential. The lack of spatial confinement of the impacts (impacts were estimated to be larger and more likely to be statistically significant as the treatment radius increased) meant that this type of analysis was not well suited to a synthetic control approach. There may be instances where synthetic controls are a preferred approach, i.e. if impacts are confined to a single area/unit. For example, an evaluation of the Isle of Man TT races would likely see impacts confined to the Isle of Man, due to hard geographic borders that will minimise any spatial spillover. The nature of the data may also promote the use of the synthetic control, e.g. if granular geographic data is not available, an evaluator may be required to work at higher aggregated levels of geography (such as regional data). However, where granular output area data exists, there is no clear advantage in using a synthetic control instead of fixed effects.

Defining treatment and control areas: The econometric analysis was sensitive to changes in the spatial definition of the treatment areas – and less sensitive to changes in the control areas:

Changes to treatment areas: As the radius of the treatment area was reduced, estimated impacts became statistically insignificant. This suggests that the impacts were not concentrated around East London, but rather spilled over into surrounding areas. There are several considerations an analyst must make with respect to varying the treatment areas:

Before reducing the spatial definition of the treatment area, consideration must be given to the sample size. As the treatment radius falls, so too does the number of treated units. At low levels of geography (i.e. LSOA) this does not pose a threat to the statistical power of the econometric analysis. However, at higher levels of geography, sample sizes may become too small to undertake econometric analysis.
Changes in outcome levels may also be (partially) due to the data. For example, the BSD provides two levels of aggregation: local units (i.e. individual shops or factories) and enterprises (a collection of local units). Data on the number of employees is available at the local unit level, but turnover data is only available at the enterprise level. This may have implications as the treatment radius is narrowed, since a narrower radius around the Olympic Park increases the risk that head offices are excluded.
The interpretation of the analysis also changes as the radius is varied. A larger radius looks more broadly at the impact of the London 2012 Games on London, whereas a narrower radius focuses on areas more local to East London. We also make comparisons between host Boroughs and non-host Boroughs – again changing the interpretation of the analysis. Researchers should define their treatment area such that it best addresses their overarching research questions.

Given the sensitivity of the results to the definition of the treatment area, it is important to have an analysis protocol that clearly sets out and justifies the analytical approach. The protocol should be theoretically justified, and account for the availability and practicality of obtaining the necessary data. The model specification should not be justified by the results.

Changes to control areas: The choice of control areas was influenced by setting the exclusion zone, which changed the pool of LSOAs against which treatment areas could be matched. The model specification appeared to be robust to the radius of the exclusion zone, and robust to using only London Boroughs as a (matched) comparison group. This is shown in Figure 2.13 below, which demonstrates that a good quality match was achieved for all radii (indicating covariate balance between treatment and comparison areas prior to the London 2012 Games). Furthermore, across the different radii, the treatment and matched comparison areas exhibit parallel trends, increasing the robustness of the model.

One important consideration when specifying the control areas is the interpretation of the results. For example, using the exclusion zone compares areas of London to the rest of the country, whereas using non-host Boroughs compares parts of London to other parts of London. Therefore, in addition to how well the control units resemble the treatment units in the counterfactual, researchers should consider the extent to which their specified counterfactual addresses their overarching research questions.

Figure 2.13: Standardised Mean Differences between treatment and matched comparison group for different radii
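For readers unfamiliar with the balance measure plotted in Figure 2.13, a standardised mean difference can be computed as sketched below. This uses one common definition (the difference in means divided by the square root of the average of the two group variances); the report does not state which variant was used, and the covariate values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def standardised_mean_difference(treated, control):
    """Difference in means scaled by the pooled standard deviation.

    Uses sample variances and a simple average of the two group variances.
    """
    pooled_sd = sqrt((variance(treated) + variance(control)) / 2)
    return (mean(treated) - mean(control)) / pooled_sd


# Hypothetical pre-Games covariate values for matched areas.
treated_values = [10, 12, 14, 16]
control_values = [9, 11, 13, 15]

smd = standardised_mean_difference(treated_values, control_values)
print(round(smd, 3))
```

Values close to zero indicate that the covariate is balanced between the treatment and matched comparison groups. A commonly cited rule of thumb (not stated in this report) treats absolute values below 0.1 as well balanced.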

Data availability: The analysis was undertaken within the ONS SRS, which allowed for the use of geographically granular datasets that are not publicly available. Geographically granular data meant that boundary changes over time could be readily overcome whilst simultaneously enabling analysis at lower levels of geography. This meant that assembling an LSOA level panel dataset resulted in a balanced panel. Undertaking this analysis outside the SRS, at this level of geographic granularity, would be incredibly challenging due to missing observations and measurement error caused by changing boundaries over time and suppression of small samples.

However, there are conditions on using the ONS SRS to which users must adhere. These include a project application and approval process^{[footnote 14]}, restrictions on which data sets can be combined, and restrictions on data that can be linked within the SRS.^{[footnote 15]}

Broadly, the necessary secondary data is available to undertake an evaluation of the London 2012 Games across a range of outcomes. The key caveat, however, is that the ONS SRS will likely be required to ensure sufficient geographic granularity.

There are, however, some key considerations for evaluations of future major events:

There would be benefit in deploying theory-based approaches, shortly after the conclusion of the London 2012 Games. The exact timings are dependent on the research questions an evaluation would seek to answer. This would allow the variety of interventions to be unpicked, in a way that quantitative analysis cannot do. A mixed methods evaluation in this instance would provide the best means of evidencing short- and long-term impacts.

Longitudinal surveys could be used to capture the before/ after impact on residents in an area. This would allow the specific (quantitative) research questions of an evaluation of the major event to be addressed. However, it should be noted that a longitudinal survey covering several post-event years would be expensive. Consideration would also need to be given to ensuring data collection in (potential) counterfactual areas. This may also enable additional sources of administrative data to be collected, e.g. the geographic distribution of ticket sales.

Following the principles of the HM Treasury Green and Magenta Book, evaluation should be implemented and considered during the design phase of an intervention (or in this instance the planning of a major event). For example, the creation of a logic model before the delivery of the event ensures that evaluators are able to effectively judge the success of an event against its original aims.

Whilst this case study, and more broadly this research project, focused on evidencing impacts, there should also be consideration to the Economic Case. In this instance, monetising the benefits would be relatively trivial, however given the time since the event obtaining original cost data could present more of a challenge.

2.7.2 What do the findings tell us about legacy and impact?

The analysis undertaken suggested that the London 2012 Games likely led to improvements in the tourism industry (in terms of employee and turnover growth), and there is evidence to suggest that the improvements in turnover persisted beyond the London 2012 Games. The outcomes of the London 2012 Games were tested utilising econometric methods that align with Level 3 on both the Maryland Scientific Methods Scale and the NESTA Standards of Evidence.

A similar econometric approach was adopted to explore the existence of wellbeing impacts. The analysis did not find evidence of improvements in residents’ wellbeing in either the short or the long term.

Whilst beyond the scope of this case study, the impacts identified in the above analysis could be easily monetised to form the beginning of a cost benefit analysis, to fully understand the extent to which the benefits justified the costs.

Question 4

3. Cricket World Cups

Accepted Answer

This case study seeks to test a set of methodologies that could be used to evaluate a major event held nationally across multiple areas. This contributes to the overall framework for how the legacy impacts of major events should be measured. This case study does not seek to evaluate the overall impact of the Cricket World Cups. This case study experiments with several techniques on a small number of metrics (e.g. participation, attendance and revenue). The analysis presented in this case study does not capture all possible outcomes and uses a range of methodologies, some of which, given the nature of experimentation, are not the optimal choice and were selected to test the effectiveness of the method. As such, this case study should not be used to judge the success of the Cricket World Cups.

3.1 Background and Categorisation

3.1.1 Type of Major Event

In the context of the major event typology outlined in the Toolkit, both the 2017 ICC Women’s Cricket World Cup (Women’s CWC) and the 2019 ICC^{[footnote 16]} Men’s Cricket World Cup (Men’s CWC) can be categorised as “national events” of international significance that are characterised by multi-city delivery utilising primarily existing infrastructure. Held every four years, these ICC events are widely regarded as the pinnacle competitions in cricket, particularly within the One Day International (ODI) format.

3.1.2 Focus of the event

The 2017 and 2019 CWCs were exclusively single-sport events contested by women and men respectively. As shown in Table 3.1, 2017 was the third edition of the Women’s CWC to be staged in the UK and 2019 was the fifth occasion that the UK hosted the Men’s CWC. Teams from eight nations qualified to participate in the women’s event (hosts England plus Australia, India, New Zealand, Pakistan, South Africa, Sri Lanka, West Indies). The men’s event featured ten teams (the same eight countries as for the Women’s CWC plus Afghanistan and Bangladesh).

Table 3.1: Previous Cricket World Cups staged in the UK

Year	Women’s / Men’s	Host Locations
1973	Women’s	England
1975	Men’s	England
1979	Men’s	England
1983	Men’s	England and Wales
1993	Women’s	England
1999	Men’s	England, Scotland and Wales*
2017	Women’s	England
2019	Men’s	England and Wales

* Some CWC matches in 1999 were also played in Ireland and the Netherlands.

3.1.3 Scale and Geography

The 31 matches of the Women’s CWC were geographically concentrated at venues located in the East Midlands (Derby and Leicester) and the South West of England (Bristol and Taunton) with the final staged at Lords in London. The Men’s CWC had a wider geographical spread with the 48 matches staged across ten different venues in England (covering eight regions) and one venue in Wales – see Table 3.2.

Table 3.2: Location of venues and number of matches used for the 2017 and 2019 Cricket World Cups

Venue	Women’s CWC 2017	Men’s CWC 2019	Total	UK Region
Lords	1	5	6	London
Derby	8	0	8	East Midlands
Bristol	8	3	11	South West
Leicester	7	0	7	East Midlands
Taunton	7	3	10	South West
Edgbaston	0	5	5	West Midlands
Sophia Gardens	0	4	4	Wales
Chester-le-Street	0	3	3	North East
Headingley	0	4	4	Yorkshire & The Humber
The Oval	0	5	5	London
Old Trafford	0	6	6	North West
Trent Bridge	0	5	5	East Midlands
Rose Bowl	0	5	5	South East
Totals	31	48	79	England & Wales

Three venues – including one in London (Lords) and two in the South West of England (Bristol and Taunton) – hosted both Women’s and Men’s CWC matches. The number of unique venues utilised across the two events was 13. The overall scale of the Men’s CWC was considerably larger than the women’s equivalent.

3.1.4 Significance

The Women’s and Men’s CWCs were nationally and internationally significant events in terms of their reach and level of engagement. The overall attendance across all matches during the Men’s CWC in 2019 exceeded 750,000, and the global cumulative live audience reached 1.6 billion, with a unique broadcast audience of 706 million viewers. Every match of the Women’s CWC in 2017 was broadcast live either on terrestrial TV or via digital platforms, and more than 180 million people around the world were reported to have watched the event. The domestic significance of the two events was amplified by England’s men and women emerging as the eventual winners of their respective tournaments. Moreover, the England men’s team won the CWC for the first-ever time in 2019, having finished as the runner-up on three previous occasions.

3.1.5 Competitive process

The International Cricket Council’s (ICC) executive committee decides on CWC host selection after evaluating the bids submitted by nations that express an interest to stage the event. However, it is worth noting that, because of the nature of global cricket, few countries are capable of hosting international competitions on the scale of the World Cups. In practice, therefore, the process is closer to a rotation among hosting nations than a pure competition. Final decisions are made several years prior to event hosting. The hosting rights for the 2019 Men’s CWC were awarded to England (and Wales) in 2006, and for the 2017 Women’s CWC the decision to award the event to England was made in 2013.

3.1.6 Duration

The Women’s CWC spanned 30 days, taking place between 24th June and 23rd July 2017. The overall duration of the Men’s CWC was 46 days, which ran from 30th May to 14th July 2019.

3.1.7 Construction of infrastructure

No new venue construction or widespread infrastructural development was required for the two events. Matches were staged at existing grounds used routinely by first-class cricket counties, which were capable of hosting domestic and international cricket fixtures. Some host venues would likely have made certain adaptations to their existing facilities as part of their broader ambitions to host international cricket and installed temporary seating to meet spectator demand for CWC matches. However, unlike the London 2012 Games, there was no major investment in facilities or supporting infrastructure that was designed to have ‘legacy’ uses. Instead, as part of the allocation of host venues, a demonstration of the venue’s ability to meet international cricket venue standards and a clear plan to achieve growth in cricket engagement and wider community benefit was required.

3.1.8 Catalysts

The host venues have continued to stage domestic and international cricket fixtures, which cannot be directly attributed to the staging of the 2017 and 2019 CWCs. The Hundred was first proposed by the England and Wales Cricket Board (ECB) as an idea in 2016 and was formally ratified by key stakeholders in 2017. With the launch of The Hundred, many of the CWC venues have become home grounds for the eight city-based franchises involved in this competition since its inaugural season in 2021.

3.2 Event Objectives and legacy strategy

A commonly stated goal associated with hosting a major sporting event of the scale and significance of the CWC is to grow the popularity of the sport featured (in this case cricket) and to inspire increased participation at grassroots level. A retrospective search revealed that the 2017 and 2019 CWCs were no exception in this regard. This point is best illustrated by the quote below from the former Chief Executive of the ECB, Tom Harrison, in relation to the 2019 Men’s Cricket World Cup:

“Having the ICC Cricket World Cup played here in England and Wales gives us a once-in-a-generation opportunity. We must turn the excitement of a World Cup on home soil into a guaranteed route to draw more players and volunteers to recreational cricket. Cricket can inspire its next generation of fans and players by taking the tournament into clubs, playgrounds and classrooms across England and Wales and we will be working hard together to make the most of this moment.”^{[footnote 17]}

The ECB together with the ICC aimed to inspire and engage one million school-aged children during the year of the 2019 CWC. This ambition for growth was supported by various initiatives, notably:

the Cricket World Cup Schools Programme developed in conjunction with Chance to Shine to bring cricket to the playground and classroom within primary schools;
a World Cup Small Grants Scheme for recreational cricket clubs; and,
extending the reach of ECB’s All Stars Cricket programme among children aged 5-8.

This particular emphasis on growing participation in cricket, together with the availability of cricket participation data from annual large-scale national surveys (e.g. Active Lives), presented an opportunity to examine how cricket participation has evolved over time relative to pre-CWC levels among children (aged 5-15) as well as the adult population (aged 16+).

A more implicit objective for event hosting relates to commercial growth. The revenue associated with hosting international cricket fixtures is financially beneficial to the first class counties and the Marylebone Cricket Club (MCC), which operate the host venues involved. Beyond participation, it is therefore worthwhile to consider the extent to which the two CWCs may have contributed to the financial sustainability of the first-class county cricket clubs occupying the venues used for the 2017 and 2019 CWCs. It is also possible to examine their legacy on wider forms of cricket engagement (e.g. event attendance) and specific revenue streams for clubs such as ticket sales and membership revenue.

The main outcomes of interest for the CWCs align primarily with two Theories of Change (TOC) set out in the Toolkit, namely: (1) Health and Wellbeing; and, (2) Economic and Employment. A bespoke TOC for the CWCs incorporating elements of both these outcome areas is presented in Figure 3.1. The objectives have been taken from the ECB and ICC strategic plan to use the ICC Cricket World Cup 2019 to boost cricket participation and drive growth in the game. It should be noted that this TOC has been devised retrospectively, and its content has been inferred from the explicit and implicit actions of the ECB. To the best of our knowledge, there was no event-specific TOC devised for either event.

Figure 3.1: CWC Theory of Change

3.3 Choosing indicators and data sources

Given the retrospective nature of this evaluation for the 2017 and 2019 CWCs, we were reliant exclusively on the availability of secondary data. Based on our interpretation of the objectives of the CWCs, the most relevant secondary sources to investigate for cricket participation data in England and Wales were Sport England’s Active Lives Survey (ALS), the National Survey for Wales (which incorporates a Sport and Active Lifestyles section), and Sport Wales’ School Sport Survey. For the finance-related outcomes of interest, the annual reports and audited financial statements for the first-class county cricket clubs in England and Wales were the most pertinent and accessible source of data.

Table 3.3 presents a list of indicators that could be used to evaluate the legacy of the Men’s and Women’s CWCs, together with the reasons for their inclusion or exclusion.

Table 3.3: Relevant indicators

Indicator	Included in analysis?
Levels of cricket participation nationally and in geographical areas that hosted CWC matches	Yes – ALS provides extensive data for this indicator for both adults and children and young people.
Attendance at first-class county cricket clubs that hosted CWC matches compared with non-hosts	Yes – clubs’ accounts provide data on ticket sales revenue which can be used as a proxy for attendance.
Membership levels of first-class county cricket clubs that hosted CWC matches compared with non-hosts	Yes – clubs’ accounts provide data on membership revenue which can be used as a proxy for membership levels.
Income of first-class county cricket clubs that hosted CWC matches compared with non-hosts	Yes – relevant data are available from clubs’ accounts.
Profitability of first-class county cricket clubs that hosted CWC matches compared with non-hosts	Yes – relevant data are available from clubs’ accounts.
Volunteering	No – national surveys such as ALS do not provide sport-specific volunteering rates.
Subjective wellbeing and health nationally and in specific geographic areas connected to the CWC	Partially – ALS has some wellbeing data. Health outcomes were not examined because there was no clear line of sight between the inferred theory of change and improvements in health, and because usable health data were absent.
Television audiences and associated advertising revenues for cricket coverage	No – reliable data source not identified.

3.4 Methodology

3.4.1 Overview of case study methodology

The analysis of participation outcomes focussed on England for two reasons. First, Women’s CWC matches were held exclusively in England and only 4/48 (~8%) of Men’s CWC matches were held in Wales. Second, cricket participation data for Wales are less granular and they are not directly comparable to the more extensive data available for England.

The Active Lives Survey provides a rich source of data on sport and physical activity for adults (16+ years) and for children and young people (5-15 years) in England. For adults, ALS data are available for every year between 2015/16 (prior to the Women’s CWC in 2017) and 2022/23. For children and young people, ALS data availability is restricted from 2017/18 (post the Women’s CWC in 2017 but prior to the Men’s CWC in 2019) to 2022/23. These datasets can therefore be examined to make an informed assessment of cricket participation rates over time in the population in general, and more precisely within geographic areas that were exposed to the 2017 and 2019 CWCs. For outcomes related to participation, the key metrics that could be used based on the data captured by ALS are shown in Table 3.4.

Table 3.4: Cricket participation metrics

Data source	Indicator	Data availability
ALS Adult	Participation in cricket in the last 12 months (any/annual)	2015-16 to 2022-23 (data collection is from November to November)
	Participation in 2+ sessions of cricket in the last 28 days (regular)
ALS Children & Young People	Participation in cricket in the last 7 days	2017-18 to 2022-23 (data collection period is between September and July)
	Participation in cricket during school hours once a week or more
	Participation in cricket outside school hours once a week or more

ALS datasets were also used to examine wellbeing measures linked to life satisfaction and happiness, both of which are recognised measures utilised by the Office for National Statistics (ONS). A more restricted timeframe was used for the analysis of wellbeing (2016/17 to 2019/20 for adults and 2017/18 to 2019/20 for children and young people) for three reasons. First, enhancing wellbeing was not an explicit ambition of the CWCs. Second, from a technical standpoint, there was no baseline wellbeing data available for the Women’s CWC. Third, life satisfaction and happiness are dependent on a much broader array of factors rather than being confined to sport-specific outcomes such as participation.

Additional metrics used for the finance-related outcomes included: total revenue; major match revenue; membership revenue; domestic ticket revenue; and net profit. The data for these indicators were collated from the annual accounts of the 18 first-class county cricket clubs in England and Wales. This data covered a period of nine years - 2015 to 2023.

3.4.2 Defining a target area

Data on the geographic origin of spectators/audiences who attended and/or watched the 2017/2019 CWCs are not available in the public domain. This lack of data meant that it was not possible to select a target area for analysis aligned with those who attended and/or watched these events. Based on the nature of secondary data available for the outcomes of interest, the target area selection was informed by the jurisdictions corresponding to the county cricket clubs in England affiliated with the CWC venues. Defining the target area in this way enabled us to investigate the potential impacts of event hosting. The target area composition for outcomes relating to cricket participation is explained below.

Some of the geographic areas where the CWC venues were located were not covered explicitly by the county cricket clubs’ jurisdictions. For example, the venue used by Gloucestershire County Cricket Club for its home matches is in Bristol, which is a Unitary Authority and lies outside Gloucestershire County Council’s boundaries. To a greater or lesser extent, similar subtleties were evident in the case of Lancashire (Old Trafford), Warwickshire (Edgbaston) and Surrey (The Oval). To account for these issues, the pragmatic solution was to devise more bespoke geographical units that better correspond to the different county cricket clubs affiliated with the CWC venues. The final composition of these units was also informed by the availability of data at that level of geography.

The geographical units in the treatment group were not of equal sizes, because some county cricket clubs represented broader jurisdictions than others. Yorkshire County Cricket Club represented the entirety of the Yorkshire and Humber region. Similarly, the areas covered by the venues used by county cricket clubs for Middlesex and Surrey were representative of the London region. By contrast, the geographic areas represented by county cricket clubs like Derbyshire, Nottinghamshire, Durham and Somerset were relatively smaller (i.e. Unitary Authorities or Non-Metropolitan Districts). Because of these variations in size, the bespoke geographical units were then amalgamated to form a single treatment area or ‘event group’. Four event groups were subsequently developed, as outlined below.

Event group 1 comprising locations associated with the Women’s CWC.
Event group 2 comprising locations associated with the Men’s CWC.
Event group 3 comprising locations within event group 1 and/or event group 2.
Event group 4 which reflects the level of event exposure (i.e. one event only, or both events).

Table 3.5 maps the different event groups to the county cricket clubs affiliated with CWC venues. Geographic variables available within ALS were used to construct the relevant event groups and their precise composition is shown in Appendix A.

Table 3.5: Event groups used for the analysis of participation outcomes

County Cricket Club	Event group 1	Event group 2	Event group 3	Event group 4
Middlesex	Yes	Yes	Yes	Yes – Both
Derbyshire	Yes	No	Yes	Yes – One
Gloucestershire	Yes	Yes	Yes	Yes – Both
Leicestershire	Yes	No	Yes	Yes – One
Somerset	Yes	Yes	Yes	Yes – Both
Warwickshire	No	Yes	Yes	Yes – One
Durham	No	Yes	Yes	Yes – One
Yorkshire	No	Yes	Yes	Yes – One
Surrey	No	Yes	Yes	Yes – One
Lancashire	No	Yes	Yes	Yes – One
Nottinghamshire	No	Yes	Yes	Yes – One
Hampshire	No	Yes	Yes	Yes – One

Note: Excludes Glamorgan due to lack of comparable cricket participation data for Wales.

For wellbeing outcomes, we were able to test for changes in life satisfaction and happiness in relation to the Men’s CWC only (event group 2) because there were no baseline wellbeing data available for the Women’s CWC.

For the finance-related outcomes of interest, the data were better aligned with county cricket clubs affiliated with CWC venues as these data were drawn from their annual accounts. The full set of 18 first-class clubs is shown in Table 3.6 along with the event groups to which they were assigned.

Table 3.6: Event groups used for the analysis of finance-related outcomes

County Cricket Club	Event group 1 Women’s CWC	Event group 2 Men’s CWC	Event group 3 Either event
Middlesex	Yes	No*	Yes
Derbyshire	Yes	No	Yes
Gloucestershire	Yes	No*	Yes
Leicestershire	Yes	No	Yes
Somerset	Yes	No*	Yes
Warwickshire	No	Yes	Yes
Glamorgan	No	Yes	Yes
Durham	No	Yes	Yes
Yorkshire	No	Yes	Yes
Surrey	No	Yes	Yes
Lancashire	No	Yes	Yes
Nottinghamshire	No	Yes	Yes
Hampshire	No	Yes	Yes
Northamptonshire	No	No	No
Sussex	No	No	No
Kent	No	No	No
Worcestershire	No	No	No
Essex	No	No	No

* Because venues associated with these clubs hosted both Women’s and Men’s CWC matches, they were excluded from the DiD analysis for the Men’s CWC which occurred after the Women’s CWC.

3.4.3 Defining a counterfactual

For participation outcomes, we created control groups that can be used for comparison with the corresponding event groups:

Control group 1 comprising locations not associated with the Women’s CWC.
Control group 2 comprising locations not associated with the Men’s CWC.
Control group 3/4 comprising locations within control group 1 and control group 2.

Given each event group and its corresponding control group were pooled categories (i.e. there was one data point per event group and one per control group for each year), the analysis was based on examining differences in scores within and between groups over time. The analysis of participation outcomes for adults and children and young people was structured as shown in Table 3.7. The scope of this analysis was influenced by the availability of data for certain years and practical realities about analysing data for every possible permutation.

Table 3.7: Scope of participation analysis

Demographics	CWC 2017	CWC 2019	Either Event	Event Exposure
Comparison	Event group 1 v control group 1	Event group 2 v control group 2	Event group 3 v control group 3	Event group 4 v control group 4
Adults
All	✓	✓	✓	✓
Men only	🗴	✓	🗴	✓
Women only	✓	🗴	🗴	✓
Children & young people
All	🗴	✓	🗴	✓
Boys only	🗴	✓	🗴	✓
Girls only	🗴	✓	🗴	✓

The baseline year used for cricket participation among adults was 2015/16 (which relates to the year immediately prior to the Women’s CWC). The baseline year for children and young people was 2017/18 (which relates to the year immediately before the Men’s CWC). ALS data are not available for children and young people prior to 2017/18. The time-series data for both event and control groups were tested for statistical significance (at the 95% confidence level, i.e. p<0.05) relative to their respective scores in the baseline year.

For the finance-related outcomes, two pragmatic matching criteria were used to develop the control group: (1) they comprised first-class county cricket clubs (which is a shared characteristic with the event group); and (2) the venues used by these clubs were not utilised for the Men’s or Women’s CWCs (which is a point of difference from the event group). Based on these criteria, the following control groups were developed.
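To illustrate what a significance test against the baseline year can look like, the sketch below applies a two-sided two-proportion z-test. This is an assumption for illustration: the specific test used for the ALS comparisons is not stated here, and ALS is a weighted survey, so a real analysis would need to account for the survey design. The rates and sample sizes are hypothetical.

```python
import math

def two_proportion_z_test(p1, n1, p2, n2):
    """Two-sided z-test for a difference between two independent proportions."""
    # Pooled proportion under the null hypothesis of no change
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical figures for illustration only (not taken from ALS)
baseline_rate, baseline_n = 0.029, 50_000
later_rate, later_n = 0.025, 50_000

z, p_value = two_proportion_z_test(baseline_rate, baseline_n, later_rate, later_n)
significant = p_value < 0.05  # 95% confidence level, as used in this case study
print(f"z = {z:.2f}, p = {p_value:.5f}, significant = {significant}")
```

With large samples even small changes in a participation rate can reach significance, which is one reason the findings are read alongside confounding factors rather than on p-values alone.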

Control group 1 (Women’s CWC) – 13 first-class county cricket clubs (Warwickshire, Glamorgan, Durham, Yorkshire, Surrey, Lancashire, Nottinghamshire, Hampshire, Northamptonshire, Sussex, Kent, Worcestershire, and Essex).
Control group 2 (Men’s CWC) – 5 first-class county cricket clubs (Northamptonshire, Sussex, Kent, Worcestershire, and Essex).
Control group 3 (either event) – 5 first-class county cricket clubs (Northamptonshire, Sussex, Kent, Worcestershire, and Essex).
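The group definitions above can be derived mechanically from each club's hosting record. The sketch below is illustrative only: the hosting flags are transcribed from Tables 3.5 and 3.6, and the variable names are our own. It also makes explicit that the three clubs that hosted both events sit in neither the event group nor the control group for the Men's CWC analysis.

```python
# (hosted Women's CWC 2017, hosted Men's CWC 2019) for the 18 first-class clubs
hosting = {
    "Middlesex": (True, True), "Derbyshire": (True, False),
    "Gloucestershire": (True, True), "Leicestershire": (True, False),
    "Somerset": (True, True), "Warwickshire": (False, True),
    "Glamorgan": (False, True), "Durham": (False, True),
    "Yorkshire": (False, True), "Surrey": (False, True),
    "Lancashire": (False, True), "Nottinghamshire": (False, True),
    "Hampshire": (False, True), "Northamptonshire": (False, False),
    "Sussex": (False, False), "Kent": (False, False),
    "Worcestershire": (False, False), "Essex": (False, False),
}

# Women's CWC: hosts versus all other clubs
event_1 = {c for c, (w, m) in hosting.items() if w}
control_1 = set(hosting) - event_1

# Men's CWC: clubs that had already hosted in 2017 are excluded altogether
event_2 = {c for c, (w, m) in hosting.items() if m and not w}
control_2 = {c for c, (w, m) in hosting.items() if not w and not m}

# Either event: any host versus clubs that hosted neither event
event_3 = {c for c, (w, m) in hosting.items() if w or m}
control_3 = set(hosting) - event_3

print(len(control_1), len(control_2), len(control_3))
```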

For the finance-related outcomes, the event and control groups were used within a difference-in-difference (DiD) framework to draw comparisons between the two groups of clubs (treatment and control), before and after the Women’s and Men’s CWCs. For the Women’s CWC analysis, the pre-treatment period was 2015-2016 and the post-treatment period was 2017-2018 (the final post-treatment period was the year prior to the Men’s CWC). For the Men’s CWC analysis, the pre-treatment period was 2015-2018 and the post-treatment period was 2019-2023. For the combined analysis (either event), the pre-treatment period was 2015-2016 and the post-treatment period was 2017-2023. The DiD design can be estimated using a fixed effects regression^{[footnote 18]}. By interacting the treatment variable and a binary year variable, year-specific effects can be estimated, which is commonly referred to as an ‘event study’. The benefit of an event study is that it shows how the magnitude and significance of the impacts change over time, and accounts for temporal variability in the impacts. The event study plots presented in the findings section below demonstrate the extent to which the clubs in the event and control groups differed prior to the CWCs. The reference year in the event study plots is the final pre-treatment year. Coefficients show differences between event and control groups, relative to the final pre-treatment year.

3.4.4 Robustness

The parallel trends assumption was assessed through visual inspection of trends for treatment and control groups, and tested statistically using the significance of the fixed-effects regression coefficients for the pre-intervention periods. In some instances, the parallel trends assumption was violated, which limits the robustness of the analysis.
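The event study logic can be sketched without a regression package. The example below is not the estimation code used for this case study. It assumes a balanced panel with a single treatment date, where the treatment-by-year coefficients from a regression with club and year fixed effects coincide with simple differences of group means. The club names and revenue figures are hypothetical, and the sketch produces point estimates only, with no standard errors.

```python
from statistics import mean

def event_study(panel, treated_units, reference_year):
    """Year-specific DiD coefficients relative to the reference year.

    panel maps (unit, year) -> outcome and is assumed to be balanced.
    """
    years = sorted({year for _, year in panel})
    units = {unit for unit, _ in panel}
    control_units = units - set(treated_units)

    def gap(year):
        # Mean outcome gap between treated and control units in one year
        treated_mean = mean(panel[(u, year)] for u in treated_units)
        control_mean = mean(panel[(u, year)] for u in control_units)
        return treated_mean - control_mean

    reference_gap = gap(reference_year)
    return {y: gap(y) - reference_gap for y in years if y != reference_year}

# Hypothetical total revenue (GBP millions) for four clubs, event in 2017
revenue = {
    "Host A": [10.0, 11.0, 14.0, 13.0],
    "Host B": [20.0, 21.0, 24.0, 25.0],
    "Non-host C": [8.0, 9.0, 10.0, 11.0],
    "Non-host D": [12.0, 13.0, 14.0, 15.0],
}
panel = {
    (club, year): value
    for club, values in revenue.items()
    for year, value in zip(range(2015, 2019), values)
}

# Reference year is the final pre-treatment year (2016)
coefficients = event_study(panel, ["Host A", "Host B"], reference_year=2016)
for year, coef in coefficients.items():
    print(year, coef)
```

In this made-up panel the pre-treatment coefficient (2015) is zero, which is what parallel trends would look like, while the 2017 and 2018 coefficients show the additional change among hosts relative to non-hosts.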

3.5 Findings

Table 3.8 presents a summary of findings emerging from the analysis conducted to examine the legacy impacts of the women’s and men’s CWCs.

Table 3.8: Summary of findings

Outcomes	SMS level achieved	Significant results?	Outcome achieved and persistence of outcomes
Increased cricket participation	SMS Level 1 – before and after comparison within treatment and control groups	Yes – in some cases (p<0.05)	Achieved for some population segments but not sustained or directly attributable
Enhanced subjective wellbeing	SMS Level 2 – before and after comparison within treatment and control groups	No (p>0.05)	No impact found; however, the gap between the event and data collection could miss some impact
Improved financial performance	SMS Level 3 – using DiD (fixed effects)	Yes – in some cases and at varying confidence levels (p<0.10)	Some evidence there was impact up to five years after the event

3.5.1 Participation

Given the breadth of the analysis conducted for this outcome, only selected participation findings are presented in this section. Additional findings are available in the appendices. First, we examine national level cricket participation rates in England among adults as well as for children and young people, regardless of where they reside in the country. The influence of confounding factors notably COVID-19 as well as the occurrence of major domestic and other international cricket events is considered in the interpretation of the data.

3.5.1. Participation (National picture – adults)

In 2015/16, the year prior to the 2017 Women’s World Cup, 2.9% of adults in England participated in cricket at least once during the year. For men, this statistic was 5.1% compared with 0.8% for women. Figure 3.2 shows the trend in cricket participation between 2015/16 and 2022/23 for all adults (the blue line), men (the orange line) and women (the green line). It also presents data on regular cricket participation (2 or more sessions of at least 10 minutes in the last 28 days).

Figure 3.2 Cricket participation rates among adults in England (left-hand side) and index scores (right-hand side) (2015/16=100), for the last 12 months (a) and last 4 weeks (b).

(a)

(b)

Key findings:

Any participation

Prior to the year 2019/20 (which was affected by COVID-19 from March 2020) there were relatively minor fluctuations in participation among all adults (ranging from a low of 2.65% to a high of 2.90%) as well as by gender (men: 4.44% - 5.11%; women: 0.80% - 1.01%).
Nationally, there were no material changes in cricket participation around the time of, or immediately following, the Women’s and Men’s CWC in 2017 and 2019 respectively. While the highest rates of participation among women in the time series were observed between 2016/17 and 2018/19 (0.94% - 1.01%), a direct cause-and-effect relationship with the major ICC cricket events held in England during this period cannot be substantiated, because fluctuations of this size could be explained by sampling error in cross-sectional samples.
Following the implementation of COVID-19 restrictions, cricket participation rates declined predictably during 2019/20 and 2020/21, which was followed by a period of recovery in 2021/22 and 2022/23 (coinciding with the introduction of The Hundred), but these slightly higher rates still remain below the rates seen earlier in the time series.

Regular participation

A broadly similar pattern emerges when examining regular cricket participation data. However, the 2021/22 and 2022/23 regular participation rates overall and for men were at a more comparable level to the pre COVID-19 years, and for women they were higher compared with any other point in the time series.

Our overall assessment of the national-level cricket participation data presented above is that it does not provide any conclusive evidence of meaningful changes in adult participation that can be attributed exclusively to the hosting of the Women’s and Men’s CWCs in England.

3.5.1. Participation (National picture – children and young people)

As per the data presented in Figure 3.3, there was a visible spike in children’s participation in cricket nationally in 2018/19 relative to the previous year (from 6.28% to 8.12%). In the post-event year, participation declined compared with the baseline (4.96% v 6.28%); it then recovered in 2020-21 and remained relatively stable thereafter. The scores for 2020/21 onwards are all above the baseline score. This broad pattern is repeated for boys and girls.

This data is indicative of a genuine increase in cricket participation among children and young people in the year of the Men’s CWC and it also confirms that participation rates in the later years have remained higher than pre-event levels. It is plausible that the Men’s CWC, among other explanations, may have been a contributory factor to these increases, particularly given the emphasis on engaging one million school-aged children during the year of the Men’s CWC through initiatives such as the Cricket World Cup Schools Programme. The launch of The Hundred, or the performances of the England men’s and women’s teams in other major international cricket events around that time, may equally have contributed to the increases observed post 2019-20 (see Appendix B).

Figure 3.3: Cricket participation rates among children in England (left-hand side) and index scores (right-hand side) (2017/18 =100)
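The index scores in figures of this kind rebase each year's participation rate to the baseline year (baseline = 100). A minimal sketch of that calculation follows, using the three national rates for children quoted in the text. It assumes that the 'post-event year' refers to 2019/20, and the function name is our own.

```python
def index_scores(rates, baseline_year):
    """Rebase participation rates so that the baseline year equals 100."""
    base = rates[baseline_year]
    return {year: round(rate / base * 100, 1) for year, rate in rates.items()}

# National participation rates (%) for children quoted in the text;
# the 2019/20 label for the post-event year is our assumption
rates = {"2017/18": 6.28, "2018/19": 8.12, "2019/20": 4.96}

scores = index_scores(rates, baseline_year="2017/18")
print(scores)  # {'2017/18': 100.0, '2018/19': 129.3, '2019/20': 79.0}
```

Read this way, the 2018/19 spike is roughly 29% above the baseline and the post-event year roughly 21% below it.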

We now examine changes in participation rates within locations that were affiliated with the Women’s and/or Men’s CWC matches and draw comparisons with other ‘non-host’ geographic jurisdictions in England. The purpose of this analysis is to test for differences in participation rates within and between the two groups of locations (i.e. event group and control group) before and after these events were staged.

3.5.2 Composite analysis – Event group 3/4 v control group 3/4

Focus on adults

The event group includes locations that were affiliated with one or both events and the control group consists of locations that did not host either event. The baseline year for this analysis is 2015/16, which was the first full year of ALS data prior to the Women’s CWC in 2017 (the 2016/17 ALS covers a combination of pre, during and post event data). The first post-event year for comparison is 2019/20, which provides a full year of ALS data immediately following the Men’s CWC in the summer of 2019 (the 2018/19 ALS encompasses pre, during and post event data). The relevant data are shown in Figure 3.4.

Figure 3.4: Cricket participation rates among adults for event and control groups (left-hand side) and index scores (right-hand side) (2015/16 =100) – Women’s CWC 2017 and/or Men’s CWC 2019

Key findings:

For participation in the last year, the scores for the event group remained above the control group throughout the time series. Excluding 2019/20, the regular participation scores (last 28 days) were also generally higher for the event group.
For both participation thresholds (any and regular), there was a statistically significant decline between the baseline year (2015/16) and the post-event year (2019/20) for both the event and control groups. This finding can be explained, at least in part, by 2019-20 being a COVID-19 affected year.
For participation in the last year, the scores for both event and control groups from 2020/21 onwards have been significantly below their respective baselines.
The pattern of regular participation is mixed. For this threshold, compared with their respective baselines, the score for the control group was significantly lower in 2021/22, whereas the score for the event group was significantly lower in 2022/23.

As an extension of this analysis, it is possible to examine changes in cricket participation based on the level of exposure for locations in England (i.e. no event exposure, exposed to one event only, or exposed to both events) – see Figure 3.5.

Figure 3.5: Cricket participation rates among adults (left-hand side) and index scores (right-hand side) (2015/16 =100) – by level of event exposure

Key findings:

For participation in the last year, the scores for the group of locations that were exposed to both events were consistently higher compared with those that were exposed to one event only, and the latter in turn had higher scores than areas with no event exposure.
The 2018/19 scores for any participation associated with ‘both events’ and ‘one event’ did not change significantly relative to their baseline scores, but the score for ‘no event’ declined significantly over this period. For the regular participation threshold, only the ‘both events’ group had a comparable score between 2015/16 and 2018/19, whereas there was a significant decline in regular participation for the two other event exposure groups.
For both participation thresholds (any and regular), there was a statistically significant decline between the baseline year (2015/16) and the post-event year (2019/20) for all event exposure groups.
For participation in the last year, the scores in 2021/22 and 2022/23 for all event exposure groups remained significantly below their respective baseline scores.
The pattern of regular participation is mixed. For this threshold, compared with their respective baselines, only the score for locations that were exposed to both events was significantly higher in 2021/22, whereas the other event exposure groups had significantly lower scores. In 2022/23, only the score for the ‘one event’ group was significantly below its baseline level. The changes in the scores for the two other groups were not significant.

Findings from separate analyses conducted for women and men are presented in Appendix C.

Focus on children and young people

Despite the absence of a true pre-event baseline for the Women’s CWC, it is possible to look at changes in cricket participation rates among children and young people based on jurisdictions with differing levels of event exposure over the timeframe for which relevant ALS data are available – see Figure 3.6.

Figure 3.6: Cricket participation rates among children for event and control groups (left-hand side) and index scores (right-hand side) (2017/18 =100) – by level of event exposure

There appears to be no evidence of a dose-response relationship between the rate of increase in participation between 2017/18 and 2018/19 and the level of event exposure, given the similarity in the index scores for all groups in the latter year. However, a dose-response relationship could be inferred when considering the index scores for 2020/21. Given that the inaugural edition of The Hundred took place during July-August 2021, it is unlikely to have had any material influence on these findings. Any influence of The Hundred is more likely to be evident in the 2021/22 and 2022/23 data.

3.5.3 Women’s Cricket World Cup – Event group 1 v control group 1

Figure 3.7 presents cricket participation rates (in the last 12 months and in the last four weeks) for all adults, disaggregated by the event and control groups. The baseline year for the Women’s CWC was 2015/16.

Figure 3.7: Cricket participation rates among adults for event and control groups (left-hand side) and index scores (right-hand side) (2015/16 =100), for the last 12 months (a) and the last four weeks (b) – Women’s CWC 2017

(a)

(b)

Key findings:

Any participation

Among both the event and control groups, there were no statistically significant changes in scores between the baseline year (2015/16) and event year (2016/17).
The score for the control group declined significantly in the immediate post-event year (2017/18) compared with its baseline by around 11% in relative terms (from 2.82% to 2.50%). By contrast, the score for the event group did not change significantly during the same period (3.16% in 2015/16 v 3.17% in 2017/18).
A possible explanation for these findings could be that the 2016/17 ALS data may not capture any potential post-event effect fully, because of the incongruence between the ALS data collection period (November 2016 to November 2017) and the timing of the event (June-July 2017), i.e. the 2016/17 ALS only incorporates four months of post-event data.
The score for the event group increased by 9% in relative terms in 2018/19 compared with the baseline year (from 3.16% to 3.46%) whereas the 2018/19 score for the control group (2.52%) remained below its baseline level (2.82%).
The scores for 2019/20 and 2020/21 are affected by COVID-19 restrictions and the scores for both event and control groups declined in these years – but the index scores for the event group remained consistently higher relative to the control group. This finding indicates that the proportionate decline in participation was less pronounced among the event group.
In the two subsequent years – 2021/22 and 2022/23 – the scores for both event and control groups remained significantly below their respective baselines, but the rate of recovery was greater among the event group as evidenced by the index scores.
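The index scores and relative changes quoted above follow from simple rebasing arithmetic. The sketch below is illustrative only: it uses the four rates quoted in the text (years not quoted are omitted), and the function and variable names are our own rather than anything taken from the ALS or the original analysis.

```python
# Participation rates (%) for "any participation", as quoted in the key findings.
# Only the years quoted in the text are included; nothing else is filled in.
rates = {
    "event": {"2015/16": 3.16, "2017/18": 3.17, "2018/19": 3.46},
    "control": {"2015/16": 2.82, "2017/18": 2.50, "2018/19": 2.52},
}

BASE_YEAR = "2015/16"


def index_scores(series, base_year=BASE_YEAR):
    """Rebase a series of rates so that the base year equals 100."""
    base = series[base_year]
    return {year: 100.0 * rate / base for year, rate in series.items()}


def relative_change(series, year, base_year=BASE_YEAR):
    """Percentage change in the rate relative to the base year."""
    base = series[base_year]
    return 100.0 * (series[year] - base) / base


indices = {group: index_scores(series) for group, series in rates.items()}

for group, scores in indices.items():
    for year, score in scores.items():
        change = relative_change(rates[group], year)
        print(f"{group:8s} {year}: index {score:6.1f} ({change:+.1f}% v baseline)")
```

Rebasing in this way makes groups with different starting levels directly comparable: an index below 100 means the rate is below its own baseline, irrespective of whether the group's absolute rate is higher or lower than the other group's.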

Regular participation

For the regular participation threshold, there were significant declines in the control group scores for all years in the time series compared with the baseline year. By contrast, the scores for the event group remained relatively stable (no significant changes) between 2015/16 and 2018/19.
In the post COVID-19 period, participation increased significantly for the event group in 2021/22 relative to 2015/16 (from 0.84% to 1.01%). The difference in the event group scores between 2015/16 and 2022/23 was not statistically significant (0.84% v 0.94%).
The scores for both event and control groups in the baseline year (2015/16) were almost identical (0.84% v 0.82%). For every other year of the time series (except in 2019/20 when the scores were similar), the scores for the event group were consistently higher than the control group.
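Because the ALS samples are cross-sectional, year-on-year comparisons are comparisons between independent samples. This report does not state which test underlies the significance statements above, so the sketch below is an assumption rather than the method used: it applies a standard two-proportion z-test to the event group's regular participation rates (0.84% and 1.01%). The sample sizes are hypothetical, and the test ignores survey weights and design effects, which a full analysis of ALS data would need to account for.

```python
import math


def two_proportion_z_test(p1, n1, p2, n2):
    """Two-sided z-test for a difference between two independent proportions.

    p1 and p2 are sample proportions (e.g. 0.0084 for 0.84%); n1 and n2 are
    the sample sizes. Returns (z, p_value). Survey weights are ignored.
    """
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p2 - p1) / standard_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Event group, regular participation, as quoted in the text.
baseline_rate = 0.0084  # 2015/16
later_rate = 0.0101     # 2021/22

# HYPOTHETICAL sample sizes: the report does not give the group sample sizes.
results = {}
for n in (5_000, 20_000, 80_000):
    results[n] = two_proportion_z_test(baseline_rate, n, later_rate, n)
    z, p = results[n]
    print(f"n = {n:6d} per year: z = {z:.2f}, p = {p:.3f}")
```

The loop illustrates why sample size matters for rare activities: the same difference in rates can be non-significant in a small sample and significant in a large one.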

The analysis conducted for the adult population in England was reproduced for women only – see Appendix E. For children and young people, there was no pre-event baseline available for the Women’s CWC as the first year of the ALS data for children and young people is for the academic year 2017/18 (i.e. September to July), which relates to the period immediately following the Women’s CWC.

3.5.4 Men’s Cricket World Cup – Event group 2 v control group 2

Focus on adults

The baseline year for the Men’s Cricket World Cup was 2017/18. As shown in Figure 3.8, the cricket participation rates in this year were consistently lower than in 2015/16, which indicates that the ALS 2017/18 data are not likely to be contaminated by the potential influence of major ICC events held in England during 2017. For consistency with the Women’s CWC analysis, the index scores for each year are presented relative to 2015/16.

Figure 3.8: Cricket participation rates among adults for event and control groups (left-hand side) and index scores (right-hand side) (2015/16 =100) – Men’s CWC 2019

Key findings:

For participation in the last year, the scores for the event group remained above the control group throughout the time series. Excluding 2019/20, the regular participation scores were also generally higher for the event group.
For both participation thresholds (any and regular), there was a statistically significant increase between the baseline year (2017/18) and the event year (2018/19) for the event group, but not for the control group.
For participation in the last year, the scores for both event and control groups from 2019/20 onwards have been significantly below their respective baselines.
The pattern of regular participation is more diverse. For this threshold, compared with their respective baselines, the scores for both event and control groups declined significantly in 2019/20, the score for the event group was significantly higher in 2021/22 and the score for the control group was significantly higher in 2022/23.
All eight of The Hundred franchises were affiliated with locations in the event group. The extent, if any, to which The Hundred contributed to these findings cannot be isolated in the data.

When analysing the ALS data for men only, the findings are highly consistent with the pattern for all adults. The only exception is that the increase in men’s regular participation score for the event group between 2017/18 and 2018/19 was not statistically significant.

Focus on children and young people

The baseline for the Men’s CWC is 2017/18 and the event coincided with the ALS data collection period for 2018/19. Therefore, the first post-event year for comparison was 2019/20. Figure 3.9 presents the cricket participation data for children and young people disaggregated by event and control groups.

Figure 3.9: Cricket participation rates among children for event and control groups (left-hand side) and index scores (right-hand side) (2017/18 =100)

A statistically significant increase between 2017/18 and 2018/19 was evident within both event and control groups, as was the significant decline in 2019/20. However, as demonstrated by the index scores, the decline among the event group in 2019/20 was less pronounced than the decline in the control group. In 2020/21, the score for the event group increased significantly whereas the score for the control group did not, relative to their respective baselines. The scores for the two subsequent years were significantly higher for both event and control groups compared with 2017/18.

When examining cricket participation by gender (boys and girls) and by setting (in school and outside school), the findings are similar to the general pattern of participation for children and young people.

3.5.5 Wellbeing

Wellbeing was examined in relation to the Men’s CWC only as there were no baseline wellbeing data available from ALS (pre 2017) for the Women’s CWC for comparison purposes. It is worth noting that enhancing the wellbeing of the population was not an outcome that the ICC or ECB stated in association with hosting the CWCs, so the likelihood of there being any material changes in holistic measures such as life satisfaction and happiness that could be attributed to the staging of these events is arguably low. This view is confirmed by the wellbeing scores for adults and children presented in Figure 3.10.

Figure 3.10: Average wellbeing scores for life satisfaction and happiness domains

At face value, there were marginal increases in the average scores for life satisfaction and happiness in the year of the Men’s CWC (2019) among adults nationally and within both event and control groups, which declined in the following year. The magnitude of these changes was too small to be able to conclude that they were genuine differences and attributable to the event.
For children and young people, the average wellbeing scores declined slightly in the event year and, in the case of happiness, there was a further marginal decline in the post-event year. Even for life satisfaction, where there was a small uplift in the average scores in the post-event year for the event group, the size of the effect (a change of 0.07, which equates to around 1%) is too small to be regarded as a genuine difference.
Our overall assessment of these data is that there is no evidence to suggest that the Men’s CWC contributed to improvements in wellbeing for adults or children. Furthermore, there is no rationale as to why a relatively short-term event such as the Men’s CWC would have an impact on measures of wellbeing that tend to be multi-faceted and not prone to significant fluctuations.

3.5.6 Club Finances

Total revenue

Figure 3.11 presents the annual revenue between 2015 and 2023 of the first-class county cricket clubs associated with the Women’s and/or Men’s CWC, relative to those that were not. The dashed black line denotes the average revenue across all 18 counties. The solid lines represent event groups:

Event group 1 (orange) includes clubs associated with the Women’s CWC;
Event group 2 (blue) includes clubs associated with the Men’s CWC; and,
Event group 3 (green) includes clubs associated with either event.

The corresponding control groups are represented by the dotted lines of the same colour.

Figure 3.11: Total revenue – comparison of event and control groups

Descriptive trends:

For event group 1 (Women’s CWC), revenue increased by 11% between the pre-event year (2016) and the event year (2017). The corresponding increase for control group 1 was 21%. In the immediate post-event year (2018), revenue declined among both groups but remained above their respective 2016 levels.
For event group 2 (Men’s CWC), revenue increased by 40% between the pre-event (2018) and event (2019) years, compared with a more modest increase of 13% for control group 2. There are marked year-on-year increases in revenue from 2020 to 2023 among the clubs in event group 2, whereas revenue for control group 2 exhibits a less pronounced growth pattern.
For event group 3 (either event), the trendline appears to mirror event group 2. Similarly, the trendline for control group 3 is virtually identical to control group 2.

Event impact:

Fixed Effects (event study). Figure 3.12 presents an event study plot for event group 3 (either event), showing the impact on total revenue of the CWCs for each year, relative to the reference year (2016). A statistically significant increase is evident in the years during which the CWCs were staged (2017 and 2019) and thereafter in 2021, but this increase is not sustained in the two subsequent years. Separate regressions were run to examine the Men’s and Women’s CWC in isolation. The analysis provided inconclusive results for the Women’s CWC, indicating that total revenue among the clubs in event group 1 was lower in the year of the Women’s CWC (2017). However, the pre-treatment period (2015) was statistically significant, suggesting that the parallel trends assumption was violated, which therefore limits the validity of the analysis. No statistically significant impacts at the 95% confidence level were identified for the Men’s CWC. However, total revenue was found to be higher for event group 2 compared with control group 2 at the 90% confidence level from 2019 onwards (except in 2020). There are also threats to the robustness of the analysis as one of the three pre-intervention periods (2016) is statistically significant, suggesting that there were genuine differences between event and control group prior to the Men’s CWC.

Figure 3.12: Total revenue – event study plot for event group 3
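To make the logic of an event study concrete, the sketch below computes year-by-year point estimates relative to the 2016 reference year. It rests on several assumptions that are ours, not the report's: the revenue figures are synthetic, the club levels and planted uplifts are hypothetical, and the panel is balanced with one event group and one control group. Under those conditions, the coefficients from a regression with club and year fixed effects coincide with the differences in group means computed here. The sketch produces point estimates only, with no standard errors or confidence intervals.

```python
from statistics import mean

YEARS = list(range(2015, 2024))
REFERENCE_YEAR = 2016

# HYPOTHETICAL uplifts (in £m) planted in the event years, for illustration only.
PLANTED_UPLIFT = {2017: 1.0, 2019: 2.0}


def synthetic_revenue(club_level, treated):
    """SYNTHETIC revenue series: club level + common trend (+ planted uplift)."""
    return {
        year: club_level
        + 0.5 * (year - 2015)
        + (PLANTED_UPLIFT.get(year, 0.0) if treated else 0.0)
        for year in YEARS
    }


# Hypothetical clubs: the event group starts from a higher revenue level.
event_clubs = [synthetic_revenue(level, True) for level in (10.0, 14.0, 18.0)]
control_clubs = [synthetic_revenue(level, False) for level in (4.0, 6.0, 8.0, 9.0)]


def event_study(event, control, reference_year=REFERENCE_YEAR):
    """Event-minus-control gap in each year, relative to the reference-year gap."""
    gap = {
        year: mean(club[year] for club in event) - mean(club[year] for club in control)
        for year in YEARS
    }
    return {year: gap[year] - gap[reference_year] for year in YEARS}


coefficients = event_study(event_clubs, control_clubs)

for year, value in coefficients.items():
    period = "pre " if year < REFERENCE_YEAR else "ref " if year == REFERENCE_YEAR else "post"
    print(f"{year} ({period}): {value:+.2f}")
```

The pre-reference estimate (2015 here) acts as the parallel trends check described above: in this synthetic panel it is zero by construction, whereas a value clearly different from zero in real data would indicate that the groups were already diverging before the event.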

Fixed Effects (single binary variable). We also tested for statistically significant effects using a single binary variable, which collapses all post-treatment periods into a single coefficient. Statistically significant increases in total revenue were observed for both event groups 2 and 3 at the 90% confidence level, but not at the more commonly used 95% confidence level. The analysis suggested that clubs affiliated to the venues which hosted the Women’s or Men’s CWC matches (event group 3) saw their revenues increase by around £1.6m compared with the control group. The corresponding relative increase among clubs affiliated to the venues which hosted Men’s CWC matches only (event group 2) was £3.4m.
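The single binary variable approach can be sketched as a two-way fixed effects difference-in-differences estimate. The example below is a minimal illustration under stated assumptions, not the specification used for this report: the panel is synthetic and balanced, the club names and levels are hypothetical, treatment starts in 2019 for every treated club, and an effect of £3.4m is planted in the data so that the estimator can be seen to recover it. For a balanced panel, removing club and year fixed effects is equivalent to subtracting club means and year means and adding back the grand mean.

```python
YEARS = list(range(2015, 2024))
FIRST_POST_YEAR = 2019
PLANTED_EFFECT = 3.4  # £m, planted in the SYNTHETIC data for illustration

# HYPOTHETICAL clubs: name -> (baseline revenue level in £m, treated?)
clubs = {
    "A": (12.0, True),
    "B": (15.0, True),
    "C": (5.0, False),
    "D": (7.0, False),
    "E": (6.0, False),
}


def post_treatment(treated, year):
    """Single binary variable: 1 for treated clubs from the event year onwards."""
    return 1.0 if treated and year >= FIRST_POST_YEAR else 0.0


# Rows are clubs, columns are years.
treatment = [[post_treatment(t, y) for y in YEARS] for _, t in clubs.values()]
revenue = [
    [level + 0.4 * (y - 2015) + PLANTED_EFFECT * post_treatment(t, y) for y in YEARS]
    for level, t in clubs.values()
]


def double_demean(panel):
    """Subtract club and year means, add back the grand mean (balanced panel only)."""
    n_rows, n_cols = len(panel), len(panel[0])
    row_means = [sum(row) / n_cols for row in panel]
    col_means = [sum(panel[i][j] for i in range(n_rows)) / n_rows for j in range(n_cols)]
    grand_mean = sum(row_means) / n_rows
    return [
        [panel[i][j] - row_means[i] - col_means[j] + grand_mean for j in range(n_cols)]
        for i in range(n_rows)
    ]


def twfe_did(outcome, treat):
    """OLS slope of the demeaned outcome on the demeaned treatment indicator."""
    y, d = double_demean(outcome), double_demean(treat)
    numerator = sum(yv * dv for y_row, d_row in zip(y, d) for yv, dv in zip(y_row, d_row))
    denominator = sum(dv * dv for d_row in d for dv in d_row)
    return numerator / denominator


estimate = twfe_did(revenue, treatment)
print(f"Estimated post-treatment effect: £{estimate:.2f}m")
```

A single coefficient of this kind averages the effect over all post-treatment years, which is why it can fail to reach significance even when individual years in an event study do. Judging significance would additionally require standard errors, which this sketch does not compute.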

Net Profit

Figure 3.13 presents the net profit between 2015 and 2023 of the first-class county cricket clubs associated with the Women’s and/or Men’s CWC, compared with those that were not.

Figure 3.13: Net Profit – comparison of event and control groups

Descriptive trends:

For event group 1 (Women’s CWC), net profit increased by 70% between the reference (2016) and the event year (2017). For control group 1, profitability increased by a considerably greater margin (293%). In the immediate post-event year (2018), profitability declined among both groups and was below its 2016 level for event group 1.
For event group 2 (Men’s CWC), profitability increased by 2113% between the pre-event (2018) and event (2019) years, compared with an increase of 172% for control group 2. From 2021 onwards, event group 2 has remained more profitable than control group 2 (the latter is loss-making in 2022 and 2023).
For event group 3 (either event), the trendline appears to mirror event group 2. Similarly, the trendline for control group 3 is broadly comparable to control group 2.

Event impact:

Fixed Effects (event study). Figure 3.14 presents an event study plot for event group 3 (either event), showing the impact on net profit of the CWCs for each year, relative to the reference year (2016). Across all post intervention periods (i.e. 2017-2023), no statistically significant effects were observed at the 95% confidence level. At the 90% confidence level, significant increases in net profit were observed in 2017 and 2019 – corresponding to the years of the Women’s and Men’s CWC respectively. The Women’s and Men’s CWC were also examined separately. The Women’s CWC exhibited statistically significant impacts in 2017 and 2018 at the 90% and 95% confidence levels respectively. The results indicated that clubs affiliated to venues that hosted Women’s CWC matches had lower net profit than clubs in the control group. When looking at the Men’s CWC in isolation, statistically significant increases in net profit are observed at the 90% confidence level in 2019 and 2021.

Figure 3.14: Net profit – event study plot for Women’s and Men’s CWCs (event group 3 v control group 3)

Fixed Effects (single binary variable). When using a single binary variable that collapses all post-treatment periods into a single coefficient, the result is not statistically significant. Separate analyses of the Women’s and Men’s CWCs also produced no statistically significant findings.

Other financial metrics:

The findings for the other financial metrics examined are summarised in Table 3.9.

Table 3.9: Summary of findings for other financial metrics

Event	Reference year	Post-treatment year(s)	Event study	Single binary variable
Membership income
Women’s CWC	2016	2017-18	Insignificant	Insignificant
Men’s CWC	2018	2019-23	Event group 2 > control group 2 from 2019 onwards***	Event group 2 > control group 2 post treatment******
Either event	2016	2017-23	Event group 3 > control group 3 from 2020 onwards***	Event group 3 > control group 3 post treatment**
Domestic match income
Women’s CWC	2016	2017-18	Insignificant	Insignificant
Men’s CWC	2018	2019-23	Event group 2 > control group 2 in 2021 and 2023**	Event group 2 > control group 2 post treatment**
Either event	2016	2017-23	Event group 3 > control group 3 in 2021 and 2023***	Event group 3 > control group 3 post treatment**
Major match income (within event group only)
Women’s CWC	2016	2017-18	Insufficient data for analysis	Insignificant
Men’s CWC	2018	2019-23	2018 > 2020**	Significant increase post treatment**
Either event	2016	2017-23	2019 > 2016; 2016 > 2017, 2018 & 2020*	Significant increase post treatment**

* p<0.01; ** p<0.05; *** p<0.1

Analysis of financial statements carries a caveat: although we apply a standardised form of analysis to the data, the reliability of the findings can be compromised by factors such as differences in accounting policies within counties, different year end dates, and different interpretations of data by auditors and accountants.

3.6 Learning

3.6.1 What did we learn about the methods for quantifying legacy and impact?

What were the methods able to show us?

In the absence of a clearly articulated programme of leveraging activity, the methods employed at least enabled us to illustrate whether there had been any statistically significant impacts, regardless of whether they were intended or not. The grouping of venues that played a role in either or both CWCs along with the availability of logically derived control groups, provided the basis for more rigorous analysis than descriptive statistics alone.

The methods also tell us that real life does not occur in a vacuum and that there can be positive (e.g. The Hundred), negative (e.g. COVID-19) and random (e.g. venue capacity) confounding factors that will affect the magnitude and interpretation of any impacts. A key learning point is to frame evaluations within the context of Preuss’ Legacy Cube and to focus initially on the planned outcomes rather than trying to identify outcomes that were incidental to what was intended.

Previous research has indicated that the impacts of major sports events can be lagged by between two and four years^{[footnote 19]}. In the case of two CWCs within the space of two years, it is therefore difficult to isolate whether effects in subsequent years are attributable to one event in isolation or the cumulative effect of multiple events – as well as non-controllable events in the external environment.

Which changes to the method improved estimation (where we used multiple techniques)?

Following the initial analysis of comparing treatment groups with control groups on a descriptive basis, it was possible to improve the rigour of our testing by making selective use of Difference in Difference (DiD) analysis. This technique enabled us to test more robustly whether the differences identified in the County Cricket Clubs’ financial data were statistically significant and persistent in the aftermath of the events.

With the participation data, although it was limited by being cross-sectional, it was possible to test for different demographics (age and gender) and different participation thresholds. This analysis provides more nuanced insight into the type of any participation impact created. While the cross-sectional participation data did not lend itself to a conventional form of DiD analysis, the interrogation of cricket participation rates over time among event and control groups provided an alternative analytical lens.

Any practical issues that prevented the use of methods?

The retrospective nature of the work and the limited articulation of what the legacies of the CWCs should be resulted in us having to ‘bolt on’ rather than ‘bolt in’ the techniques used. Ideally, legacy impact evaluations should be integral to event planning and delivery. If this point is accepted, then the measurement of legacy impacts must be consistent with the Theory of Change and robust enough to measure and attribute change. This point is made strongly in the Gold Framework in which UK Sport and DCMS^{[footnote 20]} state:

Event organisers must consider from an early stage how the impact of an event will be measured throughout the time before, during and after the event has concluded. This is a critical factor in securing any government investment for an event, where monitoring and evaluation of the proposed outcomes is crucial to demonstrating value for money. (Page 80)

Despite the best of intentions, it was not possible to employ meaningful counterfactual arguments to underpin the analysis. Initial thinking was that those locations which hosted CWC matches would be where it would be most likely to see positive changes in participation. However, the reality that people could witness the events on television, and that there was no guarantee that those attending the events were local to the host venues, may have compromised the validity of the control groups.

Implications for monitoring activities and are there significant gaps in available data?

The key findings for monitoring activities in future evaluations are twofold. First, there needs to be a clear Theory of Change, Programme Theory, or Logic Model to outline the planned impacts, the mechanisms by which the impacts will be achieved, and the measures by which success or failure can be assessed. Second, it follows that any data sources that are used must be fit for their intended purpose. We have found ourselves having to make numerous compromises and assumptions to deal with vagaries in the outcomes and limitations in the data sources, which in turn impact upon methodological choices. Examples include, but are not limited to:

There was no pre-event baseline for the Women’s CWC for cricket participation among children and young people;
Pragmatic grouping of geographical units was employed to account for some host venues being located outside the geographical remit of the county cricket clubs;
The unequal sizes of treatment groups meant we could not match them with corresponding control groups; and
Participation data in both Active Lives Surveys are cross-sectional rather than longitudinal or panel data, which in turn limits the rigour of statistical tests that can be applied.

3.6.2 What do the findings tell us about legacy and impact?

Legacy does not occur by a process of social osmosis, whereby staging an event is sufficient on its own to bring about desirable outcomes. It is a widely accepted principle in the academic literature that events can act as a platform from which it might be possible to bring about change subject to a programme of leveraging activity that accompanies the event.

The CWC events were commercial events for which the end goal was to deliver them profitably for the ICC and its commercial and broadcast partners. The events were not supported with public money. Two large scale leveraging activities are known: the Cricket World Cup Schools Programme and the Dream Big Desi Women programme. The latter ran for 4 years from 2018, with a public investment of £1.2m from Sport England and matched funding from the England and Wales Cricket Board, to recruit 2,000 South Asian female cricket leaders. (The programme exceeded its initial target and recruited over 5,500 female activators over a five year period.)

Although the evidence on participation increases is mixed and nuanced, one point is clear: any potential increases in participation were not sustained in the medium term. At best, there is plausible evidence that the events supported short-term spikes in cricket participation among certain population segments and contributed to the ongoing financial sustainability of first-class county cricket clubs.

The legacy of major sporting events is a complex interplay of economic gains, social impacts, and environmental costs. Success largely depends on governance, planning, and the involvement of local communities. In other words, legacy is crucially dependent on plans to lever it, and therefore a laissez-faire approach to legacy is not a strategy that can be relied on to bring about desired outcomes.

Question 5

4. City of Culture Programmes

Accepted Answer

This case study explores a set of methodologies that could be applied to evaluate the UK City of Culture (UK CoC) programme or similar cultural events. The methodologies seek to overcome current limitations in measuring legacy through the broader framework which the case studies and subsequent toolkit provide. This case study contributes to the broader framework for measuring the legacy impacts of major cultural events. However, this case study does not aim to evaluate the UK CoC programme itself. Instead, it experiments with the feasibility and comprehensiveness of using various analytical techniques to isolate the effects of the programme. The metrics of tourism impact, wellbeing, impact on public funding, and cultural participation have been selected as they feature in evaluations of UK CoC titleholders to date and have been previously identified in bidding guidance for the title.

The analysis presented in this case study does not encompass all potential outcomes of the UK CoC programme but focuses on a specific subset. Therefore, this case study should not be used as a definitive measure of the programme’s overall success but rather as an exploration of methodological approaches for future evaluations.

The City of Culture Programme case study offers the opportunity to test how effectively different methods can be applied for major events in a competitive series but should not be considered a comprehensive evaluation.

4.1 Background and Categorisation

4.1.1 Type of Major Event

The UK CoC programme is a culture-led major event with the aim of delivering a year-long celebration of arts and culture. The purpose of the UK CoC is to drive culture-led regeneration, enhance social cohesion and foster economic growth in designated host cities. Unlike one-off events (or events which occur over a short duration) which involve a degree of strategic planning, a UK CoC requires more in-depth strategic planning to create benefits that commence upon the title designation and extend beyond the title year. Each UK CoC is intended to be a celebration of the host city’s unique heritage and identity through performance, exhibitions, festivals and participatory initiatives.

This event type is characterised by its dual focus on cultural and socio-economic impact.

4.1.2 Focus of the event

The UK CoC programme positions culture as a catalyst for economic growth, social cohesion, and regeneration. Awarded every four years by the DCMS, the programme encourages host places to craft transformative, long-term strategies incorporating culture as an enabler to attract investment, boost tourism, and create sustainable employment opportunities in the cultural, creative and hospitality sectors. Efforts to increase access to cultural activities in underserved communities are also integral, with the aim of fostering an inclusive sense of pride and belonging.

Legacy is a key criterion of the award, with outcomes and impacts expected to extend beyond a title year. Prompts around legacy in the bidding documentation and guidance include improving cultural infrastructure, promoting sustained community engagement, and strengthening local economies. Through culture-led regeneration, the programme seeks to unite communities and demonstrate the transformative power of arts and culture.

4.1.3 Scale and Geography

The UK CoC programme operates on a national scale and is open to cities, towns, groups of places and regions across the United Kingdom. Subsequent to this research, a dedicated UK Town of Culture competition has been launched, creating a distinct platform specifically for towns to showcase original storytelling, and empowering, accessible culture. Since inception in 2009, 71 expressions of interest have been submitted, highlighting a broad geographic range. In the bidding process, shortlisted and winning cities tend to emerge from areas of higher deprivation, such as the Midlands, the North of England, and Northern Ireland.

Winning places have been increasing in size and scope, from the smaller city of Derry/Londonderry in 2013, with a population of around 108,000, to the larger city of Bradford, the 2025 titleholder, whose district encompasses over 546,000 people. The programme increasingly encourages bids that engage not just city centres but wider local and regional areas, ensuring the benefits reach more communities. Approximately 15% of the UK population lives in areas that have bid for the title, demonstrating the programme’s significant reach.

4.1.4 Importance

The UK CoC programme aims to elevate the national and international profile of host cities, positioning them as vibrant cultural destinations. By developing collaboration among civic, cultural, and business stakeholders, the programme establishes sustainable partnerships and addresses disparities in cultural investment. The programme’s transformative impacts in host cities to date include increased tourism, community revitalisation, and expanded cultural participation.

Marking the local importance of the programme, economic analysis from a DCMS-commissioned Evidence Review of the programme highlighted that since inception, public funding of £61.7 million has catalysed over £1 billion in additional investment into the first three titleholders’ local economies, with approximately 25% of additional investment coming from private sources.

4.1.5 Competitive process

The UK CoC title is awarded through a competitive multi-stage process beginning with the submission of an expression of interest, where places outline cultural strategies and visions for long-term impacts.

The DCMS criteria for the bidding process have evolved over the life of the programme in response to the intense competition for the award and changes in national cultural priorities and accountability, from four broad criteria areas in 2009 for the inaugural titleholder, to ten for the 2025 title holder.

Costs of bidding for the title range from £50,000 to £1.5 million, and successful bids are often rooted in a place’s existing cultural strategy. For the 2025 competition round, the DCMS awarded £40,000 to longlisted places to support their applications and provided £125,000 to the three shortlisted places who did not secure the title to allow them to realise elements of their bids.

4.1.6 Duration

The UK CoC programme spans one calendar year, typically commencing January 1 and concluding December 31. However, planning begins typically two to three years earlier, as cities leverage the bidding process to build partnerships, enhance infrastructure, and engage communities. Legacy outcomes, including economic benefits and cultural enhancements, extend well beyond the official year.

4.1.7 Catalysts

The UK CoC programme drives transformation by using culture to regenerate areas, boost economies, and strengthen social cohesion. It attracts investment, increases tourism, and elevates the cultural identity of host cities, serving as a powerful driver of renewal.

Community engagement is at the core of a UK CoC: inclusive initiatives foster civic pride and social bonds, while public and private investments fund cultural infrastructure, enhance public spaces, and create lasting opportunities. Even unsuccessful bids yield value by encouraging citizen input in local decision-making and increasing place partnerships.

Collaboration between government, cultural organisations, businesses, and educators is key, delivering innovative projects and building foundations for sustainable legacies.

4.1.8 Construction of infrastructure

Host cities often utilise the title to catalyse infrastructure projects, with a focus on upgrading existing cultural and civic facilities rather than new construction projects. For instance:

Derry/Londonderry in 2013 generated £160 million in public space improvements.
Hull in 2017 leveraged over £676 million in investments for venue refurbishments and urban enhancements.
Coventry in 2021 similarly secured over £180 million for infrastructure upgrades.

Projects relating to infrastructure typically extend beyond the event year to leave a lasting legacy by enhancing the city’s cultural landscape, improving accessibility of cultural venues, and stimulating further economic growth.

In Hull and Coventry, new venues were created and current venues redeveloped. Hull’s Humber Street Gallery actively supports visual artists in the city who were historically underserved.

4.2 Event objectives and legacy strategy

The objectives of a major event may change through the event lifecycle, and so need to be flexible enough to be updated. However, where primary data collection is necessary it is important to collect a baseline, and therefore agreeing the initial objectives is particularly important. Previous evaluations of UK CoC titleholders have highlighted significant localised outcomes, and host cities are required to consider long-term legacy planning. However, evaluations to date have not looked at legacy.

Objectives of a UK CoC include cultural transformation, economic growth, and social cohesion. Key strategies involve empowering communities through inclusive cultural programming, co-creating the delivery programme and legacy plans, and embedding sustainability practices. Legacy planning typically encompasses five impact areas:

Economic Impact
Sector Development/Stability
Health and Wellbeing
Social and Cultural Value
Environmental Sustainability

Host cities are required to produce a Theory of Change, demonstrating how the programme will use culture as a transformative force, fostering continuing partnerships, enhancing cultural participation, and driving economic renewal.

4.2.1 A Common Theory of change for the UK City of Culture Programme

Using the objectives set out above and the common impact areas sought across all UK City of Culture hosts so far, we have established a common theory of change for the UK City of Culture Programme for the purposes of this research. It does not include all objectives and impacts; instead, it identifies the objectives and outcomes that we have decided to focus on as part of this research.

The theory of change presented here brings together the common elements from the evaluations of the UK CoC titleholders to date.

Presenting each theory of change from titleholders so far would be a challenge: Derry/Londonderry utilised a benefits realisation plan which listed the benefits sought, Hull had theories of change across all impact areas sought (five theories of change in total), and Coventry had a singular unifying theory of change. These can be viewed in the retrospective evaluations.

Figure 4.1: Proposed Theory of Change based on the criteria of previous City of Culture competitions for the purposes of this research

4.3 Choosing Indicators and data sources to focus the case study on

For the purposes of this case study, we are restricted to secondary data and are therefore constrained in what can be measured compared to a typical evaluation. The table below sets out a long list of outcomes based on the event objectives which have been used to measure the objectives of the UK City of Culture Programme, and the relevant indicators for each. A reason is also given for why each indicator was included in or excluded from this case study methodological experiment.

Table 4.1: Indicator long list for the UK City of Culture programme and reasons for including or excluding from case study

Objective	Indicator	Reason for Inclusion/exclusion from analysis?
Economic Impact	Employment Counts (BRES)	Can be analysed for DCMS SIC/SOC codes and compared against shortlisted cities through a difference-in-difference approach; therefore included in the analysis here and below.
	UK Business Counts by Industry (ONS)	Interesting comparator against shortlisted cities through a difference-in-difference approach; therefore included in the analysis.
	GVA by Industry (Local Authority Level, ONS)	Interesting comparator against shortlisted cities through a difference-in-difference approach; therefore included in the analysis.
	Local Government Spend on Arts, Heritage and Tourism	Using financial data from local authority returns to understand the cultural scene in host places, therefore included in the analysis.
Sector Development/Stability	Total Expenditure on Arts, Tourism, & The Historic Environment Per Head (Ministry of Housing, Communities and Local Government)	As the local authority actively plays a role in the bidding for and delivery of a UK CoC (through underwriting certain aspects) these metrics are in relation to income and expenditure and therefore should be included in the analysis.
	Income From Arts, Tourism, & The Historic Environment Per Head (Ministry of Housing, Communities and Local Government)	As above.
	Funding Data (Arts Council England, Arts Council of Northern Ireland)	As above.
Health and Wellbeing	ONS 4 Subjective Wellbeing Measures at the local authority level.	Understanding Society and the Annual Population Survey can be used to understand whether there have been changes to wellbeing at a population level. As this is a consistent focus of a UK CoC, as shown through bidding documentation, it is included in the analysis.
Social and Cultural Value	Cultural Participation (Active Lives)	For the majority of the programme’s lifetime, the cultural engagement questions commissioned by Arts Council England are a good indication of participation at local authority level; therefore included in the analysis.
	Cultural Participation (Taking Part)	Taking Part ran from 2005 until 2020 and was at the time the flagship survey from the DCMS. Results can only be drawn down to a regional level and are therefore not granular enough for inclusion in this analysis.
	Social Cohesion Metrics (Understanding Society, Northern Ireland Continuous Household Survey, Community Life)	Metrics available at the local authority level would be useful for understanding the impact of the UK CoC on communities. Some metrics are available at a regional level which, while interesting, do not provide the granularity needed for meaningful analysis.
	Volunteering Metrics (Understanding Society, Community Life)	Volunteering is a key part of the UK CoC delivery model, so volunteering data would be suitable for inclusion if available at the right granularity. Unfortunately, for this study this is not the case.
Environmental Sustainability		In 2018, DEFRA launched a range of environmental metrics. As these are recent and do not fully cover the relevant years of early UK CoCs, they are excluded from this analysis.

4.3.2 Overview of Methodologies in current UK CoC evaluations

The following overview of methodologies is taken from the evaluations of the three titleholders of the UK CoC programme to date and what is planned in Bradford. Citizen/household surveys have been a common evaluation method across all UK CoCs to date, including Bradford 2025^{[footnote 21]}. These surveys aim to establish a baseline and track changes in various metrics over time. Hull’s approach involved a specific citizen survey conducted in 2016 and 2018, alongside citizen panels. While the survey was representative, the citizen panels were not. They did however provide insight into how the programme was being perceived. Derry/Londonderry aligned its household survey metrics with the Continuous Household Survey of Northern Ireland, although given the small and non-representative sample sizes, caution is needed when interpreting the data.

Coventry’s evaluation relied heavily on a representative Household Survey, conducted in 2018, 2021, 2022 and 2024, which provided reliable data on long-term changes in cultural consumption and perceptions of the city. Further, Coventry aligned its Household Survey with metrics used in surveys such as Understanding Society and Community Life, which allows the other cities in those datasets to act as a de facto control. However, when analysing Coventry’s data against the national datasets there was heavy contamination from COVID-19 and from the Commonwealth Games 2022, which took place immediately after the UK CoC 2021 in a directly neighbouring geography, meaning any causal links in the data cannot be accurately defined. Patterns could be identified among those who reported engaging with the UK CoC 2021, which could then be compared to the general population in Coventry and then the region.

All UK CoCs have gathered monitoring data from various stakeholders, including audiences, beneficiaries, and performers, as well as data related to media and financial aspects. However, not all of this data has been incorporated into the evaluations because the data may not link specifically to outcomes.

Table 4.2: Overview of evaluation methodologies utilised/planned in the evaluations published/underway for each round of the UK CoC competition to date.

	Derry / Londonderry 2013	Hull 2017	Coventry 2021	Bradford 2025
Framework/Evaluation Plan
Benefits Realisation Plan	X
Theory of Change		X	X	X
Population Level Surveying
Citizen/Household Survey	X	X	X	X
UK-Wide Perception Survey		X
Local Perception Survey			X	X
Tourism Evaluation/Monitoring
Tourism Study	X	X	X
Tourism Visitor Survey	X		X	X
Media/Broadcast Study			X	X
Monitoring Data / Project Reporting
Capturing of Monitoring Data	X	X	X	X
Hero Project Evaluations/Focus Studies		X	X	X
Individual Post Project Evaluations	X
Audience Surveys	X	X	X	X
Volunteering Survey		X	X	X
Artist/Producer Survey				X
School Survey	X	X
Interviews/Focus Groups		X	X	X
Economic Impact Assessment
Economic Impact Assessment Study		X	X	X
Business Survey		X
Environmental Impact
Environmental Impact Monitoring				X
Social Value
Social Return On Investment Study			X	X

4.3.3 Timing of evaluations

Evaluating a UK City of Culture programme is a multi-phase process covering short- and some long-term impacts. While short-term effects (effects within 1 to 2 years post UK CoC year) are well-documented, long-term impact assessment (effects extending beyond 2 years) is often limited by resource constraints.

The evaluation typically begins 1 to 2 years before the title is secured, with baseline data collection on economic activity, cultural engagement, and social wellbeing. Frameworks are established, and stakeholders engaged to align metrics with programme objectives. During the event year, real-time monitoring tracks visitor numbers, participation, and economic impact, with interim assessments guiding adjustments.

Post-event evaluations are typically completed within 1 to 2 years of the title year, providing ‘final reporting’. The evaluation of Derry/Londonderry (UK CoC 2013) was released in January 2018, following a delay due to Derry City and Strabane District Council being established in 2014 before formally coming into being on 1 April 2015 as a consequence of the 2014 Northern Ireland Local Government Reform. Hull’s evaluation (UK CoC 2017) was initially launched in November 2019, with a revision in April 2021. Coventry’s evaluation (UK CoC 2021) was published in November 2023, demonstrating a similar post-event timeline.

Table 4.3: Table of evaluations, launch dates and who undertook the evaluation

	Release Date:	Available At:	Undertaken By:
2013: Derry/Londonderry	January 2018	Currently not available, however the Post Project Evaluation Report (except for the appendices) is available through the University of Warwick.	Derry City & Strabane District Council, and external partners
2017: Hull	November 2019 and revised in April 2021	http://www.citiesofculture.co.uk	Culture, Place and Policy Institute – University of Hull and external partners
2021: Coventry	November 2023	http://www.coventry21evaluation.info	University of Warwick, Coventry University, Coventry City Council, and external partners

To date there has been no detailed study into the legacy impacts of the UK City of Culture at a place-based level, because places have not had the resource or capacity to undertake such work.

4.3.4 Aggregate findings from previous evaluations

Analysis from the value for money assessment into the UK City of Culture programme^{[footnote 22]}, which brought previous evaluation findings together, found that the UK City of Culture programme has to date cost £103.1 million to deliver, of which £61.7 million has come from public funding via central government or through National Lottery grants. It further found that, at a headline level, the programme has generated:

Over £1 billion invested into local economies of host cities, with 25% from the private sector.
More than £100 million in additional GVA across two titleholders, with job creation and increased visitor spending.
Cultural infrastructure strengthened, with ACE National Portfolio funding increasing by 35.7% in Hull and 79.3% in Coventry.
Creative industries and tourism sector growth, with 3,100 additional jobs in tourism and hospitality.
Over 7,500 trained volunteers contributing 374,000 hours, improving skills, wellbeing, and social connections.
Wellbeing scores improved among participants, with increased life satisfaction and reduced anxiety.
Over 3,800 events and activities delivered, engaging 539,209 local citizens and attracting 2 million visitors.
70%+ of attendees felt a greater sense of pride in their city, with 90% of volunteers feeling strong local belonging.
Increased public engagement in cultural activities, particularly in deprived communities (e.g., 83% participation in Derry/Londonderry).
Lasting impact on cultural participation, with Coventry seeing a 14%-36% rise in engagement in key areas from 2018 onwards.

4.4 Case-Study Methodology

To overcome current limitations in understanding the legacy of the UK CoC programme, especially the lack of longitudinal approaches that enable effects to be isolated, this study evaluates the long-term legacy impacts of the UK CoC programme using a difference-in-differences (DiD) approach. The analysis compares host cities with shortlisted but unsuccessful cities to construct a counterfactual, in order to identify whether statistically significant differences can be attributed causally to the programme.

To ensure a comprehensive assessment, the study also incorporates trends analysis alongside DiD, using national datasets and spatial analysis to assess both direct and spillover effects. As a methodological starting point, examining trends before applying the DiD methods helps to contextualise variability within an indicator over time, offering insight into the stability of the data and the extent to which observed changes can be attributed to the programme rather than natural fluctuations.

Additionally, identifying peaks or shifts before or after the intervention can highlight other potential influencing factors, such as concurrent events, national/localised policy changes, or external shocks like COVID-19 which heavily contaminates datasets from 2020 onwards. This broader perspective strengthens the assessment by ensuring that programme effects are not considered in isolation, but within the wider economic and policy landscape, enhancing the reliability of the findings.

As the cornerstone of this study, the use of DiD methodologies offers a causal interpretation of the programme’s effect. This econometric approach compares the differences in outcomes before and after the intervention between treatment and control groups, helping to eliminate confounding factors that might otherwise skew results.

The use of DiD enables this study to:

Isolate programme effects by controlling for baseline differences between host and control cities;
Identify statistical significance in observed changes across key indicators;
Account for confounding factors, such as macroeconomic trends and policy shifts, which might otherwise obscure the actual effects of the programme;
Begin to assess effect heterogeneity, determining whether the programme had a differential impact across different types of cities (e.g., large vs small, economically developed vs struggling cities, urban vs rural), although this would require further expansion in future studies.
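
The core DiD calculation described above can be sketched as follows. This is a minimal illustration in Python, not the specification used in this study: the city labels and outcome values are hypothetical, and the function names are assumptions introduced for the example.

```python
from statistics import mean

# Hypothetical outcome values (e.g. an index of business counts); not real data.
# Each record is (city, group, period, outcome).
records = [
    ("Host", "treated", "pre", 100.0),
    ("Host", "treated", "post", 112.0),
    ("Shortlisted A", "control", "pre", 90.0),
    ("Shortlisted A", "control", "post", 96.0),
    ("Shortlisted B", "control", "pre", 110.0),
    ("Shortlisted B", "control", "post", 114.0),
]


def group_mean(rows, group, period):
    """Average outcome for one group in one period."""
    return mean(y for _, g, p, y in rows if g == group and p == period)


def did_estimate(rows):
    """Change in the treated group minus change in the control group."""
    treated_change = group_mean(rows, "treated", "post") - group_mean(rows, "treated", "pre")
    control_change = group_mean(rows, "control", "post") - group_mean(rows, "control", "pre")
    return treated_change - control_change


print(f"DiD estimate: {did_estimate(records):.1f}")
```

Subtracting the control group's change removes any movement shared by both groups, which is how the approach controls for baseline differences and common trends. The sketch gives a point estimate only; assessing statistical significance would require a regression framework with appropriate standard errors.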

For this study, a key challenge is data availability, as previous evaluations of UK CoCs have mainly demonstrated short-term effects. For impact areas such as cultural participation, social value, health, and wellbeing, host cities have historically relied on local data collection, which has not been consistently sustained in the years following the programme. This necessitates some retrofitting of available data.

The datasets in the following table serve as a starting point for this analysis.

Table 4.4: Data sources for the UK City of Culture programme legacy methods case study

Data Source	Description of data source	Indicators
Active Lives	Commissioned by Sport England, Arts Council England and collected by IPSOS, Active Lives asks questions on cultural participation; unfortunately, the data are only available at the local authority level and cannot be disaggregated further.	Cultural Participation Rates at a Local Authority Level.
Arts Council England	Funding data from ACE indicate how much public funding arts and culture projects are receiving within a local authority.	Funding levels for Arts and Culture at a Local Authority Level.
Arts Council of Northern Ireland	Funding data from the Arts Council of Northern Ireland demonstrate public funding in NI to arts and culture projects.	Local Area Funding Breakdown.
Northern Ireland Tourism Data	Data held by the Northern Ireland Statistics and Research Agency allow for tourism data to be broken down to local authority district levels, allowing a direct comparison with other areas in NI.	Tourism Volume and Value.
Ministry of Housing, Communities, and Local Government	Reported local authority income and expenditure data.	Tourism Expenditure. Tourism Income. Heritage Expenditure. Heritage Income. Arts Expenditure. Arts Income.
Event Evaluations	Reported evaluations from titleholders to date.	Individual UK CoC Metrics Demonstrating Reach.
Office for National Statistics	Relevant economic data. Wellbeing data.	Employment Counts. UK Business Counts. GVA by Industry. Wellbeing Measures from the Annual Population Survey.

4.4.1 Defining a treatment area/period

The study focuses on the first two UK CoC titleholders: Derry/Londonderry in 2013 and Hull in 2017. Coventry in 2021 is excluded due to insufficient post-programme data.

Given the localised nature of UK CoC, treatment areas are defined as:

The official administrative boundaries of the host cities.
Cultural districts within these cities where significant programme activities were concentrated.
Adjacent localities likely to experience spillover effects, such as increased tourism, business activity, or cultural engagement.

To capture the broader regional impact of the UK CoC, spatial analysis was employed to track changes in economic, social, and cultural indicators beyond city boundaries. Additionally, longitudinal data were used to evaluate whether spillover effects are temporary or persist long after the programme has ended.

Further, this study builds on a recent difference-in-difference analysis of local business growth, refining treatment areas to isolate UK CoC impacts from broader trends and more accurately measure its legacy.^{[footnote 23]}

Figure 4.2: Areas included in the analysis of Derry/Londonderry 2013

Areas include: Derry City and Strabane (intervention site marked in green), Antrim and Newtownabbey, Ards and North Down, Armagh City, Banbridge and Craigavon, Belfast City, Causeway Coast and Glens, Fermanagh and Omagh, Lisburn and Castlereagh City, Mid and East Antrim, Mid Ulster, and Newry, Mourne and Down.

Figure 4.3: Areas included in the analysis of Hull UK City of Culture 2017.

Areas include: City of Kingston upon Hull (intervention site marked in green), East Riding of Yorkshire, North Lincolnshire, North East Lincolnshire and York (nearest major city).

To explore spillover effects in more detail, more granular datasets, where available (at the LSOA or MSOA level instead of the local authority level used in this study), would be beneficial. If such data were available, an analysis of effects within a set radius, similar to the approach used in the case study on the London 2012 Games, could provide a clearer understanding of the impacts of the event.

In terms of treatment period, this study acknowledges that the UK CoC programme occurs over many years. The analysis featured in this study uses the title year as the treatment period (i.e., 2013 for the UK CoC 2013 and 2017 for the UK CoC 2017), as this year is the pinnacle of the treatment and where the majority of outputs are delivered. It should be recognised that some benefits occur before and after the year.

4.4.2 Defining a counterfactual/control groups

A robust counterfactual is crucial to estimating what would have occurred in the absence of the UK CoC designation. The study primarily uses shortlisted but unsuccessful cities as a control group due to their similar pre-treatment characteristics, including cultural ambition and strategic planning. However, using shortlisted cities presents certain limitations:

Bidding cities often experience positive effects from the application process alone, including increased cultural funding and civic engagement. Using them as controls may therefore lead to the programme’s unique impact being underestimated.
DCMS selection criteria favour cities with existing cultural infrastructure, which may introduce selection bias and endogeneity with cultural outcomes of interest to the evaluation.
Broader regional and national trends can influence all cities, requiring additional statistical controls to ensure accurate measurement of programme effects.

In the design of this study, alternative counterfactual approaches were considered. One option was to compare UK CoC cities to other event winners, such as designated European Capitals of Culture. These comparisons could provide insight into the relative impact of winning a designation. However, differences in event scale, funding structures, and selection criteria limit their comparability. Propensity Score Matching (PSM) and synthetic control methods were also potential approaches. PSM could match cities based on socio-economic characteristics (as shown in the case study for the Manchester International Festival), while synthetic control would construct a weighted combination of non-UK CoC cities to estimate what would have happened in the absence of the programme. While these methods enhance robustness, they are data-intensive and require strong assumptions about comparability, which may not fully capture the unique cultural and local policy dynamics of UK CoC titleholders.

Spatial analysis complements the selected counterfactual approach of using shortlisted cities by addressing some of its limitations. By examining economic, social, and cultural trends across different spatial scales, we can assess whether observed changes are unique to UK CoC titleholders or part of broader regional trends in the areas adjacent to titleholders. This is particularly useful for identifying spillover effects and detecting other external influences, such as concurrent policy shifts or economic fluctuations. Additionally, tracking trends over time helps determine whether these effects are temporary or persist beyond the programme’s duration, strengthening the overall evaluation framework.

4.4.3 Study Design

To improve methodological rigour, statistical significance was evaluated using a DiD framework with region and time fixed effects. Ideally, a fixed-effects model at the individual level would allow for more precise control over unobserved heterogeneity, but this was not feasible due to the nature of the data used. Unlike studies that leverage panel datasets such as Understanding Society (used in the case of the London 2012 Games), this study relies primarily on national surveys with repeated cross-sections rather than tracking of the same individuals over time. As a result, individual fixed effects could not be applied, and the analysis instead incorporated region- or industry-level fixed effects. This limitation reduces statistical power and increases the risk of omitted variable bias, meaning that unobserved individual-level characteristics – such as pre-existing engagement with cultural initiatives – cannot be fully accounted for in the estimates. The smaller sample size compared to studies using panel data also makes it more challenging to detect statistically significant effects, particularly for subgroup analyses.
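
As an illustration of what a DiD estimate with region and time fixed effects involves, the sketch below applies the standard within (double-demeaning) transformation to a balanced region-by-year panel. It is a simplified example under stated assumptions rather than the model estimated in this study: the panel is simulated with a known effect of 5.0, the region labels are hypothetical, the helper name `twfe_did` is introduced for the example, and no standard errors are computed.

```python
from statistics import mean


def twfe_did(panel):
    """Two-way fixed effects DiD coefficient for a balanced panel.

    panel maps (region, year) -> (outcome, treated_flag).
    Region and year effects are removed by double-demeaning, then the
    demeaned outcome is regressed on the demeaned treatment flag.
    """
    regions = sorted({r for r, _ in panel})
    years = sorted({t for _, t in panel})

    def demean(idx):
        vals = {k: v[idx] for k, v in panel.items()}
        overall = mean(vals.values())
        r_mean = {r: mean(vals[(r, t)] for t in years) for r in regions}
        t_mean = {t: mean(vals[(r, t)] for r in regions) for t in years}
        return {
            (r, t): vals[(r, t)] - r_mean[r] - t_mean[t] + overall
            for r in regions
            for t in years
        }

    y_tilde = demean(0)
    d_tilde = demean(1)
    numerator = sum(d_tilde[k] * y_tilde[k] for k in panel)
    denominator = sum(d_tilde[k] ** 2 for k in panel)
    return numerator / denominator


# Simulated, noise-free panel (hypothetical values, not real data).
region_effect = {"Host": 50.0, "Shortlisted A": 40.0, "Shortlisted B": 60.0}
year_effect = {2014: 0.0, 2015: 1.0, 2016: 2.5, 2017: 3.0, 2018: 2.0, 2019: 4.0}
true_effect = 5.0

panel = {}
for region, a in region_effect.items():
    for year, g in year_effect.items():
        treated = 1.0 if (region == "Host" and year >= 2017) else 0.0
        panel[(region, year)] = (a + g + true_effect * treated, treated)

print(f"Estimated effect: {twfe_did(panel):.2f}")
```

Because the simulated data contain no noise, the estimate recovers the assumed effect. With real data, unobserved individual-level characteristics of the kind discussed above would remain outside what region and time effects can absorb.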

To strengthen the robustness of the findings, additional validation procedures were implemented:

Parallel Trends Assessment: Pre-treatment trends in business counts and employment levels across treated and control regions were compared to verify the parallel trends assumption.
Event Study Analysis: A generalised DiD approach was used to estimate dynamic treatment effects across multiple time periods, to assess whether programme impacts emerge gradually or exhibit pre-treatment anticipation effects. This approach provides a more granular understanding of how policy interventions influence employment and business formation over time.
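
The two validation procedures above can be illustrated together. The sketch below, which assumes a single host city, a common treatment year and hypothetical outcome values, computes the gap between the host and the average control city in each year, relative to the year before treatment. The function name `event_study_gaps` is introduced for the example and is not taken from the study.

```python
from statistics import mean

# Hypothetical yearly outcomes (not real data).
outcomes = {
    "Host": {2014: 100.0, 2015: 102.0, 2016: 104.0, 2017: 110.0, 2018: 113.0},
    "Shortlisted A": {2014: 90.0, 2015: 92.0, 2016: 94.0, 2017: 96.0, 2018: 98.0},
    "Shortlisted B": {2014: 110.0, 2015: 112.0, 2016: 114.0, 2017: 116.0, 2018: 118.0},
}


def event_study_gaps(data, treated, controls, treatment_year):
    """Treated-minus-control gap per year, relative to the year before treatment.

    Keys are years relative to treatment (negative values are pre-treatment).
    """
    reference_year = treatment_year - 1

    def gap(year):
        return data[treated][year] - mean(data[c][year] for c in controls)

    return {
        year - treatment_year: gap(year) - gap(reference_year)
        for year in sorted(data[treated])
    }


gaps = event_study_gaps(outcomes, "Host", ["Shortlisted A", "Shortlisted B"], 2017)
for relative_year, value in gaps.items():
    print(relative_year, round(value, 2))
```

Relative gaps close to zero in the pre-treatment years are consistent with parallel trends, while non-zero pre-treatment gaps may point to anticipation effects or an unsuitable control group. The post-treatment gaps trace how the estimated effect evolves over time.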

For this study, trends analysis was used to examine the patterns and trajectories in key indicators over time to identify underlying shifts and long-term effects. Alongside the DiD approach, it helps distinguish the UK CoC programme’s impact from broader economic and cultural trends by comparing pre- and post-intervention trajectories across host and control cities.

To capture indirect effects, the study examines through spatial analysis how the UK CoC title influences neighbouring regions, tracking shifts in employment, cultural participation, and tourism.

Unlike prior evaluation studies of UK CoC which have focused on short-term impacts, this study leverages national datasets to provide a comprehensive, long-term view. Key data sources include:

Public funding allocations to arts and culture, examining investment patterns.
Wellbeing metrics to assess community cohesion and quality of life.
Economic indicators such as employment growth, business formation, and tourism sustainability.

By integrating the use of DiD with region and time fixed effects and spatial analysis, this study not only isolates direct programme effects but also uncovers regional spillovers often overlooked in previous evaluations.

Staggered Difference in Difference

Looking ahead, future evaluations for the UK CoC or similar competitively decided event series could benefit from employing a staggered Difference-in-Differences (DiD) methodology. This approach, which estimates treatment effects when similar interventions are introduced across different units at varying time periods, could be well-suited to the structure and nature of the UK CoC programme, provided the data and design constraints identified in the learnings from these case studies are sufficiently addressed.
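
To make the mechanics of a staggered design concrete, the sketch below estimates a separate effect for each treated city by comparing its before-and-after change with the average change among never-treated cities over the same years. It is one simple variant chosen for illustration, not the estimator used to produce the results reported in this section. The city names, treatment years, outcome values and the function `cohort_effects` are all hypothetical.

```python
from statistics import mean


def cohort_effects(outcomes, treatment_year):
    """Per-city DiD effects using never-treated cities as controls.

    outcomes maps city -> {year: value}.
    treatment_year maps city -> year of treatment, or None if never treated.
    """
    controls = [c for c, g in treatment_year.items() if g is None]
    effects = {}
    for city, g in treatment_year.items():
        if g is None:
            continue
        years = sorted(outcomes[city])
        pre = [t for t in years if t < g]
        post = [t for t in years if t >= g]

        def change(c):
            # Average post-period outcome minus average pre-period outcome.
            return mean(outcomes[c][t] for t in post) - mean(outcomes[c][t] for t in pre)

        effects[city] = change(city) - mean(change(c) for c in controls)
    return effects


# Simulated data: a common upward trend plus an assumed effect after treatment.
base = {"City X": 20.0, "City Y": 30.0, "Control P": 25.0, "Control Q": 35.0}
treatment_year = {"City X": 2014, "City Y": 2017, "Control P": None, "Control Q": None}
assumed_effect = {"City X": 3.0, "City Y": 6.0}

outcomes = {}
for city, level in base.items():
    g = treatment_year[city]
    outcomes[city] = {
        t: level + (t - 2012) + (assumed_effect[city] if g is not None and t >= g else 0.0)
        for t in range(2012, 2020)
    }

effects = cohort_effects(outcomes, treatment_year)
print({city: round(value, 2) for city, value in effects.items()})
```

Using only never-treated cities as controls avoids comparing a newly treated city with one that is already treated, which can distort estimates when effects differ across cohorts. In practice, shortlisted cities that later win the title would need to be handled carefully in such a design.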

In the context of this case study, the application of a staggered DiD approach does not offer additional analytical value. Several limitations hinder its effectiveness. Firstly, the publicly available data used in this study are restricted in both granularity and completeness across the lifecycle of each UK CoC. This limits the capacity to robustly define pre- and post-treatment periods for comparative analysis. Secondly, determining a consistent treatment period is challenging. Each UK CoC titleholder is distinct, and with no blueprint for designing and delivering a UK CoC, the timing and nature of interventions are not consistent across titleholders. Without detailed knowledge of a titleholder’s programme, it becomes difficult to identify a precise treatment period. While it may be possible to test whether the point of designation, the title year, or an intermediate point is the most appropriate point of intervention, this would likely vary depending on the outcome variable being assessed and may well differ across titleholders.

Additionally, considerable heterogeneity exists in programme design among titleholders. While this case study synthesises headline outcomes based on historic UK CoC objectives and bidding guidance, the actual interventions differ significantly from one city to another. As a result, even if a common treatment period could be identified, the nature of the treatment itself is not consistent across titleholders. Further, presently with only three titleholders having fully concluded their UK CoC year, the number of available observations is insufficient to provide meaningful statistical power for a DiD analysis.

Despite these challenges, a staggered DiD approach holds promise for future evaluations. If a more stable and comprehensive dataset can be developed – one that includes clearly defined treatment periods and consistently tracked outcome measures – then estimating effects across multiple UK CoCs using this method would be both feasible and informative. In such cases, either the announcement year or the UK CoC year itself could reasonably serve as the treatment period, though careful consideration would still be needed given the variability in programme design.

The following is an example of how a staggered DiD approach could be used to assess the impact the title designation has on host cities in relation to the number of creative and cultural businesses. It therefore includes Coventry and Bradford to illustrate what is potentially possible.

Using this methodology, the estimated treatment effects are positive for the three cities that have held the title to date, although they vary in size and statistical significance. The treatment effect for the UK CoC 2013 was estimated at approximately 1.94 creative and cultural businesses. However, this result was not statistically significant, with a p-value greater than 0.05, meaning that an effect of the designation cannot be distinguished from zero. In contrast, data for the UK CoC 2017 showed a significant and substantial impact. The estimated treatment effect was 13.33 businesses, with a p-value of less than 0.001. The treatment effect for Coventry, which held the UK CoC 2021 title, was 5.83 businesses. This result was statistically significant (p=0.02), indicating a positive but smaller impact compared to Kingston upon Hull. The 2021 designation led to a moderate increase in the number of creative and cultural businesses in Coventry.

Results of the staggered DiD highlight a varying impact of the title designation on creative and cultural businesses across different years.

The most significant effects were observed for the UK CoC 2017, with a positive and large increase in business counts.
The 2021 designation for Coventry also showed a positive effect, albeit smaller in magnitude.
In contrast, the UK CoC 2013 did not show a statistically significant change, and no effect has been observed for the UK CoC 2025 based on the data available.

Looking into the data for each city individually, it can be argued that the designation of the UK CoC helped Derry/Londonderry and Hull navigate the effects of COVID-19: the impact on the count of creative and cultural businesses was much smaller there than in control cities and in Coventry, which was at the planning stage of implementation when the pandemic hit.
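The logic of estimating a separate effect for each titleholder, each with its own treatment year, can be sketched in a few lines. The example below is a simplified illustration only: it compares each hypothetical titleholder with never-treated control cities using a basic two-period DiD, it uses invented business counts rather than the study's data, and it does not compute p-values. The function name `cohort_did` and all city labels are assumptions made for the illustration, and dedicated staggered-adoption estimators are more involved than this.

```python
from statistics import mean

def cohort_did(panel, treated_city, control_cities, title_year):
    """2x2 DiD for one titleholder against never-treated control cities.

    panel maps city -> {year: outcome}; years >= title_year count as post.
    """
    def period_mean(cities, post):
        return mean(
            value
            for city in cities
            for year, value in panel[city].items()
            if (year >= title_year) == post
        )

    treated_change = period_mean([treated_city], True) - period_mean([treated_city], False)
    control_change = period_mean(control_cities, True) - period_mean(control_cities, False)
    return treated_change - control_change

# Hypothetical business counts with a shared upward trend (NOT the study's data).
years = range(2010, 2024)
panel = {
    "control_a": {y: 100 + 2 * (y - 2010) for y in years},
    "control_b": {y: 80 + 2 * (y - 2010) for y in years},
    # Built-in jumps of 5 and 12 businesses from each city's own title year.
    "city_2013": {y: 90 + 2 * (y - 2010) + (5 if y >= 2013 else 0) for y in years},
    "city_2017": {y: 70 + 2 * (y - 2010) + (12 if y >= 2017 else 0) for y in years},
}
title_years = {"city_2013": 2013, "city_2017": 2017}

effects = {
    city: cohort_did(panel, city, ["control_a", "control_b"], year)
    for city, year in title_years.items()
}
for city, effect in effects.items():
    print(city, effect)
```

Because the invented data share a common trend, the sketch recovers the jumps that were built in for each city. With real data, the choice of treatment year (designation or title year) would need to be justified for each outcome, as discussed above.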

4.4.4 Robustness checks

Robustness checks are used to test the appropriateness of the counterfactual and any violation of difference-in-difference assumptions in the pre-intervention trends. The results of the tests show that the robustness of the findings should be questioned for some variables.

Table 4.5: Overview of robustness checks

Variable of interest	Parallel trends	Check on robustness of counterfactual choice	Any other checks for robustness
Tourism related jobs	The pre-2017 period shows that while Hull’s trend is relatively flat, the control group fluctuates more, particularly with a sharp rise in 2016, suggesting the parallel trends assumption may not fully hold.	Control group composed of shortlisted cities; baseline variation reduces confidence in comparative validity.	Spillover effects assessed in wider region; no clear pattern emerges.
Public funding for arts and culture	A visual check reveals that pre-2017 trends in arts and culture funding diverge sharply, with Hull experiencing a significant funding spike in 2016, suggesting the parallel trends assumption may not be fully satisfied.	Control cities showed steady trends, strengthening attribution of observed peak in relation to the intervention.	Additional funding analysis showed longer-term stabilisation in certain areas such as organisational funding over project funding, however this is anecdotal.
Wellbeing	Trends were parallel prior to intervention; no substantial change observed post-event.	Comparative areas followed similar trajectories; supports reliability of counterfactual choice.	Broader external trends acknowledged as driving observed changes.
Cultural participation	Pre-intervention growth followed by decline post-event; visual trends consistent across regions, suggesting potential deviation from the parallel trends assumption.	Use of nearby and shortlisted cities as controls supports robustness of findings.	NA

4.5 Findings of the application of a multi-method approach to evaluating legacy

As the outcomes sought by the UK CoC are wide and far-reaching, this study focuses on the outcomes summarised in the table below, with further detail on each outcome in the sections that follow.

Table 4.6: Overview of outcome results by indicator for UK City of Culture programme

Outcomes	Successfully measured?	Significant results?	Outcome achieved and persistence of outcomes
Tourism jobs	SMS Level 2 - Trends analysis and DiD	No	DiD for Hull shows a positive increase that is not statistically significant. The trends analysis nonetheless shows positive increases in tourism-related jobs, which could in part be related to the UK CoC title.
Creative and Cultural sector jobs (Hull only)	SMS Level 2 - Trends analysis and DiD	No	The UK CoC generates short-term jobs, however many jobs are temporary for the delivery of the year.
Economic impacts (turnover and employment)	SMS Level 2 - DiD (PSM)	Derry - Yes Hull – No	While Derry shows positive impact, the influence of COVID-19 has impacted the data for Hull.
Increase in public funding for arts and culture (Hull only)	SMS Level 2 - Trends analysis and DiD	No	Evidence demonstrates that the UK CoC title does increase public funding for titleholders. The funding is not sustained, although anecdotal evidence suggests organisational funding increases in the long term.
Wellbeing (Hull only)	SMS Level 2 - Trends analysis and DiD	No	Short-term impact but not sustained after event.

As mentioned previously, environmental outcomes have been emerging within the context of the UK CoC programme, but at present there are not appropriate data available to measure any legacy environmental impacts from host cities featured in this study.

4.5.1 Increase in tourism

The UK CoC programme has been linked to increased tourism in host cities, as evidenced by evaluation reports from host cities. However, measuring tourism impact extends beyond simple visitor counts. Job creation can be used as an indicator of growth in the tourism sector.

Tourism jobs (UK CoC 2013 and 2017)

Looking at jobs in the tourism sector (as defined by DCMS SIC codes), there has been sustained growth in jobs across all of Northern Ireland since 2011.

Figure 4.4: Tourism Jobs in UK CoC 2017 Shortlisted Cities 2015 to 2023 including Difference-in-Difference Analysis, Source: Business Register and Employment Survey.

Descriptive trends pre-UK CoC: Before the UK CoC 2017 designation, Hull exhibited a statistically significant baseline difference in tourism-related jobs compared to counterfactual cities (p=0.004), with a difference of 2,594 jobs. This may reflect pre-UK CoC 2017 venue closures for refurbishment, positioning Hull for broader exhibition offerings post-event.^{[footnote 24]} The study focuses on relative changes rather than absolute levels, mitigating concerns about baseline differences.

Descriptive trends post-UK CoC: Following the UK CoC 2017 intervention, Hull’s treatment area showed a growth of 578 jobs. However, due to the aggregated nature of the data and the low number of observations (n=36), this effect was not statistically significant (p=0.526).^{[footnote 25]} Similarly, in the shortlisted control cities, job growth of 102.62 was observed, but the coefficient remained small and statistically insignificant (p=0.909).

Causal impact of the UK CoC: The statistical analysis aimed to identify any spillover or ripple effects (i.e., events at one location can spread and influence other locations) from the UK CoC 2017 beyond Hull, specifically in the wider Humber region. While counterfactual regions such as East Riding of Yorkshire and York experienced increases in tourism-related job values after 2017, the interaction term (of 578.45) in the DiD model did not show statistical significance. This suggests that the observed changes in these areas cannot be confidently attributed as spillover effects from Hull’s growth.

4.5.2 Increased funding

Public funding for arts and culture (UK CoC 2017)

Looking in more detail at the UK CoC 2017 against shortlisted cities, it is evident that Hull received a strong spike in arts funding in 2016, the year before they were the titleholder.

Figure 4.5: Arts Funding for UK CoC 2017 Shortlisted Cities 2014 to 2020. Sources of aggregated funding data: UK Arts Councils, Ministry of Housing, Communities and Local Government and National Lottery Funders.

Descriptive trends pre-UK CoC: Analysis of the impact of the UK CoC 2017 designation on public funding for arts and culture reveals distinct trends. Before 2017, Hull demonstrated a steady increase in funding, with an average value of £8.93 million in the pre-2017 period. This was significantly higher than the control group of other shortlisted cities (Swansea, Dundee, and Leicester), which showed a relatively stable pre-intervention average of £4.19 million. This suggests that certain benefits, such as increased levels of funding, can materialise before the official UK CoC year.

Descriptive trends post-UK CoC: In 2017, arts and culture funding for Hull peaked, aligning with expectations that short-term funding and investment would be required for the delivery and hosting of a large-scale cultural event. The control group did not experience a comparable increase, indicating that Hull’s rise was uniquely driven by the UK CoC 2017 designation. However, post-2017, arts and culture funding in Hull declined sharply. The average value fell to £6.38 million, marking a significant reduction from pre-2017 levels. In contrast, the control group exhibited modest growth, with their average increasing to £4.91 million during the same period, possibly reflecting greater stability in the absence of the title designation.

Year-on-year analysis provides further insights. Hull experienced significant gains in arts and culture funding leading up to 2017 but faced a dramatic decline in subsequent years, falling below baseline levels. In contrast, the control group’s funding remained relatively stable throughout the analysis period. One notable exception occurred in 2017, where a spike in funding for the control group was traced to a substantial injection of funding into Dundee City for the construction of the V&A Dundee.

Causal impact of the UK CoC: DiD analysis quantifies this relative change from pre- to post-UK CoC 2017, revealing a negative effect of -£3,270,148.66.^{[footnote 26]} This result suggests that, after adjusting for baseline trends in the control group, Hull experienced a substantial relative decline in arts and culture funding post-2017. The findings indicate that the increased funding benefits of the UK CoC designation were largely temporary, with little evidence of sustained improvement.
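The arithmetic behind this estimate can be reproduced from the period averages quoted above. The sketch below is a minimal illustration using those rounded averages (in £ million). The function name `did_estimate` is illustrative rather than taken from the study, and because the inputs are rounded the result only approximates the reported figure.

```python
def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Two-period difference-in-differences from four group means."""
    treated_change = treated_post - treated_pre
    control_change = control_post - control_pre
    return treated_change - control_change

# Rounded period averages reported in the text (GBP million).
hull_pre, hull_post = 8.93, 6.38
control_pre, control_post = 4.19, 4.91

effect = did_estimate(hull_pre, hull_post, control_pre, control_post)
print(f"DiD estimate: {effect:.2f} million GBP")
```

Hull's average fell by about £2.55 million while the control group's rose by about £0.72 million, so the relative change is about -£3.27 million, in line with the estimate reported above.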

From a legacy perspective, anecdotal evidence from evaluations of similar cultural initiatives, such as the London Borough of Culture, suggests that a sharp reduction in public funding post-year presents challenges in sustaining legacy activities. For the London Borough of Culture 2023, Croydon established a legacy fund to maintain momentum and create lasting impacts.

While public funding has seen an overall decline, it can be argued that the designation of Hull as the UK CoC 2017 has contributed to greater funding stability. Arts Council England’s annual investment in Hull through National Portfolio Organisations (NPOs) was £2.2 million across five NPOs between 2018 and 2022, with three of these first receiving funding in 2015 in preparation for the UK CoC year. In the most recent NPO funding round for 2023-2026, this investment increased by 35.7% to £3 million annually, with the funding now supporting eight NPOs, reflecting a growing and more resilient cultural sector.

Newly funded NPOs include Absolutely Cultured, the legacy organisation from Hull’s UK CoC year, and Back to Ours, a former Creative People and Places project known for its community engagement during 2017. Previously these organisations existed on project-based funding. Their transition to long-term organisational funding suggests a shift toward a more sustainable cultural landscape through improved organisational stability. However, the impact of private investment remains unclear due to limited available data.

4.5.3 Improved wellbeing scores

Wellbeing data for NI at the right geographic level are only available from 2014/15 onwards. As this is post Derry/Londonderry’s year as UK CoC 2013, it is therefore not possible to see if there were changes during the year. However, using the Annual Population Survey, it is possible to look at data for Hull and surrounding areas.

Population wellbeing (UK CoC 2017)

Descriptive trends pre-UK CoC: A DiD approach^{[footnote 27]} was used to investigate the causal effect of the UK CoC 2017 in Hull on life satisfaction, taking 2013/14 as the pre-treatment period and 2019/20 as the post-treatment period and using Leicester (a shortlisted city for the 2017 title) as a control group.^{[footnote 28]} Life satisfaction in Hull and the surrounding areas followed a gradual upward trend before the UK CoC 2017 year. Between 2011/12 and 2016/17, Hull’s life satisfaction scores increased from 7.42 to 7.56, closely aligning with Leicester (7.45 to 7.51) and the England average (7.41 to 7.68). This supports the parallel trends assumption when comparing Hull to control areas.

Additionally, the neighbouring areas examined for ripple effects, such as the East Riding of Yorkshire and North Lincolnshire, exhibited higher baseline life satisfaction than Hull, with the East Riding of Yorkshire increasing from 7.93 in 2016/17 and North Lincolnshire rising from 7.80 in 2016/17, showing no significant deviations from general trends before 2017.

Descriptive trends post-UK CoC: Life satisfaction in Hull peaked in 2017/18 during the UK CoC 2017 year, reaching 7.64, ahead of the England average, which peaked a year later in 2018. However, after this initial rise, Hull’s life satisfaction declined sharply, reaching 7.14 by 2020/21, likely influenced by external factors such as the COVID-19 pandemic. A partial recovery to 7.38 was observed in 2022/23.

Control areas experienced similar post-2017 patterns, with Leicester’s life satisfaction falling from 7.51 in 2016/17 to 7.41 in 2020/21, and the England average dropping from 7.68 in 2016/17 to 7.38 in 2020/21. Neighbouring areas like the East Riding of Yorkshire and North Lincolnshire also saw declines, reinforcing that changes were not unique to Hull.

Figure 4.6: Life Satisfaction in Spatial Analysis and Control Areas 2011/12 to 2022/23, Source: Annual Population Survey.

Causal impact of the UK CoC: Through using DiD analysis, it was identified that there was:

No significant causal effect of CoC on wellbeing – small DiD estimates (-0.03 with Leicester and 0.09 with the England average) were negligible and statistically insignificant, suggesting that the UK CoC 2017 title did not have a measurable impact on life satisfaction.
No evidence of wellbeing spillover effects from the CoC – neighbouring areas experienced life satisfaction declines similar to or larger than Hull’s, indicating that any observed changes were driven by broader external factors rather than the UK CoC 2017 itself.
Life satisfaction trends in Hull mirrored national patterns – the comparison between Hull, Leicester, and the England average indicates that Hull’s post-2017 decline in life satisfaction was not significantly different from the control areas, further suggesting that broader external factors were at play.

Overall, while Hull experienced a temporary peak in life satisfaction during its UK CoC year, the event did not produce a lasting or statistically significant impact on wellbeing, with observed trends largely reflecting wider regional and national patterns.

As shown in other case studies in this report, wellbeing effects typically are not sustained post event due to the influence of external factors. It should also be noted that a similar pattern was identified in the wellbeing evaluation for Eurovision 2023. However, the authors suggest that the peak in wellbeing immediately before the event is down to an anticipation/excitement effect, with wellbeing declining once the event has occurred.^{[footnote 29]}

4.5.4 Increase in participation

Data on cultural participation for the UK CoC 2013 are not available at a granular enough level for analysis. However, taking the UK CoC 2017 title and neighbouring areas across 2015 to 2019 (to avoid COVID-19 contamination) and undertaking spatial analysis and DiD approaches, it can be argued that the UK CoC 2017 had a measurable impact on cultural participation in Hull and surrounding regions, most notably in the build-up period and the year itself.

Initially, most areas experienced growth in participation, peaking during the event in 2016/2017.

Overall participation (UK CoC 2017)

Descriptive trends pre-UK CoC: Before the UK CoC designation, cultural participation rates showed steady growth in key regions. The East Riding of Yorkshire experienced an increase from 61.5% in 2015 to 74.3% in 2016, before stabilising at 70.9% in 2017. Hull, the focal point of the UK CoC 2017, saw participation climb from 66.4% in 2015 to 72.3% in 2017. Notably, York consistently recorded the highest participation rates, maintaining 79.5% in 2017 – linked to its well-established cultural infrastructure, which was highlighted in Hull’s UK CoC 2017 bid.

Descriptive trends post-UK CoC: Following 2017, participation rates declined across most regions. Hull experienced a sharp drop from 72.3% in 2017 to 64.7% in 2019 – a steeper decline than in surrounding areas. The East Riding of Yorkshire saw participation fall from 70.9% to 66.8%, while York’s participation slightly decreased from 79.5% to 79.0%. Nationally, the trend mirrored these shifts, with England’s average peaking at 70.2% in 2016 before declining to 65.1% by 2019.

Causal impact of the UK CoC: Between 2015 and 2017, Hull’s participation increased by 5.9 percentage points, significantly outperforming the 1.8 percentage point rise in the control group of other shortlisted cities. This growth aligns with the UK CoC’s extensive cultural programming prior to and in the UK CoC year, which created a surge in engagement opportunities.

However, after the UK CoC delivery period ended, participation rates fell. Between 2017 and 2019, Hull saw a 7.5 percentage point decline, compared to a 2.2 percentage point drop in the control group. The resulting net DiD swing estimate of -9.4 percentage points suggests that while the UK CoC initially boosted participation, these gains were not sustained, leading to a sharper decline than in comparable regions once the treatment period had concluded.^{[footnote 30]}
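One way to arrive at the net swing is to compare Hull's change relative to the control group in the build-up period with its relative change in the legacy period. The sketch below works through that arithmetic with the changes reported above. It is an interpretation that reproduces the reported figure, not a confirmed description of the study's calculation, and the function name `relative_change` is illustrative.

```python
def relative_change(treated_change, control_change):
    """Change in the treated area over and above the control group."""
    return treated_change - control_change

# Changes in participation reported in the text (percentage points).
build_up = relative_change(5.9, 1.8)    # 2015 to 2017: Hull rose faster
legacy = relative_change(-7.5, -2.2)    # 2017 to 2019: Hull fell faster

# Net swing: how far Hull's relative position moved between the two periods.
swing = legacy - build_up
print(round(build_up, 1), round(legacy, 1), round(swing, 1))
```

On this reading, Hull gained 4.1 points on the control group before and during the title year, then lost 5.3 points relative to it afterwards, a swing of -9.4 points.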

This pattern raises critical questions about the sustainability of cultural engagement following major public funding initiatives. The decline in participation post-2017 indicates that levels fell below their pre-CoC baseline, underscoring the need for strategic planning to maximise long-term legacy impacts.

Figure 4.7: Overall Participation Levels (Attending a Creative Event, Taking Part in a Creative Activity, Attending a Museum or Gallery), Source: Active Lives.

Conclusions from findings

The findings of this study, which applied DiD, trends, and spatial methodologies across a limited and non-exhaustive list of outcomes to evaluate City of Culture programmes’ legacies, underscore a mixed yet impactful legacy of the UK CoC programme, with both immediate and long-term effects.

While host cities experienced significant short-term boosts in tourism, cultural engagement, and public funding, sustaining these benefits over time into a legacy period proved challenging. The most enduring impacts were observed in employment and business growth in specific sectors, such as Derry/Londonderry’s sustained economic uplift post-UK CoC 2013. However, in other cases, such as Hull’s creative and cultural sector jobs, the programme did not yield long-term gains. Additionally, while cultural participation, wellbeing, and arts funding saw notable increases leading up to and during the UK CoC years, many of these benefits declined once the programme concluded, highlighting the difficulty in maintaining engagement and investment beyond the initial intervention. Nevertheless, the evidence suggests that the UK CoC acted as a catalyst, particularly in shaping cultural infrastructure and funding models, encouraging a shift towards more stable, long-term investment rather than reliance on short-term project-based funding.

The methodological framework employed in this study provides a useful example for evaluating the legacy of cultural interventions. By integrating DiD analysis with trends analysis and spatial analysis, this study seeks to isolate programme-induced changes from broader economic and policy trends. The use of control groups – in this case shortlisted but unsuccessful cities – is a sensible choice, despite some limitations such as selection bias and the potential for pre-existing differences. Looking at neighbouring areas through spatial analysis allows for an examination of spillover effects beyond the immediate intervention and treatment areas, revealing whether benefits extend to surrounding regions. These methodological approaches, particularly the application of DiD in a cultural policy context, could provide a replicable framework for future evaluations of major events where randomised controlled trials are not feasible.

Methodologies used in this study can be replicated for other major event evaluations, including cultural programmes, sporting events, and place-based interventions. The integration of longitudinal national datasets, counterfactual comparisons, and spatial analysis can help policymakers and stakeholders assess whether interventions produce sustained economic, social, and cultural benefits. In particular, the combination of DiD with region and time fixed effects allows for a more precise estimation of an event’s true impact, separating its effects from broader macroeconomic trends.
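To illustrate what region and time fixed effects contribute, the sketch below estimates a single DiD coefficient on a small balanced panel by removing region averages and year averages from both the outcome and the treatment indicator before computing a least-squares slope. Everything in it is an assumption made for illustration: the data are invented (with a treatment effect of 4.0 built in), the function name `twfe_effect` is hypothetical, and the study's own specification may differ.

```python
from statistics import mean

def twfe_effect(outcome, treated):
    """Two-way fixed effects DiD coefficient on a balanced panel.

    outcome and treated are dicts keyed by (region, year); treated holds 0/1.
    Double-demeaning removes region and year fixed effects before the
    single-regressor least-squares slope is computed.
    """
    regions = sorted({r for r, _ in outcome})
    years = sorted({t for _, t in outcome})

    def double_demean(x):
        grand = mean(x.values())
        by_region = {r: mean(x[r, t] for t in years) for r in regions}
        by_year = {t: mean(x[r, t] for r in regions) for t in years}
        return {
            (r, t): x[r, t] - by_region[r] - by_year[t] + grand
            for r in regions
            for t in years
        }

    y, d = double_demean(outcome), double_demean(treated)
    return sum(y[k] * d[k] for k in d) / sum(d[k] ** 2 for k in d)

# Hypothetical panel (NOT the study's data): region levels, a common year
# path with a 2020 shock, and a built-in effect of 4.0 from 2017 onwards.
region_level = {"treated_city": 50.0, "control_a": 40.0, "control_b": 60.0}
year_path = {t: 3.0 * (t - 2014) - (10.0 if t == 2020 else 0.0) for t in range(2014, 2023)}

treated = {
    (r, t): int(r == "treated_city" and t >= 2017)
    for r in region_level
    for t in year_path
}
outcome = {
    (r, t): region_level[r] + year_path[t] + 4.0 * treated[r, t]
    for r in region_level
    for t in year_path
}

estimate = twfe_effect(outcome, treated)
print(round(estimate, 6))
```

The invented 2020 shock affects every region equally, so the year effects absorb it and the built-in effect is recovered. This is the sense in which fixed effects separate an event's impact from common macroeconomic trends. A shock that hit the treated city differently would not be removed in this way.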

4.6 Robustness of findings

The robustness of the findings in this study is reinforced by the use of DiD methodology, trends analysis, and spatial analysis to isolate the impact of the UK CoC programme from broader economic and policy changes. However, the study also identifies key areas where methodological improvements could enhance the reliability and depth of future evaluations.

One significant limitation is the resolution of available spatial data. Higher-resolution and more granular datasets – such as Lower Super Output Area (LSOA) or Middle Super Output Area (MSOA) data – would allow for deeper analysis of localised effects and enable a more precise understanding of how the programme’s impact varies across different parts of a host city and its surrounding areas. This would also facilitate more accurate assessments of spillover effects into adjacent regions and potentially allow for the use of the radius approach used in the case study for the London 2012 Games. This case study has made use of data which is publicly available and would be easily accessible to evaluators of major events, but more detailed firm-level data is available through the ONS DataLab, which includes sources such as the Business Structure Database (BSD). Access to these datasets would allow for a more precise analysis of local business impacts, complementing the broader economic insights derived from publicly available sources (shown in this case study). If this more granular level of data is available, it is advisable to follow both the methodological approaches used here and those in the case study for the London 2012 Games, which is an exemplar use of this data.

The methodology could be strengthened through use of synthetic controls to complement and support the DiD approach. This technique could strengthen causal inference by constructing an optimally weighted combination of control cities that better match pre-treatment trends in treated cities, helping to address limitations related to the selection of control groups. This study captures the breadth of impact and is quantitative in nature. The integration of qualitative data, such as interviews, case studies, and ethnographic research, would provide deeper insights into lived experiences and the mechanisms behind observed statistical trends. A mixed-methods approach could help explain why some impacts, such as cultural participation, are short-lived and what barriers exist to sustaining them.
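The synthetic control idea mentioned above can be illustrated with a small sketch: find non-negative weights on the control cities, summing to one, so that their weighted pre-treatment series tracks the treated city as closely as possible. The data below are invented, the names `synthetic_weights`, `control_a`, `control_b` and `control_c` are hypothetical, and a coarse grid search stands in for the constrained optimiser that a full synthetic control implementation would use.

```python
from itertools import product

def synthetic_weights(treated_pre, controls_pre, step=0.05):
    """Grid-search weights (>= 0, summing to 1) that best match the pre-period.

    treated_pre is a list of pre-treatment outcomes for the treated city;
    controls_pre maps each control city to a list of the same length.
    """
    names = list(controls_pre)
    n_steps = round(1 / step)
    best_loss, best_weights = None, None
    # Enumerate integer grid points for all but the last weight; the last
    # weight takes whatever remains so the weights always sum to one.
    for combo in product(range(n_steps + 1), repeat=len(names) - 1):
        remainder = n_steps - sum(combo)
        if remainder < 0:
            continue
        weights = [c / n_steps for c in combo] + [remainder / n_steps]
        loss = 0.0
        for i, target in enumerate(treated_pre):
            synthetic = sum(w * controls_pre[n][i] for w, n in zip(weights, names))
            loss += (target - synthetic) ** 2
        if best_loss is None or loss < best_loss:
            best_loss, best_weights = loss, dict(zip(names, weights))
    return best_weights

# Hypothetical pre-treatment series (NOT the study's data). The treated
# series is constructed as 0.6 * control_a + 0.4 * control_b.
controls_pre = {
    "control_a": [10, 12, 14, 16],
    "control_b": [20, 19, 18, 17],
    "control_c": [5, 5, 6, 6],
}
treated_pre = [14.0, 14.8, 15.6, 16.4]

weights = synthetic_weights(treated_pre, controls_pre)
print(weights)
```

Applying the same weights to the control cities' post-treatment outcomes would give the synthetic counterfactual against which the treated city is compared. With only a handful of shortlisted cities available as donors, the quality of the pre-treatment match would need to be checked before relying on it.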

Another key area for strengthening the robustness of findings is undertaking an assessment of distributional effects – that is, understanding who is benefitting from the UK CoC programme. While aggregate outcomes are measured, the study does not yet disaggregate impacts by key demographic and socio-economic factors such as income level, ethnicity, or age. This would allow for a more equitable assessment of whether the programme benefits certain groups disproportionately and could help shape future cultural interventions to be more inclusive.

4.7 Learning

4.7.1 What did we learn about the methods for quantifying legacy and impact?

Baseline data collection: Over successive UK CoC evaluations, the quality and depth of baseline data have progressively improved, with cities such as Coventry and Bradford aligning their datasets more closely with national metrics. This practice allows for greater comparability with other regions, improving the ability to discern programme-specific impacts from broader trends. However, given the highly localised nature of the UK CoC, previous studies have often relied on locally gathered data, making national-level comparisons more difficult. Ensuring that future evaluations incorporate standardised national datasets, where feasible, would strengthen their analytical rigour and policy relevance.

Selection of counterfactual areas: The use of shortlisted but unsuccessful cities as a primary counterfactual group has been a widely applied approach. However, it is important to recognise that these cities still receive some of the intervention’s benefits – such as increased cultural engagement and investment – simply by participating in the bidding process. This can lead to underestimation of the programme’s true impact. To address this, synthetic control methods could complement the DiD framework, providing an alternative control group that better matches pre-treatment trends of host cities. Additionally, spatial analysis has emerged as a valuable tool in evaluating spillover effects, helping to map how far the programme’s influence extends beyond the immediate host city. Evidence suggests that primary audiences are drawn from both the host city and the surrounding region, making spatial analysis an essential component in understanding the geographical distribution of benefits.

Multi-phase data collection: Pre-event evaluations establish baseline conditions, while real-time monitoring during the title year captures immediate changes in event attendance, funding allocations, and media coverage. Short-term post-event assessments – typically conducted within one to two years – measure economic and cultural impacts, while long-term evaluations (5–10 years post-event) remain scarce but are essential for understanding sustained legacy effects. Expanding long-term monitoring would allow for a more comprehensive understanding of lasting benefits, including economic resilience, sustained cultural engagement, and community wellbeing.

Mixed-method approaches: Quantitative techniques, such as DiD analysis, spatial analysis, and national trends analysis, provide causal inference and identify broad patterns of change. However, these should be supplemented with qualitative methods – including interviews, case studies, and ethnographic research – to capture the lived experiences and deeper social transformations resulting from the programme. Additionally, a distributional analysis would help to assess who benefits most from the UK CoC programme, ensuring that future policies are designed to maximise inclusivity and equitable access to cultural opportunities.

4.7.2 What do the findings tell us about legacy and impact?

Findings presented here highlight the importance of using robust methodologies to accurately measure the legacy and impact of major cultural interventions such as the UK CoC programme. The application of the DiD methodology is particularly significant, as it allows for causal inference by comparing host cities with control cities that shared similar pre-treatment characteristics but did not receive the intervention. This approach helps to isolate the programme’s impact from broader economic and policy trends. The findings suggest that while short-term gains were evident for tourism, longer-term impacts were not identified. This raises critical questions about how cultural policy interventions can be structured to maximise enduring benefits.

By incorporating spatial analysis, the study provides a more nuanced understanding of how the UK CoC programme influenced not only host cities but also adjacent regions. This method helped identify whether spillover effects were present and how far they extended beyond the core intervention areas. Furthermore, the integration of longitudinal trends analysis using national datasets supported the distinction between temporary fluctuations and sustained legacy impacts. Ultimately, this study underscores the value of counterfactual-based approaches in evaluating legacy impacts and demonstrates how future cultural interventions can be assessed even with limited data availability.

Question 6

5. Manchester International Festival

Accepted Answer

This study employs a range of analytical methodologies to test how successfully they can be applied, using the Manchester International Festival (MIF) as a test case of a homegrown, biennial event rather than a competitively awarded mega-event. This study serves as an exploration of methodologies for cultural event impact measurement, contributing to a broader framework for assessing legacy effects in similar contexts including annual and biennial recurring festivals. At times, this case study uses data which is not the most comprehensive or granular available in order to test the extent to which insights can be drawn when sub-optimal data is available. It is therefore important that this analysis is not considered a comprehensive evaluation of impact.

5.1 Background and categorisation

5.1.1 Type of Major Event

MIF, produced and delivered by the organisation Factory International, is a biennial cultural festival operating with significant investment from Manchester City Council. Recognised as a ‘Major Event’ due to its economic and cultural impact on Manchester and the wider region, MIF has played a key role in enhancing Manchester’s global profile. Since its inception in 2007, the festival has supported local businesses, generated employment, and contributed to the city’s cultural infrastructure. The establishment of Aviva Studios, MIF’s new permanent home, underscores the festival’s long-term strategic importance and the role it plays in the cultural life of Manchester.

5.1.2 Focus of the event

MIF is dedicated to commissioning and presenting original, interdisciplinary work spanning visual arts, music, theatre, dance, literature, and digital media. It serves as a platform for innovation, collaboration, and artistic excellence, premiering new works that engage with international and local artists while addressing contemporary themes. Community participation is a central pillar of the programme, with workshops, interactive experiences, and educational initiatives fostering inclusivity and accessibility. As well as expanding participatory opportunities, MIF has created a volunteering programme which encourages engagement and volunteering with arts and culture.

5.1.3 Scale and geography

The festival traditionally took place across multiple venues in Manchester. MIF also utilises prominent city landmarks such as Manchester Central, Albert Square, and the Whitworth Art Gallery, in addition to unconventional spaces like warehouses, parks, and historic buildings. From the 2023 festival onwards, programming has begun to take place in the new purpose-built Aviva Studios, the need for which emerged from the logistical challenges of providing the necessary event infrastructure in found spaces (spaces not traditionally used for theatrical productions but adapted for that purpose). The festival’s impact extends beyond Manchester, with MIF-commissioned productions being staged internationally at venues such as New York’s Park Avenue Armory, Paris’s Festival d’Automne, and Melbourne Festival, reinforcing Manchester’s cultural export strategy.

Examples of the festival’s international links with venues where productions from MIF have become cultural exports include:

New York City’s Park Avenue Armory, where The Life and Death of Marina Abramović, directed by Robert Wilson and featuring Willem Dafoe, premiered after its MIF debut.
Paris’s Festival d’Automne, which hosted Tree of Codes, an innovative ballet and visual art piece created by Olafur Eliasson, Wayne McGregor, and Jamie xx.
Venues which were part of Italy’s Spoleto Festival, which showcased The Skriker, a play by Caryl Churchill that was reimagined for MIF by Maxine Peake and Sarah Frankcom.
Australia’s Melbourne Festival, where Björk’s Biophilia, an MIF-commissioned multimedia project combining music, science, and technology, was performed after its MIF debut.
Kuwait’s Sheikh Jaber Al-Ahmad Cultural Centre, where MIF’s Arabic-language production of Returning to Reims, adapted from Didier Eribon’s memoir, was staged.

Through these cultural exports, it can be argued that the festival seeks to project the soft power of Manchester and the United Kingdom.

5.1.4 Importance

MIF plays a pivotal role in positioning Manchester as a leading cultural city. By commissioning and debuting pioneering artistic works, the festival attracts global talent and audiences, enhancing Manchester’s reputation in the international arts sector. The festival also stimulates the local economy through increased tourism, job creation, and investment in cultural infrastructure. Additionally, MIF’s community engagement initiatives strengthen social cohesion and make the arts more accessible to diverse audiences.

5.1.5 Competitive process

MIF was established in 2007 as a legacy project following the 2002 Commonwealth Games cultural programme, Culture Shock. The festival was developed as a bespoke cultural event rather than through a competitive bidding process. The festival is also important in the positioning of Manchester as an ‘event city’.

Devised as a home-grown event, the initial objective of the festival was to create a unique cultural event that would showcase new, ground-breaking work across various art forms, including music, theatre, dance, visual arts, and digital media. The festival was conceived to celebrate Manchester’s rich cultural history while also positioning the city as a global leader in contemporary arts.

5.1.6 Duration

The festival typically runs across 18 days every two years in July/August. Each day features a full schedule of performances, exhibitions, and events across the city. The positioning of the festival in the summer months allows locals and international visitors to experience a wide variety of artistic programming during a period where arts and cultural venues are typically ‘dark’ for maintenance.

With the establishment of the Aviva Studios, MIF will evolve into year-round programming. However, the biennial festival format will not be lost and will remain a key feature in Manchester’s cultural calendar. This shift aims to expand MIF’s cultural impact and provide a continuous arts presence in Manchester.

At present, evaluations of MIF, including this study, draw on evaluation data and wider datasets on the basis that MIF is a biennial festival. The move towards year-round programming at the Aviva Studios changes the operational direction of Factory International and could complicate ongoing evaluations of long-term impacts.

5.1.7 Catalysts

MIF has been instrumental in Manchester’s evolution as a global cultural hub. The festival has driven significant developments, including the establishment of Aviva Studios, increased tourism, and enhanced international collaboration. Through its partnerships and commissioned works, MIF continues to expand Manchester’s influence in the arts sector globally.

5.1.8 Construction of infrastructure

Initially, MIF was designed to utilise existing and temporary venues. However, as the festival grew, the need for a dedicated space with advanced digital infrastructure became apparent. The expense involved with the installation of temporary infrastructure in found spaces became an increasing drain on festival resources over time. Therefore, Aviva Studios, developed with public and private investment, including funding from Arts Council England and Manchester City Council, addresses this requirement. The venue’s state-of-the-art facilities enhance MIF’s ability to host large-scale and technologically advanced productions. Early data suggests the new venue has had minimal displacement effects on other cultural institutions in Manchester.

5.2 Event objectives and legacy strategy

To evaluate MIF’s impact, it is essential to define its core objectives. Prior to its launch, a feasibility study identified four key strategic aims, which have guided the festival’s development:

To create an internationally renowned festival dedicated to commissioning innovative, original work.
To enhance Manchester’s status as a world-class cultural city, celebrating its artistic, scientific, and creative heritage.
To foster local talent and community engagement through participatory programming.
To contribute to the city’s economic sustainability and ensure a lasting cultural legacy.

These objectives continue to shape MIF’s direction, reinforcing its role as a transformative cultural institution. The festival’s Theory of Change reflects these aims, underpinning its strategic approach to legacy planning and impact evaluation.

5.2.1 Theory of Change

Using the objectives set out above we have established a Theory of Change for the Manchester International Festival.

Figure 5.1: Theory of Change for the Manchester International Festival

5.3 Choosing indicators and data sources

To assess progress toward the objectives of interest, we must first examine the available data and select indicators that effectively represent these objectives. In a standard evaluation, we recommend that evaluators begin by reviewing existing data, whether from administrative sources or secondary datasets, before addressing any gaps with primary data collection. However, for this case study, we are limited to secondary data, which constrains the scope of what can be measured compared to a typical evaluation that allows for primary data collection over a sufficient timeframe. The table below presents a comprehensive list of potential outcomes aligned with the event objectives of the Manchester International Festival, along with the corresponding indicators for measurement.

Table 5.1: Indicator long list for the Manchester International Festival and reasons for including or excluding from case study

Objective	Indicator	Reason for inclusion/ exclusion from analysis?
To create an international, ambitious and extraordinary festival, dedicated to commissioning new work from across the spectrum of creativity and human endeavour.	Cultural Participation (Active Lives)	For the majority of time across the festival’s lifetime, the cultural engagement questions commissioned by Arts Council England are a good indication of participation at local authority level. Therefore, analysis is included.
	Cultural Participation (Taking Part)	Taking Part ran from 2005 until 2020 and was at the time the flagship survey from the DCMS. Results are only able to be drawn down to a regional level and are therefore not granular enough for inclusion in this analysis.
	Funding Data (Arts Council England)	To compare public investment administered by Arts Council England into Manchester against counterfactual cities to see differences, therefore analysis is included.
	Total Expenditure on Arts, Tourism, & The Historic Environment Per Head (Ministry of Housing, Communities and Local Government)	As MIF is funded by the local authority these metrics in relation to income and expenditure are included in the analysis.
	Income From Arts, Tourism, & The Historic Environment Per Head (Ministry of Housing, Communities and Local Government)	As above.
To help secure Manchester’s reputation as a world class cultural city, celebrating its pivotal role in music, the arts, science, culture and innovation.	Funding Data (Arts Council England)	As above.
	Manchester Visitor Economy Data	To understand the impact of MIF on the visitor economy.
	Local Government Spend	Financial data from local authority returns to understand the cultural scene in Manchester and counterfactuals should be included in the analysis.
To welcome Manchester’s talent, resources and communities to take part in their City’s Festival, in extraordinary ways that reflect the Festival’s ambition.	MIF Evaluations	Using primary data collected by MIF and reported to Manchester City Council to understand depth and breadth of reach of the festival. The numbers involved are small and therefore do not show in national datasets but at a locally reported level are important to include.
To be a sustainable driver in the City’s economy, ensuring that there is a lasting legacy for the City.	Wellbeing Data (Annual Population Survey)	To understand if peaks in wellbeing coincide with festival years this is included in the analysis.
	Employment Counts (BRES)	Data in relation to DCMS SIC/SOC codes against counterfactuals is included in the analysis.
	UK Business Counts by Industry (ONS)	Interesting comparator against counterfactuals and is therefore included in the analysis.
	GVA by Industry (Local Authority Level, ONS)	Interesting comparator against counterfactuals therefore is included in the analysis.

5.4 Methodology

Unlike other case studies in this research, MIF is not a major event determined through a competitive bidding process, nor is it a one-off occurrence that rotates between locations. Instead, MIF is a locally developed home-grown initiative, inspired by Manchester’s hosting of the 2002 Commonwealth Games, which marked the city’s capacity for staging large-scale, international cultural events. The financial investment into MIF is significantly lower than that of mega-events like the London 2012 Games or Commonwealth Games, or indeed a UK City of Culture, necessitating a distinct approach to evaluating its legacy impacts (the average cost of delivering MIF between 2007 and 2015 was £10.7 million per festival).

This case study evaluates the localised impacts of MIF by employing a combination of Propensity Score Matching (PSM) to identify counterfactual areas, difference-in-differences (DiD), and trends analysis. Given MIF’s unique nature, as mentioned above, as a locally developed biennial arts festival rather than a major event secured through competitive bidding, this study focuses on isolating its impacts within the Manchester area as well as the Greater Manchester Combined Authority (GMCA) area.

The lack of a formal selection process of a host area as seen in other case studies, means that a standard evaluation approach, which typically compares successful and unsuccessful host locations, is not applicable in this instance. Instead, this study adopts a counterfactual approach using matched comparison cities, Leeds and Nottingham, to infer the impact of MIF on Manchester and the GMCA area.

It should be noted that when applying the DiD framework, a challenge emerges due to the absence of a definitive start date for MIF’s impact, as the festival has been established for a long time (since 2007). This has made it difficult to identify a clear pre-treatment and post-treatment period in available datasets, leading to inconsistencies in constructing a reliable time series. Additionally, variations in festival programming, frequency, and scale over time have further complicated efforts to establish a consistent treatment effect.

This study aims to identify the economic, cultural, and social effects of MIF by comparing outcomes in Manchester and the GMCA area to those of selected counterfactual cities. By applying econometric methodologies, this study seeks to inform evaluations of similar cultural events in the UK. The emphasis is placed on methodological learning rather than a comprehensive assessment of all MIF impacts. This study considers a range of key outcome variables including employment rates, Gross Value Added (GVA), tourism growth, business development, cultural participation rates, and wellbeing metrics. By drawing from multiple data sources, the study aims to provide a nuanced understanding of how MIF contributes to local economic and cultural ecosystems.

Given the scale of MIF, its effects are unlikely to register in national datasets due to their small and localised nature. The datasets in the following table therefore serve as a starting point.

5.5 Data sources

Table 5.2: Data sources used in the Manchester International Festival case study

Data Source	Description of data source	Indicators
Active Lives	Commissioned by Sport England, Arts Council England and captured by Ipsos, Active Lives asks questions on cultural participation; unfortunately, the data are only available at the local authority level and cannot be cut in further ways.	Cultural Participation Rates at a Local Authority Level
Arts Council England	Funding data from ACE indicates how much public funding arts and culture projects are receiving within a local authority.	Funding levels for Arts and Culture at a Local Authority Level
Ministry of Housing, Communities, and Local Government	Reported local authority income and expenditure data.	Tourism Expenditure; Tourism Income; Heritage Expenditure; Heritage Income; Arts Expenditure; Arts Income
Manchester City Council	Reported evaluation data from MIF 2007 to MIF 2023 to Manchester City Council.	Individual festival metrics demonstrating reach.
Office for National Statistics	Relevant economic data and wellbeing data.	Employment Counts; UK Business Counts; GVA by Industry; Wellbeing Measures from the Annual Population Survey

5.5.2 Defining a target area

MIF has a reach that covers the entire Northwest region in relation to audiences attending and artists who they work with. While the city of Manchester and the Northwest region are a focus in MIF’s design and delivery, artists also come from other regions, including neighbouring Yorkshire and the Humber, and the West Midlands. The festival outlook is on the city and Northwest region.

However, using the entire Northwest region as the treatment area introduces contamination risks due to other significant cultural offerings, particularly in Liverpool and the wider Liverpool City Region, which hosts a Biennial and other cultural events as standard throughout the year.

To ensure a more precise examination of legacy impacts, this study defines the Greater Manchester Combined Authority (GMCA) area as a wider treatment zone, analysed in addition to Manchester in isolation. This area encompasses the ten local authority districts of Bolton, Bury, Manchester, Oldham, Rochdale, Salford, Stockport, Tameside, Trafford, and Wigan.

A key advantage of using the GMCA area is the ability to track economic and social indicators consistently across a defined spatial unit. Given the strong regional policy emphasis on creative and cultural industries, the GMCA serves as a coherent analytical unit for assessing MIF’s contribution. The area has a well-established infrastructure supporting arts and cultural initiatives, making it an ideal spatial treatment zone for evaluating potential legacy impacts of a cultural major event.

Figure 5.2: The Northwest region mapped by local authority (a) and the GMCA area mapped by local authority (b).

(a)

(b)

5.5.3 Defining a counterfactual

Establishing counterfactual cities is a crucial methodological approach to assessing the legacy impacts of MIF. Unlike other case studies where host areas are selected through a formal process, Manchester lacks such a structured selection, making standard evaluation methods inapplicable. Therefore, it is essential to identify counterfactual cities that closely resemble Manchester in key aspects but have not been influenced by MIF or similar multi-arts biennial festivals. ^{[footnote 31]} A major challenge in the methodology for this study is selecting appropriate counterfactuals to effectively isolate MIF’s legacy impacts. While this study proposes one method for identifying counterfactuals, different event evaluations or studies may require alternative methodologies that may be better suited for the circumstances of the place and event.

While alternative approaches in assessing the legacy impacts – such as analysing impacts within a defined geographic radius, as seen in the London 2012 Games study – could have been viable, they were less suited to MIF’s early years, when it operated across multiple venues rather than serving as a singular catalyst for the construction of a single piece of cultural infrastructure (the Aviva Studios). Given these factors, the counterfactual approach was determined to be the most rigorous and appropriate means of assessing MIF’s legacy impacts.

The counterfactual approach utilised here should allow for a more nuanced and accurate statistical analysis of the festival’s true impacts across various domains. By comparing the outcomes observed in Manchester (and, where relevant, the wider GMCA area) with those of a similar city that does not host a major biennial arts festival, the counterfactual method helps isolate the specific effects of MIF. This comparison needs to control for external variables, enabling a clearer understanding of both the direct and indirect impacts of the festival on the city.

The counterfactual scenario helps to better understand how MIF has influenced critical aspects such as local businesses, tourism, employment, and broader economic and social dynamics. By isolating the festival’s role, it offers a more accurate and comprehensive assessment of its legacy. Furthermore, this approach facilitates benchmarking against other comparable cities within England. By selecting a city or cities with similar demographic, economic, and social characteristics, we can determine whether the outcomes observed in Manchester can be uniquely attributed to MIF, or are part of broader, regional trends (which could have happened even in the absence of MIF). There is a need to remain conscious of the likelihood that the comparison areas will be subject to contaminating interventions, the impacts of which are unlikely to be known with any precision and presumably cannot be effectively controlled for.

Additionally, the counterfactual method helps to mitigate the risk of confirmation bias by ensuring a rigorous and objective evaluation of MIF’s effects. Rather than focusing solely on positive outcomes, this approach compels a balanced assessment that considers both the festival’s contributions and potential challenges. By systematically comparing Manchester’s experience with that of a similar city, this methodology provides a more accurate, data-driven understanding of MIF’s role in shaping the city’s development and cultural identity.

The counterfactual cities to Manchester used in this study are Leeds and Nottingham.

These two cities have been selected through two statistical modelling techniques: Propensity Score Matching (PSM), as outlined in the toolkit, and a Euclidean Distance Calculation undertaken as a confirmatory step (the use of this calculation in this context is experimental and included here as a potential means of further developing the methodology in this area).

The selection process began with cities identified in the initial feasibility study for MIF, supplemented by other cities that have recently sought to host major cultural festivals, including Leeds, Sheffield, Birmingham, Nottingham, Leicester, Coventry, Bradford, Kingston-Upon-Hull, and Southampton. Key socio-economic variables (taken from data dated within the year 2021) were used for comparison, including population size, median income, unemployment rate, GDP, median age, percentage of households in deprivation, percentage of the population not White British, and the average percentage of adults participating in cultural activities (as measured by Active Lives).^{[footnote 32]} Using these domains, the cities were analysed to ensure a robust and reliable comparison, enabling precise calculations for the counterfactual analysis.

In short, the selection of counterfactuals here makes use of:

Propensity Score Matching (PSM), a statistical technique used to create counterfactuals by matching cities with similar characteristics. It calculates the probability of a city hosting a festival like MIF based on factors like population and economic conditions, ensuring that treated (MIF-hosting) and untreated cities are comparable. This reduces bias and enables more accurate estimates of the festival’s impact.
Euclidean Distance Calculation, which measures the similarity between cities based on multiple characteristics (data points). The theoretical distance between Manchester and each potential counterfactual city is calculated to identify those most similar, supporting valid comparisons with Manchester when assessing MIF’s impact.

This study uses both PSM and Euclidean Distance Calculation in tandem to identify the best counterfactuals, leveraging their complementary strengths. PSM was employed to estimate the probability of receiving treatment based on observed covariates, ensuring that treated and control groups were balanced and comparable in terms of their covariate distributions. Once the propensity scores were calculated, the Euclidean distance was utilised as a test and matching metric to identify the closest possible control units to each treated unit within the propensity score space. The two methods differ in kind, PSM being a specific methodology and Euclidean distance a general mathematical tool, but in practice they overlap.
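The distance step described above can be sketched in a few lines of Python. This is a minimal illustration only: the covariate values and the city labels "City A" to "City C" are made up for the example and are not the study's data, and the distance is computed here on standardised covariates rather than in the propensity score space. Standardising first is an assumption of the sketch, made so that a variable measured in large units (such as population) does not dominate the result.

```python
import math
import statistics

# Illustrative, made-up covariate values (NOT the study's data).
# Columns: population (thousands), median income (GBP thousands), unemployment rate (%).
cities = {
    "Manchester": [550.0, 30.0, 6.0],
    "City A": [540.0, 29.0, 5.8],
    "City B": [320.0, 28.0, 6.5],
    "City C": [1100.0, 31.0, 7.5],
}


def standardise(table):
    """Convert each covariate to z-scores so no single scale dominates the distance."""
    columns = list(zip(*table.values()))
    means = [statistics.mean(col) for col in columns]
    sds = [statistics.pstdev(col) for col in columns]
    return {
        name: [(value - m) / s for value, m, s in zip(row, means, sds)]
        for name, row in table.items()
    }


def rank_by_distance(table, target):
    """Return (city, distance) pairs sorted from most to least similar to the target."""
    z = standardise(table)
    distances = {
        name: math.dist(z[target], vector)
        for name, vector in z.items()
        if name != target
    }
    return sorted(distances.items(), key=lambda item: item[1])


ranking = rank_by_distance(cities, "Manchester")
for name, distance in ranking:
    print(f"{name}: {distance:.2f}")
```

The city with the smallest distance is the closest match on the chosen covariates. Which covariates are included, and whether they are weighted equally, remain judgement calls that should be reported alongside the ranking.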

As illustrated in Figures 5.3 and 5.4, Nottingham and Leeds consistently emerged as the closest scoring cities to Manchester in both scenarios. Birmingham was excluded as a counterfactual due to the fact that, until 2012, the city hosted the annual ArtsFest festival. This event featured an average of 600 cultural activities each year, including major commissions, which significantly influenced Birmingham’s cultural landscape and made it unsuitable for direct comparison.

Figure 5.3: Final PSM plotted to show closest cities.

Figure 5.4: Euclidean Distance Calculation plotted to show closest cities

In selecting Nottingham and Leeds as the counterfactual cities, Liverpool was considered but ultimately excluded. Its geographic proximity to Manchester and the GMCA area posed a risk of overlapping influences, making it less suitable for comparison. Additionally, as previously noted, Liverpool’s own biennial festival and events programme could introduce contamination, further compromising its validity as a counterfactual.

An alternative methodology for selecting a counterfactual is the use of synthetic controls. This approach was rejected here because Manchester lacked a formal selection process as a host city, making it difficult to construct a reliable synthetic counterpart. Additionally, finding suitable donor cities for the creation of a synthetic control was challenging, as few places mirrored Manchester’s characteristics while remaining unaffected by similar cultural interventions. The complex and multidimensional legacy effects of MIF, spanning economic, social, and cultural domains, did not fit well with the synthetic control methodology’s reliance on single quantifiable outcomes. Instead, a direct comparison with carefully selected counterfactual cities was deemed a more practical and flexible method for isolating MIF’s impact, which is why it has been used here.

5.5.4 Study Design

This case study is used to explore how the impacts of MIF on Manchester and the GMCA area could be captured using several indicators for which data is openly available. This case study builds on previous assessments and evaluations submitted to Manchester City Council while employing advanced statistical techniques to strengthen causal inference. This study is not intended to be an all-encompassing evaluation of MIF but an exploration of methodologies which could be employed when trying to understand the legacy impacts of major events.

To isolate the specific impacts of MIF, the study applies a counterfactual approach, focusing exclusively on Manchester and the wider GMCA area while mitigating contamination from external economic influences such as those from Liverpool. Additionally, Leeds and Nottingham serve as counterfactual cities, selected using Propensity Score Matching and confirmatory experimental use of Euclidean Distance Calculation to provide a comparative framework. This ensures that observed changes can be more confidently attributed to MIF rather than broader regional trends.

Recognising methodological constraints, the study acknowledges limitations in fully capturing post-MIF impacts due to shifting geographic boundaries, data availability, and the effects of COVID-19. Notably, Manchester has historically lacked a longitudinal Household/Citizens Survey, a tool commonly used in major cultural event evaluations such as the UK City of Culture programme. While the Greater Manchester Resident Survey was introduced in 2022 in response to the pandemic, it does not yet include coverage of culture and major events, limiting its applicability to this analysis.

To enhance analytical robustness, the study employs a DiD framework where applicable, allowing for a more precise estimation of MIF’s impact by controlling for external economic and cultural shifts. By comparing pre- and post-event data across Manchester and the counterfactual cities, this approach strengthens causal attribution and distinguishes festival-driven effects from broader industry trends.

Beyond direct festival outcomes, spatial analysis examines potential spillover effects from Manchester into the wider GMCA area, identifying shifts in employment, tourism, and cultural participation that extend beyond Manchester’s core. By integrating DiD with national trends analysis and spatial modelling, this study presents a potential methodological framework for assessing the long-term legacy of MIF.

5.5.5 Robustness checks

Robustness checks are used to test the appropriateness of the counterfactual and to check for violations of the difference-in-differences assumptions in the pre-intervention trends. The results of the tests show that the robustness of the findings should be questioned for some variables.

Table 5.3: Overview of robustness checks

Variable of interest	Parallel Trend check (method and implication of result)	Check on robustness of counterfactual choice (method and implication of result)	Any other checks for robustness
Jobs and revenue in cultural, tourism and hospitality sectors	Visual inspection of trends shows upward trajectory in festival years; overall increase with intermittent declines. Differences with counterfactuals suggest that parallel trends assumptions do not fully hold.	Comparison with counterfactual cities shows limited variation, supporting general comparability.	NA
Wellbeing	Across all four well-being indicators, Manchester’s trends diverge from those of counterfactual cities in the pre-intervention years, indicating that the parallel trends assumption is unlikely to hold.	Use of DiD analysis against control group revealed inconsistent and minimal effects; supporting the lack of measurable impact.	External factors may be influencing results.
Increase in opportunities for artists and Manchester’s cultural sector	Visual inspection of trends shows overall growth in business counts across Manchester and GMCA; DiD analysis of GVA reveals mixed results across years with no consistent festival-related impact.	Comparisons with counterfactual cities show stronger sub-regional growth, but variations suggest influence of broader structural changes.	Relocation of major institutions (e.g., BBC to MediaCityUK) likely key driver; MIF’s contribution difficult to isolate from wider trends.
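
The parallel trend checks summarised in Table 5.3 rely largely on visual inspection. A simple numerical complement, sketched below, is to take the gap between the treated area and the comparison group in each pre-intervention year and estimate whether that gap is drifting over time. The series, the years and the tolerance are all made up for illustration and are not the study's data; the tolerance is an arbitrary threshold, not a statistical test.

```python
# Illustrative, made-up pre-intervention series (NOT the study's data).
years = [2008, 2009, 2010, 2011, 2012]
treated = [7.10, 7.15, 7.22, 7.30, 7.41]  # hypothetical outcome for the treated city
control = [7.20, 7.24, 7.27, 7.31, 7.34]  # hypothetical mean outcome for comparison cities


def ols_slope(x, y):
    """Ordinary least squares slope of y regressed on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    numerator = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    denominator = sum((xi - mean_x) ** 2 for xi in x)
    return numerator / denominator


# Gap between treated and control in each pre-intervention year.
gap = [t - c for t, c in zip(treated, control)]

# If the trends are parallel, the gap is roughly constant, so its slope is near zero.
gap_slope = ols_slope(years, gap)

# Arbitrary illustrative threshold (an assumption of this sketch).
TOLERANCE = 0.01
parallel = abs(gap_slope) <= TOLERANCE

print(f"Gap slope per year: {gap_slope:.3f}")
print(f"Consistent with parallel pre-trends: {parallel}")
```

In this made-up example the gap widens steadily, so the check flags a likely violation, the same kind of conclusion the table reports for the wellbeing indicators. With longer series, a formal test on the pre-period interaction between group and time would be preferable to a fixed threshold.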

5.6 Findings

The following table is a summary overview of the findings detailed in the remainder of this section.

Table 5.4: Overview of findings

Outcomes	Successfully measured?	Significant results?	Outcome achieved and persistence of outcomes
Creative and Culture jobs	SMS Level 2 - Trend compared to control cities	No	Trends analysis shows increase in jobs during festival years; however, increase is not sustained.
Wellbeing	SMS Level 2 - Trend compared to control cities and DiD	No	Short-term impact but changes are not sustained.
Business counts and GVA	SMS Level 2 - Trend compared to control cities and DiD	No	Analysis shows MIF has contributed to Manchester and the wider area’s economy and has supported growth.

5.6.2 Jobs and revenue in cultural, tourism and hospitality sectors

Overall, within the GMCA area there has been an upward trend in the number of jobs within DCMS SIC Codes for the creative industries, the digital sector and the cultural sector. This upward trajectory is more pronounced in the creative industries and digital sector, whereas the cultural sector peaked in 2019 (coinciding with MIF 2019) before a decline (most likely due to the COVID-19 pandemic). Interestingly, as with cultural participation, there are spikes in the number of jobs during festival years; a potential reason for this could be the increase in temporary employment which the festival provides. Despite declines in non-festival years, the overall trajectory is increasing.

Figure 5.5: Jobs by DCMS Sector in the GMCA Area

When breaking this down further and looking at the counterfactual cities, the need to look at employment figures within the whole GMCA area becomes apparent. Focusing on the Creative Industries and the Cultural Sector, it is clear there is an upward trend. Nottingham as a counterfactual has remained stable, showing only minor improvement, while Manchester has recently overtaken Leeds, which has historically outperformed Manchester (the most likely reason being its larger population size). The regional reach of the festival is key, as MIF engages firms which are based in local authorities neighbouring Manchester.

Figure 5.6: Jobs in the Creative Industries and Cultural Sector in the GMCA Area against Counterfactuals.

5.6.3 Wellbeing

From the available evidence, there are no noticeable wellbeing effects observed in the general population across festival years and non-festival years. Overall wellbeing levels across the four domains included in the ONS 4 Subjective Wellbeing measures follow similar patterns to those of the counterfactual cities. Within overall levels of anxiety, the national picture is observed, particularly with increases around the time of the EU Referendum in 2016 and the COVID-19 pandemic in 2020.

Figure 5.7: ONS 4 Subjective Wellbeing Measures from Annual Population Survey

A DiD analysis of life satisfaction shows that in 2012/13, prior to the 2013 MIF, Manchester experienced a small decline in life satisfaction score (-0.09), while the control group of the counterfactual cities saw an improvement (+0.035). This resulted in a negative DiD effect of -0.125, suggesting that Manchester was underperforming relative to the control group. Moving to festival years, in 2013/14, Manchester showed a small improvement of +0.02, which matched the change in the control group. This yielded a near-zero DiD effect, indicating no measurable relative impact that can be causally linked to MIF. In 2015/16, Manchester’s outcome improved by +0.03, but the control group improved more substantially (+0.075), resulting in a small negative DiD effect (-0.045), showing that Manchester’s improvement was less pronounced compared to the control group.

Interestingly, in years where MIF did not take place, Manchester occasionally outperformed the control group. For example, in 2014/15, Manchester’s outcome improved significantly (+0.14), while the control group experienced a slight decline (-0.025). This resulted in a positive DiD effect (+0.165). These fluctuations suggest that other factors may be influencing outcomes in both Manchester and the control group.
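
The DiD effects quoted above are simply the change in Manchester minus the change in the control group for the same period. The short sketch below reproduces that arithmetic using only the changes reported in the text; the variable and function names are our own, and the 2013/14 control change is taken as +0.02 because the text states that it matched Manchester's.

```python
# Year-on-year changes in life satisfaction as reported in the text.
# Each entry: (change in Manchester, change in the Leeds/Nottingham control group).
changes = {
    "2012/13": (-0.09, 0.035),
    "2013/14": (0.02, 0.02),   # control change stated to match Manchester's
    "2014/15": (0.14, -0.025),
    "2015/16": (0.03, 0.075),
}


def did_effect(treated_change, control_change):
    """Difference-in-differences: treated change minus control change."""
    return round(treated_change - control_change, 3)


effects = {period: did_effect(*pair) for period, pair in changes.items()}

for period, effect in effects.items():
    print(f"{period}: {effect:+.3f}")
```

Laying the calculation out this way makes the pattern easy to see: the largest effects in absolute terms fall in 2012/13 and 2014/15, neither of which is a festival year.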

Results of the DiD analysis provide limited evidence to suggest that MIF has had a substantial or consistent impact on the outcomes for Manchester relative to Leeds and Nottingham. During festival years, the changes in outcomes for Manchester are small and often similar to those in the control group, resulting in negligible or negative DiD effects. In contrast, larger variations occur in non-festival years, indicating that other external factors may be driving the observed changes. From a methodological perspective, this analysis suggests that MIF’s effects may either be small or difficult to detect using this data. Further analysis or additional data sources may be needed to draw more robust conclusions and increase the rigour of findings.

5.6.4 Increase in opportunities for artists and Manchester’s cultural sector

When looking at the estimated business counts for the creative industries and cultural sector in Manchester compared to the counterfactual cities, it is clear that the number of businesses has been increasing. Compared to the counterfactuals the gap between Manchester and Leeds has been narrowing with Leeds maintaining a marginally higher count of businesses. Nottingham on the other hand has remained relatively stable with only a marginal growth rate. Figures for the GMCA area show a faster growth rate, again highlighting the sub-regional draw which MIF has.

Figure 5.8: Estimated Business Counts in the Creative Industries and Cultural Sector

GVA figures for businesses within the arts, entertainment, and recreation sectors, based on SIC codes, have shown sustained growth. Notably, Manchester surpassed Leeds in 2013, while Nottingham has remained relatively stable with only marginal increases. The most significant driver of this growth in the GMCA area appears to be the development of MediaCityUK and its subsequent expansion, with the majority of companies having relocated by 2012/13. Given the timing and scale of the BBC’s move as an anchor tenant from 2011 onwards, this is likely the dominant factor in the observed economic impact, overshadowing any direct influence from the festival.

Figure 5.9: GVA estimates for arts, entertainment and recreation

DiD analysis of the GVA estimates shows that, before the festival began in 2007, Manchester experienced small growth in its GVA. For example, in 2006, Manchester’s GVA grew by £16 million, while the control group of counterfactual cities grew by £20 million. This resulted in a small negative relative effect, suggesting that Manchester’s growth was slightly weaker than that of the control group in the pre-festival period. In 2007, the first year of the festival, Manchester’s GVA declined by £20 million, while the control group also declined, though by a smaller margin of £8.67 million. The resulting negative DiD effect of -£11.33 million suggests that Manchester’s relative performance worsened in the inaugural year of the festival.

In 2008, a non-festival year, Manchester’s GVA continued to decline, falling by £11 million, while the control group showed an improvement of £11 million. This stark contrast created a negative effect of -£22 million, indicating that Manchester underperformed relative to the control group during this year. This result suggests that external factors unrelated to the festival may have been influencing Manchester’s outcomes. In 2009, during the second festival year, Manchester rebounded slightly, improving by £4 million, while the control group declined by £5.33 million. This positive relative effect of +£9.33 million implies that MIF may have had a beneficial impact on Manchester’s GVA performance during its second iteration.

Non-festival years often show significant underperformance by Manchester, suggesting that broader economic or social trends may have played a role in shaping the city’s outcomes, independent of the festival. The results highlight an inconsistent impact of MIF. While certain years suggest a positive effect on Manchester’s relative GVA performance, others show no clear benefit or even negative effects. This variability points to the influence of broader structural or economic factors that may overshadow MIF’s contributions. It is also possible that the benefits of the festival are not immediate and may take time to materialise, meaning that year-on-year changes might not fully capture longer-term benefits such as cultural growth, increased tourism, or sustained economic development.
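
The inconsistency described above can be made explicit by grouping the yearly DiD effects by type of year. The sketch below uses only the four years of GVA changes quoted in the text (2006 to 2009). The year-type labels and variable names are our own illustrative choices, not part of the original analysis.

```python
# Annual changes in GVA (£ million) as quoted in the text:
# year -> (Manchester change, control-group change, type of year).
gva_changes = {
    2006: (16.0, 20.0, "pre-festival"),
    2007: (-20.0, -8.67, "festival"),
    2008: (-11.0, 11.0, "non-festival"),
    2009: (4.0, -5.33, "festival"),
}

# DiD effect per year: Manchester's change minus the control group's change.
gva_did = {
    year: round(manchester - control, 2)
    for year, (manchester, control, _) in gva_changes.items()
}

# Group the effects by type of year to see whether festival years share a sign.
by_type = {}
for year, (_, _, year_type) in gva_changes.items():
    by_type.setdefault(year_type, []).append(gva_did[year])

for year_type, effects in by_type.items():
    signs = {"positive" if e > 0 else "negative" for e in effects}
    label = "consistent sign" if len(signs) == 1 else "mixed signs"
    print(year_type, effects, label)
```

With only two festival years in this window, the grouping is descriptive rather than a statistical test. It simply shows that the festival-year effects point in opposite directions.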

5.6.5 Conclusions from findings

The findings of this study demonstrate that while MIF has contributed to Manchester’s cultural and economic landscape, its long-term impact presents a complex and nuanced picture. The analysis reveals that MIF has played a role in increasing private investment in the arts, expanding business counts in the creative industries, and fostering a sub-regional cultural ecosystem. Additionally, the festival has driven volunteer engagement and opportunities for artists, while also positioning Manchester as a hub for cultural exports. However, the study found limited statistical evidence of consistent improvements in wellbeing, cultural participation, or employment figures directly attributable to MIF. While certain metrics, such as Gross Value Added (GVA) and cultural participation, show peaks during festival years, these effects are not always sustained, suggesting that MIF’s impact may be episodic rather than a long-term transformative force.

The methodologies employed in this study, particularly the use of PSM and, where applicable, DiD analysis, have been critical in isolating MIF’s legacy impact from broader local and regional trends. By selecting counterfactual cities, in this case Leeds and Nottingham, the study has been able to compare Manchester’s trajectory against similar urban environments that have not hosted a biennial arts festival of MIF’s scale. While these counterfactuals were identified through statistical processes, questions remain about whether they are the best counterfactual choices. Further, to improve statistical power, future studies may wish to increase the number of counterfactuals included in the analysis.

The approach trialled here provides a more rigorous framework for assessing legacy impacts than traditional comparative methods, which often rely on subjective narratives or broad economic indicators. Additionally, the use of spatial analysis and national datasets has allowed for a more granular understanding of MIF’s effects, revealing how its influence extends beyond Manchester to the GMCA area. This multi-faceted approach strengthens causal inference and offers a replicable model for future evaluations of cultural events.

The methodologies used in this study could be applied by others in evaluating the legacy impacts of major events, particularly those that lack a formal selection process or clear-cut geographic boundaries. The counterfactual approach ensures that findings are not merely anecdotal but are backed by comparative analysis, helping to distinguish event-driven effects from broader socio-economic trends. Furthermore, the use of trend analysis and DiD frameworks can be beneficial in assessing the sustained impacts of cultural and sporting events, particularly when pre- and post-event data availability is inconsistent. By refining these methodological approaches, future evaluations could improve their robustness, offering stronger evidence to support cultural policy decisions and investment in major events which are home-grown and at the local level.

5.6.6 Robustness of findings

The findings presented in this study exhibit limited robustness. By combining primary data from festival evaluations with national datasets and selecting comparable cities through PSM, the study aimed to isolate the impacts of MIF as far as reasonably possible. Tangible benefits, such as sector-specific job growth and heightened cultural participation during festival years, underscore MIF’s contribution to Manchester’s cultural and economic landscape. The longitudinal analysis spanning multiple iterations of MIF enhances credibility by revealing trends and trajectories over time which align with event years.

However, the robustness of the findings is questionable given the use of only two counterfactual areas. We therefore suggest that more counterfactual areas are used to enhance statistical power.

Data gaps and contamination, particularly due to the COVID-19 pandemic, further complicate the interpretation of trends for recent festival iterations. While peaks in cultural participation and job counts are observed during festival years, these outcomes are neither consistent nor sustained, raising questions about the festival’s long-term legacy. Broader impacts, such as wellbeing improvements and positive place perception, are supported more by anecdotal evidence than by rigorous empirical data.

Furthermore, estimating localised impacts remains a challenge. While this study provides an understanding of some of the more localised effects of MIF, the findings are less robust for evaluating broader systemic or enduring outcomes. Limitations in data availability restrict the ability to draw definitive conclusions about MIF’s legacy impacts on Manchester and the surrounding areas. Despite these challenges, the methodological framework applied in this study – combining econometric modelling with counterfactual comparisons – provides a strong foundation for future research. By refining data collection practices and integrating additional qualitative insights, future studies can improve the precision and reliability of cultural event impact assessments.

5.7 Learning

5.7.1 What did we learn about the methods for quantifying legacy and impact?

Data availability and accessibility: This study highlights critical lessons regarding data availability and accessibility when quantifying the legacy and impact of MIF or similar events/festivals. The availability of national data at a local authority level presents challenges. While evidence from Active Lives is useful for quantifying participation, its data cannot be fully disaggregated due to cultural participation questions being separated from the rest of the survey data. Similarly, the historic Taking Part survey is available only at a regional level, which masks the impacts of a highly localised festival. Future evaluations may benefit from the Participation Survey, which is now available at a local authority level every two years, though it remains outside the scope of current MIF evaluations. Comparing Manchester and the GMCA area against counterfactuals using national datasets required analytical adjustments to show statistical significance. Different methodologies and sample sizes across survey populations made it difficult to ensure consistency. The study underscores the need for data collection improvements to enhance future evaluations.

Recurring events: As a biennial event, MIF has readily available evaluation data. However, its relatively modest investment scale means that ripple effects into national datasets are difficult to detect. Combining localised evaluations with national datasets can provide a more nuanced impact picture. Integrating national metrics into localised surveys could create a default comparison with the general population, as seen in UK City of Culture evaluations. The lack of granularity and consistency in available data limits analysis, making it challenging to capture the full breadth of MIF’s impact.

MIF could benefit from a dedicated local household survey, running concurrently with the Participation Survey, to better track long-term legacy impacts. This could also serve as a planning tool to maximise future festival impacts. Robust baseline data from the earliest possible point is essential, yet evaluations of MIF have primarily reported volume-based metrics. Data contamination from external shocks like COVID-19 further complicated analysis, particularly in areas such as wellbeing and visitor perceptions. Access to institutional financial records would enhance economic impact assessments, though reliance on publicly available sources such as Companies House remains time-consuming.

Use of counterfactuals: The counterfactual approach in this study provided valuable insights into MIF’s unique impact. Given that MIF is a home-grown initiative rather than a competitively won event, selecting Leeds and Nottingham as counterfactuals enabled a controlled comparison of economic, cultural, and social indicators. This approach isolated MIF-specific effects while accounting for broader regional and national trends.

Despite its utility, the counterfactual method was constrained by data limitations and external influences, such as COVID-19’s impact on cultural participation and economic indicators. Additionally, contamination risks from other cultural interventions in comparator cities, such as LEEDS 2023 Year of Culture, presented challenges in fully attributing observed differences to MIF alone. Future evaluations could refine this approach by incorporating qualitative insights or leveraging more granular data sources to enhance findings.

The need for long-term data infrastructure: Cultural major events like MIF must prioritise and invest in long-term data collection infrastructure to comprehensively understand their impact. Access to consistent, granular data across multiple domains, paired with adaptable methodologies, can significantly improve legacy quantification. While national datasets and counterfactual analysis remain valuable, they must be complemented by targeted local data collection to fully capture festival effects and individual-level impacts.

5.7.2 What do the findings tell us about legacy and impact?

The findings of this study underscore the value of employing robust econometric methodologies in evaluating the legacy and impact of major cultural events. The combination of PSM and DiD has proven effective in mitigating potential selection bias and ensuring that observed differences between Manchester and counterfactual cities can be attributed with greater confidence to MIF rather than external variables even if the results are variable and mixed. This methodological robustness is essential in distinguishing genuine legacy effects from the broader economic and social shifts that could otherwise confound analysis.

Moreover, the counterfactual approach, supported in this context by the experimental use of the Euclidean Distance Calculation as a confirmatory measure to support the PSM, demonstrates how methodological triangulation can enhance the credibility of legacy impact assessments. The use of DiD to track changes over time has further refined the ability to detect statistically significant differences, though its limitations in capturing the delayed or diffuse effects of cultural events are noted. While challenges such as data availability and contamination risks remain, the study highlights how employing multiple quantitative techniques can provide a more comprehensive and reliable picture of cultural event impacts overall.
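
To illustrate how a Euclidean distance calculation can act as a cross-check on a matching exercise, the sketch below ranks candidate areas by their distance to the treated area across several indicators. All area names and indicator values are hypothetical and are not the data used in this case study. Standardising each indicator to z-scores before measuring distance is our assumption and may differ from the approach used here.

```python
import math
import statistics

# Hypothetical indicator values; NOT the data used in the case study.
# Columns: population (thousands), median age, employment rate (%).
areas = {
    "Treated city": [550.0, 30.0, 70.0],
    "Candidate A": [500.0, 31.0, 71.0],
    "Candidate B": [300.0, 40.0, 76.0],
    "Candidate C": [560.0, 33.0, 68.0],
}


def standardise(table):
    """Convert each indicator to z-scores so no single unit dominates."""
    columns = list(zip(*table.values()))
    means = [statistics.mean(col) for col in columns]
    sds = [statistics.pstdev(col) for col in columns]
    return {
        name: [(v - m) / s for v, m, s in zip(values, means, sds)]
        for name, values in table.items()
    }


def rank_counterfactuals(table, treated):
    """Rank candidate areas by Euclidean distance to the treated area."""
    z = standardise(table)
    distances = {
        name: math.dist(z[treated], z[name])
        for name in table
        if name != treated
    }
    return sorted(distances.items(), key=lambda item: item[1])


ranking = rank_counterfactuals(areas, "Treated city")
for name, distance in ranking:
    print(f"{name}: {distance:.2f}")
```

Areas with the smallest distances are the closest matches on the chosen indicators. If the ranking broadly agrees with the areas selected through PSM, that agreement gives some reassurance about the choice of counterfactuals.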

Additionally, this study highlights the importance of defining clear spatial and temporal boundaries when assessing legacy impacts. MIF’s effects are not confined to a local authority or more tightly defined treatment area. This required a more flexible yet rigorous approach to identifying and assessing impact, which was achieved through a focus on the wider GMCA area in addition to Manchester alone. The study also accounts for the influence of external cultural events in nearby regions, ensuring that Manchester’s outcomes were not incorrectly attributed to MIF’s activities. The ability to track spillover effects beyond the immediate urban centre provides a more accurate representation of how cultural festivals contribute to regional development. Furthermore, this study demonstrates the importance of considering both direct and indirect impacts over time. While some economic and cultural benefits may be immediately observable, others, such as increased investment in the arts or shifts in employment patterns, may take years to materialise. By integrating long-term trend analysis with counterfactual comparisons, this study offers a potentially replicable framework for evaluating cultural events with sustained but evolving impacts. Future research could further refine this approach by incorporating additional data sources, such as longitudinal surveys, qualitative interviews, and more granular economic indicators.

Question 7

6. Conclusions

Accepted Answer

The four case studies – the London 2012 Games, City of Culture Programmes, Cricket World Cups, and Manchester International Festival – offer valuable insights into evaluating the legacy and impact of major events. While each event possesses unique characteristics, several common themes and specific learnings emerge from the evaluation process.

6.1 When to use variations of Difference-in-Difference

There are several aspects to consider when choosing the correct methodology to use in your evaluation of major sporting and cultural events. The key considerations are the type of major event, the availability of data (particularly spatial and temporal granularity and sample size), the aims of your evaluation, and the complexity of the econometric techniques required.

We have seen from the case studies that Difference-in-Difference provides a robust methodology which goes beyond trend analysis to assess changes in variables of interest against a counterfactual in a way that can provide higher levels (Level 3) of robustness on the HMT Magenta Book Scientific Methods Scale (SMS). The case studies have allowed us to consider how Difference-in-Difference can be adapted to the different characteristics of each case study.

By applying Difference-in-Difference methods in four different contexts, we see that Difference-in-Difference can be applied across a variety of types of major event. The case studies differ in the type of event, its geographical level, and its frequency. Each case study therefore provides different insights and represents a different analytical ‘type’. We set out below the instances where each case study offers the best model for evaluation of the impacts of cultural and sporting events, and in the Toolkit we provide further guidance on when they should be used:

Our preferred approach: London 2012 Games

The London 2012 Games case study shows that higher levels of robustness can be achieved through more granular data. The analysis uses data at the Lower Super Output Area (LSOA) level, which is more granular than the event area level data used in the other case studies. By conducting the analysis at a more granular level, statistical power and the credibility of matches are increased due to the sample size achieved. In addition, by using a matching technique (rather than a shortlisted area or selecting a counterfactual based on prior knowledge), the counterfactual areas selected are more likely to represent a true counterfactual.

The method used in the London 2012 Games case study is contingent on having access to granular data (we suggest LSOA). For most variables of interest for a major event evaluation getting access to this level of granular data will require a data agreement (for example through the ONS Secure Research Service) or primary collection of data. The econometrics involved also requires a familiarity with regression and large datasets.

In the absence of appropriate data and/or econometric experience we suggest following a similar approach to the following case studies based on the event characteristics.

If your event is held in multiple areas: Cricket World Cup

This approach uses difference-in-difference across multiple treatment and comparison areas, defined by whether they were involved in hosting the event. Although the analysis is done at the area level, which is less robust than conducting the analysis at a more granular level as in the London 2012 example, having several treatment areas and counterfactual areas increases the power of the analysis. This approach is effective when the impact is expected to be localised around event sites, such as increased sports participation. However, it may suffer from location-specific biases if hosting locations differ significantly from non-hosting ones in unobserved ways.

If your event is part of a competitive series of events: City of Culture Winners

The City of Culture case study uses shortlisted cities as controls, which can help mitigate selection bias. Differences between winning and non-winning cities might still exist, potentially affecting the parallel trends assumption. Using a staggered approach (i.e. using shortlisted and winning areas from future iterations as a counterfactual) could help improve the relevance of the counterfactual and increase sample size.

If your event is a one-off event or part of a regular series in the same location: Manchester International Festival

Manchester is compared to similar cities, Leeds and Nottingham, which did not host similar festivals over the evaluation period. These counterfactual cities are identified through the suggested methods (PSM and Euclidean distance). However, we suggest using at least four counterfactual areas to reach the level of statistical power necessary to make claims on causality.

6.2 Supporting Difference-in-Difference with other analysis

We suggest that trend analysis can be used to support Difference-in-Difference analysis, but do not recommend using trend analysis in place of Difference-in-Difference, as it does not provide sufficient evidence of causality to ascertain whether the event has affected the outcomes of interest. Trend analysis is helpful to use in advance of applying the DiD approaches. It can help to identify fluctuations in the indicator, which gives you a sense of the robustness of the data. Peaks before or after the event can also point to other influences on the indicator, such as another event or a political change. Trend analysis is also more intuitive to understand than Difference-in-Difference approaches, so it can help explain findings to a lay reader. Caution should, however, be taken when attributing impact to the event purely based on trends.

Mixed-methods evaluations that combine quantitative data with qualitative insights from interviews, focus groups, or case studies offer a richer understanding of impact and explore the mechanisms behind observed changes. More qualitative methods also provide an alternative where quantitative methods are inconclusive or not possible because of data limitations. We suggest supporting your quantitative findings with qualitative information where possible.

6.3 Data availability

Although we have not covered a comprehensive set of indicators and data sources within the case studies, we can review where data is available and where findings have been limited by the comprehensiveness of the data available. The London 2012 Games case study shows that with granular data aggregated at a level below the area of the event, a DiD approach with ample statistical power can be applied. However, access to this data relied on data agreements, which may not be possible for all evaluators to obtain. The table below sets out, for key indicators, how data can be accessed and the implications for evaluators. It should be noted that data agreements and the continuation of regular data collection are subject to change, and the information presented here is correct at the time of writing.

Table 6.1: Key indicators, how data can be accessed, and the implications for evaluators

Outcomes	Reflections
Turnover and Employment	The Business Structure Database (BSD) provides useful information such as industry classification using the Standard Industrial Classification (SIC) system, turnover, headcount and location. This data is available through the ONS Secure Research Service at the output area level. If a data agreement is not possible, then Companies House data can be accessed openly. However, data must be extracted manually. This could be useful when exploring particular types of business, for example Elite Cricket Clubs, where there are a limited number of data points to extract.
Wellbeing	Standard ONS questions on Life Satisfaction are recorded within several regular multi-topic surveys. Data from the Understanding Society survey is available through an agreement with ONS (e.g. through the UK Data Service and the SRS) which provides data at the LSOA level. Active Lives Survey data can be downloaded at a regional level from their website. It is therefore possible to follow the preferred DiD approach with a data agreement. However, if no data agreement is available, primary data collection would be needed to use this approach.
Volunteering	Data from the Understanding Society Survey includes questions on whether respondents had volunteered in the last 12 months and for how many hours. The data is only reported every other year, which limits the data analysis as it might not be possible to see event-year levels of volunteering. For the time series considered in the case studies, no other appropriate sources of volunteer numbers were found. For evaluation of future events, the Community Life Survey has recently been delivered with an increased sample size to allow LA level analysis (historically data has been collected with a smaller sample size only allowing regional level breakdowns). The Community Life Survey could therefore provide a useful source of data, although we have been unable to test it within this report. It is therefore possible to follow the preferred DiD approach with a data agreement using Understanding Society data or using the Community Life Survey with improved sample size. Alternatively, primary data collection would be needed to use this approach.
Participation	The Active Lives survey measures the activity levels of people across England. Data can be downloaded at a regional level from the Sport England website. The Taking Part Survey includes information on the regularity of participation by adults and children in the arts, museums and galleries, archives, libraries, heritage and sport. The data is only available at the regional level. It is therefore possible to follow the preferred DiD approach with a data agreement if analysing physical activity such as sport participation. However, if no data agreement is available, or participation in cultural activities is the focus, then primary data collection would be needed to use this approach.

6.4 Common challenges across all major events

Attribution and confounding factors: Isolating the specific impact of an event from concurrent social, economic, or political changes (confounders) is difficult. Major events are often timed to coincide with other policy initiatives, further complicating causal attribution. Untangling these intertwined effects requires robust methodologies and careful consideration of contextual factors. For example, the introduction of The Hundred and the COVID-19 pandemic obscured the analysis of the Cricket World Cups, highlighting the difficulty of isolating specific impacts. None of the methods suggested allows an evaluator to completely remove these confounders. However, using the suggested matching methods within a difference-in-difference approach, together with more granular data, will likely reduce their impact.

Data Availability and Quality: Obtaining accurate and relevant data is a recurring challenge. Data may be unavailable, inconsistent across time periods, insufficiently granular geographically, available only at differing levels of granularity, or lacking sufficient detail to capture the intended outcomes. Evaluations often rely on readily available secondary data, limiting the scope of analysis and requiring compromises. The reliance on national datasets for the London 2012 Games case study highlighted the limitations of data granularity in capturing local effects. We suggest combining event-specific evaluation data with national datasets to provide a more nuanced understanding of impact. This worked well for the MIF case study on areas like employment, tourism, and cultural participation.

Defining a Target Area: Defining the geographic area over which impacts are felt is challenging. Events attract visitors from beyond the immediate locality, and impacts may ripple across regions or even nationally. Sporting events in particular are broadcast via television and radio and are therefore consumed by people across the country. Determining the appropriate scope of analysis requires careful consideration of the event’s nature, scale, and reach. All of the case studies have used spatial analysis. However, similar approaches could be used to discern the difference between those who attended an event and those who did not, or between those who consumed the event through media and those who did not. This would require access to commercial data such as ticket sales or television subscriptions with accompanying household data such as postcodes.

Selecting a robust counterfactual: A range of counterfactual options were explored using the case studies. Although international counterfactuals are suggested as part of the toolkit we did not identify a way to test them within this case study analysis due to our reliance on the available secondary data and the lack of consistent data across countries. By only assessing treatment areas against other UK based controls, national scale impacts are subtracted and therefore our analysis is unlikely to show the full impact of the major events we have analysed.

Subjectivity & Metrics: Measuring certain legacy impacts at the societal level, like social cohesion, civic pride, and wellbeing, relies on subjective data and requires standardized, quantitative metrics. Deriving and agreeing upon appropriate metrics that capture both “what changes” and “to what extent” is essential but complex.

Lack of objectives: Where objectives are either absent or unrealistic in their scope or quantity, evaluation of legacy impacts becomes challenging. We therefore see it as vital that a theory of change is established to best understand how objectives are intended to be reached. We have suggested only evaluating outcomes where the causal pathway along the theory of change is strong and therefore is likely to create impact detectable through data analysis.

6.5 Learning about legacy

Attributing specific long-term outcomes to major events is complicated, for the various reasons set out in this report. For some indicators we were unable to identify significant improvements attributable to the event. However, this does not necessarily mean that the events did not improve these outcomes, as data collection and indicators are rarely a perfect proxy.

For some indicators we were able to identify short-term improvements. For example, we identified improvements in the tourism sector (employment and turnover growth) during and immediately following the London 2012 Games. However, the evidence did not show any persistent improvements that were statistically significant.

There were limited cases where longer-term legacy impacts were identified, which makes it difficult to infer any more about what legacy constitutes and how long after an event we should define legacy. The analysis of the Cricket World Cups was able to build a narrative of an improved ability to bounce back from COVID-19, with a statistically significant increase in regular participation among event group women compared to the control group in the post-COVID-19 period. There is also evidence of statistically significant increases in total revenue and net profit for county cricket clubs associated with hosting men’s CWC matches in some of the post-event years.

One of the challenges to detecting and distinguishing short and longer-term impacts comes from data availability. With longer-term and higher-resolution data, it may be possible to capture this nuanced effect by extending the DiD method. One approach would be to interact the treatment variable with different time periods. For instance, an evaluator could include separate variables indicating short-term (e.g., first year after treatment) and long-term (e.g., five years after treatment) effects. With sufficient data on the outcomes of interest, this would allow the evaluator to statistically test the significance and magnitude of the treatment effect in different timeframes and test whether they vary over time.
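The interaction approach described above can be sketched as follows. This is a minimal illustration rather than the specification used in the case studies: the panel is synthetic, the function names `simulate_panel` and `did_short_long` are hypothetical, and the choice to count the event year and the following year as "short-term" (`short_years=2`) is an assumption made only for the example. The model is a two-way fixed effects regression (unit and year dummies) with separate short-term and long-term treatment indicators, estimated by ordinary least squares.

```python
import numpy as np


def simulate_panel(n_units=40, first_year=2008, last_year=2018, event_year=2012,
                   short_effect=2.0, long_effect=0.5, short_years=2, seed=0):
    """Build a synthetic balanced panel (hypothetical data, illustration only)."""
    rng = np.random.default_rng(seed)
    years = np.arange(first_year, last_year + 1)
    unit = np.repeat(np.arange(n_units), len(years))
    year = np.tile(years, n_units)
    treated = unit < n_units // 2                       # first half of units host the event
    since = year - event_year
    short = treated & (since >= 0) & (since < short_years)
    long_ = treated & (since >= short_years)
    unit_level = rng.normal(10.0, 1.0, n_units)[unit]   # fixed differences between areas
    common_trend = 0.3 * (year - first_year)            # trend shared by all areas
    noise = rng.normal(0.0, 0.1, len(unit))
    y = unit_level + common_trend + short_effect * short + long_effect * long_ + noise
    return y, unit, year, treated


def did_short_long(y, unit, year, treated, event_year, short_years=2):
    """Two-way fixed effects DiD with separate short- and long-term effects."""
    y = np.asarray(y, dtype=float)
    unit = np.asarray(unit)
    year = np.asarray(year)
    treated = np.asarray(treated, dtype=bool)

    # Treatment interacted with two post-event windows.
    since = year - event_year
    short = treated & (since >= 0) & (since < short_years)
    long_ = treated & (since >= short_years)

    # Unit and year dummies; the first year is dropped to avoid collinearity.
    units = np.unique(unit)
    years = np.unique(year)
    unit_d = (unit[:, None] == units[None, :]).astype(float)
    year_d = (year[:, None] == years[None, 1:]).astype(float)

    X = np.column_stack([short, long_, unit_d, year_d]).astype(float)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return {"short_term": float(beta[0]), "long_term": float(beta[1])}


effects = did_short_long(*simulate_panel(), event_year=2012)
print(effects)
```

In this synthetic setting the short-term coefficient should recover a larger effect than the long-term one, mirroring a boost that fades after the event. The sketch returns point estimates only; testing significance, or whether the two effects differ, would additionally require standard errors, which in a real evaluation would typically be clustered at the unit level using a regression package.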

These case studies demonstrate the complexities of evaluating the legacy and impact of major events. While they offer evidence of short-term successes across various domains, capturing and attributing sustained long-term impacts is significantly more challenging. Several key lessons emerge from this research:

Early and robust evaluation planning: Evaluations should be integrated into the early planning stages of major events, aligning with clearly articulated objectives and theories of change. This allows for the identification of relevant indicators, data sources, and methodologies before the event takes place, ensuring that evaluations are fit for purpose and minimise compromises.
Development of a robust theory of change: A clearly defined theory of change that articulates intended impacts, mechanisms, and measures is essential. This clarifies the logic of the intervention, guiding data collection and analysis towards meaningful outcomes.
Pre-event baseline data collection: Collecting comprehensive baseline data on key indicators before the event is crucial for measuring true change and isolating the event’s specific impact from other confounding factors. Baseline data should be aligned with the theory of change and reflect the geographic scope of the evaluation.
Consideration of confounding factors: Anticipating and accounting for potential confounding factors, such as concurrent policy changes, economic trends, or external shocks, is essential. This requires careful consideration of the context in which the event takes place. Evaluations should employ robust methodologies, such as difference-in-differences, propensity score matching, synthetic controls, or spatial analysis, to control for these influences and isolate the specific impact of the event.
Prioritizing longitudinal and panel data: Cross-sectional data provide limited insight into changes over time. Evaluations should strive to collect longitudinal or panel data that track outcomes for individuals, businesses, or communities before, during, and after the event.

By applying these lessons learned, future evaluations can contribute to a stronger evidence base on the legacy and impact of major events, enabling a more nuanced understanding of their long-term effects and informing strategies for maximizing their benefits. Investing in robust evaluation methodologies and prioritising long-term data collection will be crucial for unlocking the full transformative potential of these significant events.

Question 8

7. Appendices

Accepted Answer

7.1 Review of previous evaluation for Case Study 1: London 2012 Olympic and Paralympic Games Post-Games Evaluation

This meta-evaluation of the impacts and legacy of the London 2012 Olympic and Paralympic Games was commissioned by the Department for Culture, Media & Sport (DCMS) and conducted by a consortium led by Grant Thornton. The evaluation aimed to assess the additionality, outputs, results, impacts, and benefits of the investment in the London 2012 Games, focusing on the evidence available one year after the event.

7.1.1 Outcomes

The evaluation explored the following outcomes:

Economic Impact: Boost to the UK economy, generating significant Gross Value Added (GVA) and creating employment opportunities.
Sport Participation: Increase in adult participation in sports and physical activities.
Youth Engagement: Inspiration and engagement of children and young people in sports, cultural activities, and volunteering.
Elite Sport Performance: Enhanced performance of UK athletes in elite sports events.
Volunteering: Surge in volunteering opportunities and increased enthusiasm for volunteering.
Disability and Inclusion: Positive impact on attitudes towards disability and increased opportunities for disabled individuals.
Community Engagement: Engagement of communities across the UK through various initiatives.
Urban Regeneration: Physical transformation and revitalization of East London.
Tourism: Boost to the UK tourism industry during and after the London 2012 Games.
Sustainability: Setting new standards for sustainability in mega-events.

7.1.2 Methods used and limitations

The report acknowledges many of the challenges faced when measuring legacy impacts. First, given the long-term nature of legacy development, many anticipated benefits were still to materialize, and some impacts were yet to be fully understood one year after the London 2012 Games. Secondly, isolating the specific impact of the London 2012 Games from other factors influencing the measured outcomes, such as the economic downturn or pre-existing policy initiatives, proved difficult. Lastly, the breadth and complexity of the London 2012 Games and their related programs presented a significant logistical and analytical undertaking.

7.1.3 Timing

The evaluation was conducted one year after the London 2012 Games, representing an early assessment of the emerging legacy. This timing allowed for the capture of immediate impacts and early signs of legacy development, but it also meant that long-term outcomes and the sustainability of early gains could not yet be fully determined.

7.1.4 Findings

The evaluation identified several key findings across different themes:

Economic Impact: The London 2012 Games provided a substantial boost to the UK economy, generating an estimated £28 billion to £41 billion in Gross Value Added (GVA) and creating numerous employment opportunities. The construction of the Olympic Park and related infrastructure projects stimulated economic activity, particularly in the construction sector. The London 2012 Games also positively impacted the tourism industry, attracting overseas visitors and boosting domestic spending.

Sport Participation: The London 2012 Games contributed to an increase in adult participation in sport and physical activity. Pre-existing and London 2012 Games-related participation programs, combined with the inspirational effect of the event, motivated people to engage in sports. Investment in sports facilities and infrastructure also facilitated increased participation opportunities.
Youth Engagement: The London 2012 Games inspired a generation of children and young people through participation in sports programs, cultural activities, and volunteering opportunities. School-based programs, such as the School Games and Change4Life Sports Clubs, engaged thousands of young people in sports and physical activity. The Cultural Olympiad provided numerous opportunities for creative expression and cultural engagement, with a significant portion of projects targeting children and young people.
Elite Sport Performance: The London 2012 Games acted as a catalyst for improved elite sporting performance in the UK. Additional funding allocated as a result of hosting the London 2012 Games allowed for investment in talent identification and development, coaching, international competition, and scientific research. Team GB and Paralympics GB achieved significant medal success, exceeding their targets.
Volunteering: The London 2012 Games led to a surge in volunteering opportunities and increased enthusiasm for volunteering. The Games Maker program alone created 70,000 volunteer positions, attracting a diverse range of participants. The positive experience of volunteering at the London 2012 Games motivated many to continue volunteering in other contexts.
Disability and Inclusion: The London 2012 Games had a positive impact on attitudes towards disability and provided new opportunities for disabled people to participate in society. The unprecedented levels of media coverage of the Paralympic Games by Channel 4 challenged perceptions of disability and promoted inclusion. The London 2012 Games also led to improvements in accessibility and increased participation of disabled people in volunteering, cultural activities, and sports.
Community Engagement: The London 2012 Games engaged communities across the UK through a variety of initiatives, including the Olympic Torch Relay, the Cultural Olympiad, and Inspire projects. These programs brought the London 2012 Games to local communities, creating opportunities for participation, celebration, and shared experiences.
Urban Regeneration: The London 2012 Games accelerated the physical transformation of East London, particularly the Olympic Park and surrounding areas. The creation of new sports venues, housing, community facilities, and green spaces helped revitalize the area. The London 2012 Games also led to improvements in public transport infrastructure, benefiting East London and wider London.
Tourism: The London 2012 Games provided a boost to the UK tourism industry, attracting overseas visitors and generating significant spending. While overall visitor numbers were slightly down during the London 2012 Games period due to displacement of regular visitors, the higher spending by London 2012 Games visitors resulted in a net gain for the tourism sector.
Sustainability: The London 2012 Games set new standards for sustainability in mega-events. The integration of sustainability principles into the planning, design, and delivery of the London 2012 Games led to significant achievements in areas such as waste management, carbon reduction, and sustainable sourcing of materials. These practices have the potential to influence future events and construction projects.

7.2 Review of previous evaluations for Case Study 2: UK Cities of Culture

The evaluations of previous UK Cities of Culture have been designed for a diverse audience of stakeholders, each with distinct interests in the programme’s outcomes and impacts. Foremost, an evaluation is a formal requirement of DCMS, so that the winning place can demonstrate whether the programme has achieved its objectives, such as driving economic growth, increasing cultural participation, fostering social cohesion, and creating lasting legacies. The evaluation is primarily of the activities and outputs of the nominated delivery organisation. At a local level, these evaluations help shape future policies, refine the programme’s design, and demonstrate value for money from public funding.

Host cities and local governments also rely on these evaluations to measure the success of their cultural strategies and delivery mechanisms. The insights gained are critical for long-term planning, securing future investment, and ensuring the sustained benefits of the title. Additionally, cultural organisations and creative industries use the findings to understand how the programme impacts their sector, guiding the development of future projects, partnerships, and infrastructure improvements.

Evaluations also serve local communities by providing transparency and accountability, highlighting how the programme has benefitted residents and fostered social and cultural value. For academia and research institutions, they contribute to the growing body of evidence on culture-led regeneration, enabling comparative studies and refining evaluation methodologies. Finally, future bidders for the title use the evaluations as a resource to learn from the experiences of previous hosts, improving their bidding strategies and planning for effective delivery and legacy.

7.2.1 Outcomes

UK CoCs try to capture and evaluate against a wide range of impacts and outcomes. Within the evaluation of Derry/Londonderry in 2013, at least 35 immediate outcomes and objectives were identified in the benefits realisation plan. Hull in 2017 had 27 immediate outcomes, and Coventry in 2021 had 15 outcomes. Based on previous evaluations, the following common outcomes have been measured:

Economic Impact:

Increase in additional investment
Increase in tourism
Increased jobs in the creative industries and cultural sectors

Sector Development/Stability:

Strengthened cultural infrastructure
Cultural sector capacity building

Health and Wellbeing:

Improved health and wellbeing scores

Social and Cultural Value:

Improved civic pride and social cohesion
Increase in participation

Environmental Sustainability:

Environmental awareness is increased through programming
Elimination of single use plastic across all events

7.2.2 Methods used and limitations

Evaluation methodologies for UK CoCs have evolved over time. The assessment of Derry/Londonderry’s tenure as UK CoC employed fewer techniques and methodologies compared with those utilised for Hull and Coventry. Table 2.2 offers a summary of the evaluation methods utilised by each place, including those which are detailed in the evaluation plan for Bradford 2025 (as of September 2025).

7.3 Review of previous evaluations for Case Study 3: Cricket World Cups

To the best of our knowledge, there have been no previous evaluations of the women’s CWC 2017 for which any reports or data are available in the public domain. The men’s CWC 2019 was subjected to an economic impact analysis commissioned by the ICC. This study estimated that the event generated around £350 million for the UK economy through the additional expenditure by event visitors and organisers as well as business to business supplier contracts and broader consumer spend. The economic impact report is not publicly available and therefore it is not possible to diagnose the robustness of the methods used or the veracity of the findings.

7.4 Review of previous evaluations for Case Study 4: Manchester International Festival

Because Manchester City Council is the principal funder of the festival, evaluation findings are submitted to it at the conclusion of each festival for scrutiny and approval by the city’s Cabinet. Early iterations of the festival were evaluated independently by the consultancy Morris Hargreaves McIntyre. Findings were then incorporated into council papers for approval. It is unclear from the publicly available data and information whether the later festivals (from 2017 onwards) have been evaluated independently or whether the evaluation has been done ‘in house’. Across the lifetime of the festival, consistent metrics have been measured and evaluated against, allowing changes over time to be easily identified and accounted for.

In late 2015, Manchester City Council began consultations on the city’s overarching strategy, Our Manchester, which launched formally in late 2016. Outcomes for MIF 2021 and beyond have been aligned to city outcomes to ensure a stronger relationship between the event and the overall strategic needs and outcomes sought for Manchester.

7.4.1 Outcomes

MIF is aligned to Our Manchester outcomes which seek for Manchester to be…

a thriving and sustainable city: supporting a diverse and distinctive economy that creates jobs and opportunities
a highly skilled city: world class and home-grown talent sustaining the city’s economic success
a progressive and equitable city: making a positive contribution by unlocking the potential of our communities
a liveable and low carbon city: a destination of choice to live, visit, work
a connected city: world class infrastructure and connectivity to drive growth

In addition, recent evaluations have also explored the following outcomes:

Grow the international reputation of the festival and city
Grow digital audiences, locally, nationally and internationally
To build and grow relationships with low engaged Greater Manchester audiences
To bring ‘the most extraordinary’ artists from around the world to Manchester
Connect in new and deeper ways with the city and region of Manchester
Increase the range and diversity of those engaging with the festival
To develop the brand, profile and awareness of MIF/The Factory (now Aviva Studios) locally, nationally and internationally
Achieve sustainability targets which align with UN Sustainable Development Goals

7.4.2 Methods used and limitations

Evaluations to date have made strong use of primary data in understanding the impact of MIF. Each evaluation looks into the specific festival in question and there is limited examination of collective longitudinal effects. Economic impact data are calculated by using comprehensive ticketing data, audience surveying and MIF accounts. The use of national datasets in the evaluation is not evident from what is publicly available, which highlights the value added of the methodological approach to this case study.

At present, the evaluations of MIF do not reach Level 3 on the Maryland Scientific Methods Scale or Level 3 on the NESTA Standards of Evidence. The reason for this is the lack of counterfactuals or an evident control group.

7.4.3 Timing

Evaluation findings are submitted to Manchester City Council following the festival, with the reporting and review process consistently occurring no sooner than three months after the festival’s conclusion. Historically, this reporting window has ranged from three to seven months across all iterations of the festival. As the principal funder and financial underwriter of the festival, Manchester City Council scrutinises the evaluation findings, with the involvement of the Council’s Cabinet, to ensure fiscal responsibility, confirm that KPIs have been met and provide a settlement for future work. As an example of this scrutiny process, to date, the festival has successfully broken even in every iteration except the inaugural festival in 2007, which incurred a £0.2 million loss. On that occasion, Manchester City Council covered the deficit but recovered the funds in 2009 when the festival generated a £0.2 million surplus.

7.4.4 Findings

From the most recent evaluation of MIF 2023:

A thriving and sustainable city: supporting a diverse and distinctive economy that creates jobs and opportunities: As reported to Manchester City Council, MIF 2023 was a flagship event in an important year for culture in Manchester and contributed substantially to the ongoing cultural recovery of the city. The evaluation of MIF 2023 stated that the festival had generated £39.2 million of economic activity in Manchester, compared to an economic impact of £19.5 million in 2021. [footnote 33] Metrics included in the calculation of economic impact include visitor spend, audience numbers, volunteer hours, and engagement of local artists; however, the full methodology is not publicly available. Over the next decade, Aviva Studios is projected to contribute £1.1 billion to the city’s economy and create or support 1,500 direct and indirect jobs.

A highly skilled city: world class and home-grown talent sustaining the city’s economic success: Factory International and MIF 2023 continue to boost employment and volunteering opportunities, with 218 staff members (or 210 Full-Time Equivalents) and 488 volunteers. The Factory Academy continues to work with industry partners to provide pathways to training and employment for Manchester residents.
A progressive and equitable city: making a positive contribution by unlocking the potential of our communities: Factory International’s creative learning team, through its Neighbourhood Organisers and Community Partnership programmes, continues to raise awareness in communities and develop pathways for residents to engage with the festival offer. 1,164 children and 25 schools were involved in creative opportunities during MIF23, as well as 157 adults and 58 children across 16 sessions as part of the Community Engagement programme. 20% of audiences across MIF and the Aviva Studios opening programme were Black, Asian, or Ethnically Diverse. The company’s permanent staff team includes 28% of colleagues that are Black, Asian or Ethnically Diverse, 16% are disabled and 53% are female. Amongst board members, 47% are Black, Asian or Ethnically Diverse, 11% are disabled, and 44% are female.

A liveable and low carbon city: a destination of choice to live, visit, work: MIF 2023 drew 325,300 visitors, including 83,000 visitors to the new Festival Square. It is a significant part of the city’s cultural offer with a reach into communities that is both broad and deep, through its creative learning programme, including schools’ outreach. In addition, 3,500 people tuned into MIF 2023 to watch live content from 36 countries during the festival, with 700,000 visits to the Factory International website, which represents an increase of 135% on MIF 2021. 19,000 people read and watched content on Factory+, Factory International’s digital content strand. The evaluation acknowledges the festival’s commitment to sustainability and meeting environmental goals set by Manchester City Council.

A connected city: world class infrastructure and connectivity to drive growth: At MIF 2023, 24 countries and 55 cities from across the globe were represented by 145 guests and 101 organisations. Through the “International Weekend”, curators, presenters, artistic and executive directors, programmers and producers attended MIF 2023, with almost 150 of these international key players from a range of art forms experiencing Manchester and Manchester-based productions while creating and strengthening industry connections. A reception by the Lord Mayor was held two days before the festival opening for delegates of the 2023 International Society for the Performing Arts mid-year congress in Manchester.

In addition, research into the first year of operation of the Aviva Studios found:

703,735 visitors to Aviva Studios in the first year of operation. Based on available data, there have not been any unintended displacement effects as venues in the GMCA area have not reported a reduction in visitor numbers.
In excess of 21,000 Aviva £10 tickets have been issued since launching the affordable ticket scheme, helping to ensure that the programme is accessible to all.
1,000 people from Greater Manchester have been trained to take up jobs in the creative industries, having graduated from Factory Academy’s free courses, with 100 alumni going on to paid roles at Factory International and many more taking up employment at organisations such as Studio Lambert, Rochdale Development Agency, Science and Industry Museum, Rio Ferdinand Foundation, Manchester Youth Zone, Odd Arts and more.
102 artists from the North of England have benefitted from development opportunities including Factory Fellows, Artist Takeovers and Factory Sounds, networking opportunities and more.
Factory International has worked with over 25,000 children and young people through family-friendly events and activities as part of the public programme, as well as through relationships with early years centres, schools, colleges, the city’s universities, youth zones and community hubs.
549 volunteers from across the region helped make Factory International’s opening year a success, getting involved in everything from supporting shows behind the scenes, to being the face of the venue.

The low SMS level in the evaluations means that the above is at best contextual information. One of the benefits of MIF’s approach to evaluation is the consistency over time in their reporting. However, the evaluation has a focus on the delivery of the festival from the perspective of the delivery body. Therefore, use of national data and counterfactuals is required for a more robust understanding of legacy.

It can be argued that the evaluation process for the UK CoCs held thus far has followed a progressive trajectory, with each subsequent city learning from the evaluations of its predecessors. This iterative approach allows for the refinement and improvement of evaluation methodologies, drawing on the experiences and insights gained from previous cities. For example, Hull’s evaluation likely incorporated lessons learned from Derry/Londonderry’s experience as the inaugural titleholder. Similarly, Coventry would have benefitted from Hull’s evaluation findings. Now, as Bradford prepares for the start of their year, knowledge exchange sessions have been held between the evaluation teams from Coventry and Bradford to pass on methodological challenges and learnings with the aim of progressing learning in this space.

With regard to assessing evaluations undertaken thus far against standards of evidence, evaluations of UK CoCs typically fall at Level 3 or below on the NESTA Standards of Evidence and do not reach Level 3 on the Maryland Scientific Methods Scale. While certain elements, such as externally commissioned economic impact assessments following HM Treasury’s Green Book guidance, demonstrate robustness and rigour, other components vary in their level of thoroughness.

Department for Digital, Culture, Media & Sport, 2012. A legacy from the London 2012 Olympic and Paralympic Games. [pdf] Available at: https://assets.publishing.service.gov.uk/media/5a74f325e5274a3cb28687cc/201210_Legacy_Publication.pdf ↩
UK Government, 2013. London 2012 Meta-Evaluation. [online] Available at: https://www.gov.uk/government/collections/london-2012-meta-evaluation ↩
UK Government, 2013. London 2012 Meta-Evaluation. [online] Available at: https://www.gov.uk/government/collections/london-2012-meta-evaluation ↩
Department for Digital, Culture, Media & Sport, 2013. Meta-Evaluation of the Impacts and Legacy of the London 2012 Olympic Games and Paralympic Games: Report 1. [pdf] Available at: https://assets.publishing.service.gov.uk/media/5a74f121ed915d502d6cc3c5/DCMS_2012_Games_Meta_evaluation_Report_1.pdf See for examples Figure 5.2. ↩
Whilst this analysis sought to explore the localised impacts of the Games, it could be adapted to explore potential displacement effects, for example by running the analysis on LSOAs within the exclusion zone to see if they experience negative impacts relative to the comparison units. Positive coefficients would indicate spatial spillovers, whereas a negative coefficient would suggest displacement effects. ↩
Using SIC codes defined in ONS (2010) Measuring Tourism Locally ↩
Specifically, two-way fixed effects, incorporating both area (or individual depending on the unit of the analysis) and time fixed effects. ↩
Austin, P.C (2009) Balance diagnostics for comparing the distribution of baseline covariates between treatment groups in propensity-score matched samples, Statistics in Medicine, 28, 3083-3107. ↩
This is referred to as the “Fundamental problem of causal inference” – the idea that the counterfactual is not observable because it is a state that did not happen. ↩
Dolan, P., Kavetsos, G., Frey, B. S., and Lador, F., 2016. The Host with the Most? The Effects of the Olympic Games on Happiness. [pdf] Available at: https://cep.lse.ac.uk/pubs/download/dp1441.pdf ↩
ONS SRS data is incredibly granular, in most instances providing postcode or output area boundaries. These granular geographies can be used to precisely map the newly defined boundaries. ↩
For example aggregate Business Register and Employment Survey data is available at the LSOA level through the Nomis Data Platform ↩
Office for National Statistics, 2007. UK Standard Industrial Classification of Economic Activities (UK SIC) Archive. [online] Available at: https://www.ons.gov.uk/methodology/classificationsandstandards/ukstandardindustrialclassificationofeconomicactivities/uksicarchive ↩
At the time of writing, the ONS were quoting a 10-week application review period. Timings are subject to vary given ONS capacity, volumes of applicants, and complexity of individual projects. ↩
For example, the Business Structure Database cannot be used within the same project as the Annual Population Survey, as this was deemed to present a high risk of disclosure. This may limit the outcomes that could feasibly be explored. ↩
ICC = International Cricket Council ↩
England and Wales Cricket Board, 2018. ECB and ICC Cricket World Cup aim to take cricket to one million children. [online] Available at: https://www.ecb.co.uk/news/861753/ecb-and-icc-cricket-world-cup-aim-to-take-cricket-to-one-million-children ↩
Specifically, two-way fixed effects, incorporating both area (or club depending on the unit of the analysis) and time fixed effects ↩
Castellanos-García, P., Kokolakakis, T., Shibli, S., Downward, P. and Bingham, J. (2021) Membership of English sport clubs: A dynamic panel data analysis of the trickle-down effect, International Journal of Sports Policy and Politics, 13:1, 105-122, DOI: 10.1080/19406940.2021.1877170 ↩
UK Sport / DCMS (2023) Gold Framework: Guidance on UK-level support available when bidding for and staging major sporting events, UK Sport / DCMS, London UK. ↩
The Evaluation Strategy for Bradford 2025 was published in late 2024 and can be accessed here - https://bradford2025.co.uk/wp-content/uploads/2024/11/Bradford-2025-Evaluation-Strategy.pdf [accessed 24/03/2025] ↩
Commissioned by the DCMS, Warwick Business School undertook an evidence review of the UK CoC Programme. The report was submitted to the DCMS in July 2024 and published on 24th April 2025. ↩
Stephen Roper, Evaluating the local business growth effects of the UK City of Culture 2013 and 2017: A simple propensity score matching-difference-in-difference modelling approach (Coventry: Enterprise Research Centre, University of Warwick, 2024), available at: https://www.enterpriseresearch.ac.uk/wp-content/uploads/2024/02/ERC-Insight-Evaluating-the-local-business-growth-effects-of-the-UK-City-of-Culture-2013-and-2017-Roper.pdf [accessed 20/11/2024]. ↩
See, https://houlton.co.uk/projects/125/ferens-art-gallery-refurbishment [accessed 16 January 2025] ↩
The statistical power of this DiD analysis is reduced by the relatively small number of observations (n=36) and the limited number of treatment and control areas, which may constrain the ability to detect subtle effects with high confidence. ↩
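The DiD formula utilised here is: $\text{DiD Effect} = (\text{Treatment}_{\text{After}} - \text{Treatment}_{\text{Before}}) - (\text{Control}_{\text{After}} - \text{Control}_{\text{Before}})$ ↩
DiD formula utilised: $\Delta Y = (Y_{\text{post, treatment}} - Y_{\text{pre, treatment}}) - (Y_{\text{post, control}} - Y_{\text{pre, control}})$, where $Y$ is the life satisfaction score. ↩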
Data for Dundee City and Swansea (the other cities shortlisted for the UK CoC 2017 title) is available, however due to varying collection methods in Scotland and Wales it has been excluded for robustness. ↩
See, Corcoran, R., 2024. Community and wellbeing evaluation of a unique international cultural event: Liverpool’s hosting of Eurovision 2023 for Ukraine, Liverpool: University of Liverpool/What Works Centre for Wellbeing. ↩
The analysis here makes use of a two-stage DiD approach that captures both the initial gain in cultural participation and its subsequent reversal. The total effect is calculated using the formula: ↩
Some cities do host annual or biennial festivals; however, their scale is not comparable to MIF, and these festivals typically focus on only one art form rather than the multi art form setup of MIF. ↩
Despite the impact of the COVID-19 pandemic, 2021 is an appropriate stable data point to work from and avoids contamination from other major events such as the 2022 Commonwealth Games. ↩
Manchester City Council, 2023. Report spells out economic and wider benefits of last year’s Manchester International Festival and successful opening season for Aviva Studios. [online] Available at: https://www.manchester.gov.uk/news/article/9422/report_spells_out_economic_and_wider_benefits_of_last_years_manchester_international_festival_and_successful_opening_season_for_aviva_studios ↩
