Innovation Clusters Map: summary and methods
Published 1 October 2025

Executive summary
The Innovation Clusters Map is a tool developed by the Department for Science, Innovation and Technology (DSIT) to support the UK’s ambition to foster regional innovation and drive economic growth. It identifies and visualises geographically concentrated clusters of businesses, research institutions, and infrastructure that are actively engaged in research, development, and innovation (RD&I). These clusters are critical because they generate agglomeration effects, where proximity between firms and institutions leads to knowledge spillovers, increased productivity, and enhanced resilience to economic shocks.
By mapping these clusters, the tool provides:
- policymakers
- investors
- local stakeholders
with a clearer understanding of where:
- innovation is happening
- sectors are thriving
- targeted support could unlock further growth
Clusters are created by using data to identify areas where businesses within economic sectors are concentrated geographically. This data-driven method allows clusters to emerge organically based on where businesses are located, rather than being constrained by administrative boundaries. This approach ensures that clusters reflect real-world economic geography and the potential for agglomeration benefits.
Cluster creation uses consistent data sources, such as the Inter-Departmental Business Register (IDBR), and the best available sector definitions. This means that the map offers a robust and comparable view across sectors, and ensures the map highlights genuine concentrations of RD&I activity, enabling more targeted policy and investment decisions. Because these data and methods are innovative and experimental, there are some limitations to the evidence provided in the map, but this evidence is nonetheless crucial to understanding the UK’s innovation ecosystem.
Introduction
The Innovation Clusters Map identifies and describes the UK’s research, development and innovation (RD&I) clusters. The map can be used to explore the geography, distribution and strengths of innovation clusters across the UK, offering a detailed view of local innovation ecosystems. It allows users to focus on key sectors, access data on economic and innovation activity, and understand regional innovation strengths.
Innovative clusters represent geographically concentrated networks of companies, research institutions, specialised skills and supporting infrastructure within related sectors. These networks create knowledge spillovers and agglomeration effects that enhance local and regional innovation capacity and productivity levels. Enhancing local innovation ecosystems is essential for driving economic growth and addressing spatial disparities in development. Innovation clusters also bolster economic resilience and accelerate recovery following economic shocks.
When a region achieves critical mass in specialised expertise with strong innovation potential, this is a positive signal to competitive markets, attracting skilled workers, international investment, and government resources. As a result, clusters can initiate and sustain positive feedback loops of economic growth, delivering significant benefits both locally and across the broader economy. Recognising this, the UK government has made realising the potential of city regions and clusters a core objective of its new Industrial Strategy.
Version 1 of the Innovation Clusters Map, released in February 2024, was developed by a consortium including Cambridge Econometrics, The Data City, The Innovation and Research Caucus and the Department for Science, Innovation and Technology (DSIT). The new tool has been developed internally by DSIT. Key changes include updates to the data used, methodological improvements, and a rebuilt application with a focus on user experience.
Summary of methods
The Innovation Clusters Map uses the best available data to identify geographically dense clusters of firms in the same industry, focussing on clusters with evidence of research, development and innovation activity.
Clusters have been generated based on data on the businesses in each sector and the locations in which these businesses operate.
The standard way of classifying businesses into sectors is to use Standard Industrial Classification 2007 (SIC), and we use this where possible. However, SIC does not accurately reflect emerging sectors - sectors which are in the early stages of development driven by innovation or emerging technologies. Where it is not appropriate to classify businesses into emerging sectors using SIC, alternative methods have been used – these methods are described in more detail in the section ‘Defining sectors and identifying the businesses operating in each sector’. Sectors based on The Data City’s Real-Time Industrial Classification are also included in the tool, as are Innovate UK clusters not tied to a specific sector.
The locations in which businesses operate, alongside associated turnover and employment data, are captured using the Inter-Departmental Business Register (IDBR). See the section ‘Business location and financial information’ for more information on how the IDBR is used.
Clusters have been produced separately for every sector within scope. Sectors have been included if they:
- are an Industrial Strategy growth-driving sector or frontier industry
- are an emerging sector of interest to DSIT
- have been highlighted by users of version 1 of the tool as high priority
- have been created as a Real-Time Industrial Classification by The Data City
Clusters are produced for each sector by detecting geographic areas with a relatively high density of business locations using the HDBSCAN clustering algorithm. This results in clusters which reflect the real spatial distribution of businesses and avoids distortions due to administrative boundaries.
Metrics are produced for all clusters summarising cluster RD&I activity, economic activity, and size and importance of the cluster. RD&I active clusters are identified based on a variety of data sources, including government research funding data, research and development survey data, and The Data City’s innovation scores.
This results in the creation of RD&I clusters of concentrated activity where businesses are likely to benefit from agglomeration effects, alongside rich metrics to describe these clusters.
Methodological caveats
The map enables users to explore clusters which are available for a wide variety of sectors and across the whole UK. It utilises the best available data sources to identify emerging sector businesses alongside a single government-assured data source for location, turnover and employment data. Clusters have been created using an innovative, data-driven method, meaning that they reflect the real spatial distribution of businesses without being constrained to administrative boundaries, and are provided alongside a rich set of cluster metrics.
However, because the map is based on experimental, innovative methods, there are several important caveats and limitations users should be aware of. The key caveats are related to consistency with other sources, cluster formation, and sector measurement.
Consistency with other sources: As far as possible consistent methods and data have been used across all sectors. This has the important advantage that a range of sectors and clusters can be compared and included on a single tool. However, the data and methods used may vary from the way some sectors are measured in other publications, which will mean clusters and metrics are not fully consistent with some other sources of data. For sector-specific details, see Annex A.
Cluster size and formation: Clusters capture the geographic area across which the relative density of business locations is higher. This means that clusters covering a larger area are not necessarily more important economically. In addition, clusters are created based on the density of business locations and not based on the amount of economic activity. This means that clusters with small numbers of large, important business locations may not be captured. For more detail, see the section ‘How clusters are created’.
Sector measurement: The businesses which are active in each sector are captured using the best available data. However, in some sectors, it is common for a business to operate across multiple other sectors, and only a proportion of a business’ activity may be in the sector of interest. It is not possible to account for this consistently across all sectors. The primary result of this is that turnover and employment may be over-estimated. For more detail, see the section ‘Business data used to create clusters’.
Business data used to create clusters
This section sets out the data used to:
- identify which businesses are operating in which economic sectors
- identify sites and financial information for these businesses
For a full list of data sources used including reference dates, see Annex B.
Defining sectors and identifying the businesses operating in each sector
The Innovation Clusters Map includes a wide range of economic sectors. To produce clusters for each sector it is necessary to identify the businesses engaged in the sector. The best way to do this varies between sectors. This section outlines the data used to identify businesses operating in different types of sector. For more detail on the specific data sources used for each sector, see Annex C.
Sectors are grouped in the sidebar of the map into 5 sector types:
- Industrial Strategy Sectors
- SIC sectors
- DSIT emerging sectors
- Real-Time Industrial Classification sectors
- Innovate UK sector
These sector types are aligned to the method used to define the sector.
SIC sectors
The standard way of classifying businesses into sectors is to use Standard Industrial Classification 2007 (SIC). SIC has a number of important advantages, including robust quality assurance in key government datasets and comparability with other sources, and as such it is our preferred source where it is suitable to use. In addition to sectors included in the SIC sectors sector type, the Digital Economy emerging sector is also based on SIC.
Industrial Strategy sectors
The Industrial Strategy sets out 8 priority sectors as central to driving economic growth, innovation, and resilience. Six of these sectors are included on the map - the Defence and Clean Energy Industrial Strategy sectors are not currently included as relevant data are not yet available. The Department for Business and Trade (DBT) have separately published analysis on priority Industrial Strategy city regions and clusters outlining how places were prioritised for the Industrial Strategy. See Annex A for more detail.
The SIC-derived definition of Digital and Technologies is only a proxy and does not fully capture the detail and fast-changing nature of the sector, and the extent of diversified companies in the sector. The Biosciences and Health Technology Sector Statistics (BaHTSS) publication is the best source for statistics on the Life Sciences sector. Methods and data sources differ between the Innovation Clusters Map and the BaHTSS publication - in particular, the map uses lists of businesses within sectors used by the 2021-2022 BaHTSS publication and not the more recent 2023-2024 publication.
Industrial Strategy sectors are based on SIC (with the exception of Life Sciences), using definitions set out in the Department for Business and Trade’s industrial strategy sector definitions list. Life Sciences is defined using a sector list, as outlined below – for more detail, see the Bioscience and Health Technology Sector statistics user guide (https://www.gov.uk/government/statistics/bioscience-and-health-technology-sector-statistics-2023-to-2024/bioscience-and-health-technology-sector-statistics-2023-to-2024-user-guide).
DSIT Emerging sectors
DSIT emerging sectors are sectors of interest to DSIT which are in the early stages of development, driven by innovation or emerging technologies. Because SIC was last revised in 2007, SIC does not capture most DSIT emerging sectors sufficiently well. This problem is exacerbated by the fact that businesses will typically only be assigned to a single SIC, meaning that emerging sectors overlapping with other sectors may not be well-represented even if an appropriate SIC code exists.
Because of this, the businesses in DSIT emerging sectors (with the exception of Digital Economy – see above) are identified using a variety of data sources and industry engagement, often including applying classification methods on free-text data. This process is usually carried out in partnership with commercial providers – for a list of partners involved in the creation of sector lists for different sectors, see Annex C. This results in the creation of a ‘sector list’ - a list of businesses engaged in the sector. These lists have been validated by sector teams within DSIT and have gone through rigorous DSIT quality assurance and review. The Life Sciences Industrial Strategy sector has also been defined in this fashion.
Real-Time Industrial Classification sectors
Sectors in this category have been defined based on The Data City’s Real-Time Industrial Classifications (RTICs). These RTICs identify the businesses operating in a range of sectors based primarily on information scraped from company websites and expert insight.
RTIC sectors have been produced and approved by The Data City but have not been through DSIT quality assurance. These sectors are included to provide information across a wider range of sectors.
Where an RTIC has been used to create a DSIT emerging sector, it has been included under DSIT emerging sectors rather than the Real-Time Industrial Classifications, reflecting that the list has been created collaboratively with DSIT analysis teams and undergone DSIT validation.
Copyright in the Real Time Industrial Classifications data belongs to Data City and/or its licensors (2025)
Innovate UK clusters
Industrial Strategy, SIC, emerging sector and RTIC clusters are all produced for specific sectors – that is, clusters are produced for each sector separately. A broader set of clusters is also produced based on all businesses who have engaged with Innovate UK between 2015 and 2025. These clusters are referred to as ‘Innovate UK’ clusters and they are included in a separate category in the map. These clusters are designed to identify areas of higher R&D activity without pre-filtering to a specific sector.
Quality considerations with sector definitions
A key challenge in sector measurement is that, when a business is assigned to a sector, it is not necessarily the case that all the business’ activity is attributable to that sector. This cannot be accounted for consistently based on the data available, and therefore the size of some sectors is overestimated. For example, if a business is assigned to the Quantum emerging sector because it appears on the Quantum sector list, it is assumed that all of that business’ activity is attributable to the Quantum sector, which will sometimes result in overestimation. This issue affects all sectors, but it is more acute for DSIT emerging sectors and RTIC-based sectors due to the cross-cutting nature of some emerging sectors. Steps have been taken to limit the impact of this issue in the AI sector specifically by utilising additional information – see Annex A.
It is important to note that some sectors overlap with each other, and some businesses will therefore be in multiple sectors. For example, a business may be in both the Advanced Materials and Advanced Manufacturing sectors. For this reason, estimates should not be summed across different sectors.
Business location and financial information
Accurately capturing the locations at which businesses are operating is crucial to avoid distortions in the clusters. A common pitfall in geographic analysis based on business data is the headquartering effect, which is caused by an organisation’s registered or administrative headquarters being used as the location for attributing activity, rather than the actual site where the work takes place. To avoid the headquartering effect, information from the March 2025 Inter Departmental Business Register is used.
The IDBR is a comprehensive list of businesses stored for government statistical purposes. Crucially, the IDBR captures all locations each business operates at and associated financial information. A business location is referred to as a ‘site’ - a single business may have multiple sites if it operates at many different locations.
The IDBR stores information on the employment and SIC of each business site. This means that, where sectors are based on SIC, the IDBR can be used to identify all relevant business sites and employment information for those sites. IDBR turnover is only available for the entire business rather than the site and has therefore been apportioned to the site level using a ratio based on site employment. This is an improvement on the approach used in version 1, where business turnover was split equally between sites.
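The employment-based apportionment described above can be illustrated with a minimal sketch. The function name, the site identifiers and the equal-split fallback for enterprises with no recorded site employment are assumptions made for illustration only and are not taken from the map's production code.

```python
def apportion_turnover(enterprise_turnover, site_employment):
    """Split enterprise-level turnover across sites in proportion to site employment.

    site_employment maps a (hypothetical) site identifier to its employment count.
    """
    if not site_employment:
        return {}
    total_employment = sum(site_employment.values())
    if total_employment == 0:
        # Assumption: fall back to an equal split when no employment is recorded.
        equal_share = enterprise_turnover / len(site_employment)
        return {site: equal_share for site in site_employment}
    return {
        site: enterprise_turnover * employment / total_employment
        for site, employment in site_employment.items()
    }


# Illustrative enterprise with three sites and a turnover of 1,000,000
site_shares = apportion_turnover(1_000_000, {"site_a": 60, "site_b": 30, "site_c": 10})
```

Under the version 1 approach each of the three sites would have been given one third of the turnover; here the site with 60% of the employment is given 60% of the turnover.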
When businesses are assigned to a sector using an alternative method to SIC codes, linking to the IDBR becomes more nuanced. Businesses in sector lists and Innovate UK data are typically identified by their ‘Company Reference Numbers’ (CRNs), which are more detailed than the IDBR’s primary business identifier of an ‘enterprise’. It is therefore assumed that IDBR enterprises’ business activity contributes to an emerging sector in proportion to the turnover and employment of the constituent Company Reference Numbers in that sector. The IDBR sources and structure information document contains more details on how the IDBR defines enterprises and stores information about businesses. All business sites in the linked enterprise are treated as being part of the sector. This may result in some sites being included in a sector despite the relevant activity not taking place at the site. This potential over-coverage could be addressed in future research.
Integrating sector list data with the IDBR means that IDBR local business site information can be used for businesses in sector lists, and the address associated with the CRN does not need to be used. This avoids biases such as the headquartering effect mentioned above.
IDBR turnover and employment data is used to calculate cluster employment in this fashion for all sectors except the AI sector (for more details on the AI sector, see Annex A). Turnover and employment information is aggregated to the cluster level to produce cluster estimates. IDBR disclosure rules mean that turnover and/or employment information for some clusters is suppressed.
This approach to identifying business sites and financial information differs from the approach used in version 1 of the map. The version 1 map used business site and financial information provided by The Data City for emerging sectors, without use of the IDBR. This new version of the map uses the IDBR. This is because IDBR turnover and employment estimates are more accurate, particularly for businesses operating internationally. The IDBR also includes employment information at the site level, enabling better reflection of the real location of business activity. However, as IDBR site information is based primarily on a survey, it is not updated as frequently as The Data City, and the new approach may therefore include less up-to-date site information.
How clusters are created
Cluster creation algorithm
The data described in the section ‘Business data used to create clusters’ provide the location of business sites for each sector. This location information is available as the postcode of each business site, which is transformed to a latitude and longitude using the location of the geographic centre of the postcode.
This data is used to form sector-specific clusters using the HDBSCAN clustering algorithm. This is the same high-level approach as version 1 innovation clusters, but improvements have been made to parameter tuning and the treatment of Northern Ireland, as outlined below. HDBSCAN identifies clusters as spatial areas where, within the sector, the density of business sites is high relative to the surrounding area.
An important advantage of HDBSCAN is that not all sites will be assigned to a cluster – only areas with a relatively high density of firms will be identified as clusters. In addition, as the algorithm does not take administrative boundaries into account, clusters are not aligned to local authority or other boundaries, which helps ensure that boundaries do not have a distorting effect. For example, cluster 30 in the Space Economy sector overlaps the Sheffield, Rotherham, and North East Derbyshire local authority boundaries. Local authority-based estimates may therefore understate the size of the cluster and would not represent the spatial distribution of business activity as well as a HDBSCAN-based clustering approach.
Northern Ireland’s geographic separation often causes HDBSCAN to identify the entirety of Northern Ireland as a cluster when analysed alongside the rest of the UK. To resolve this, HDBSCAN was run separately for Northern Ireland and for the rest of the UK, with hyperparameters selected separately for Northern Ireland.
The size of the resulting clusters reflects the size of the geographic area over which the relative density of business sites is higher. This means that bigger clusters are not necessarily more important. To address this, concentration metrics have been included for clusters to indicate:
- the proportion of sites in a cluster for a sector, relative to the total sites in that sector
- the proportion of sites in a cluster for a sector, relative to the total sites that fall within all clusters in that sector (as opposed to also including sites that are not indicated to be part of clusters by HDBSCAN)
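The two concentration metrics listed above differ only in their denominator. The sketch below shows the calculation for a single sector; the function name, dictionary keys and site counts are illustrative assumptions.

```python
def concentration_metrics(cluster_site_counts, total_sector_sites):
    """Return both concentration metrics for every cluster in one sector.

    cluster_site_counts maps a cluster identifier to its number of sites.
    total_sector_sites counts all sites in the sector, including sites that
    HDBSCAN did not assign to any cluster.
    """
    clustered_sites = sum(cluster_site_counts.values())
    return {
        cluster_id: {
            # Metric 1: share of all sites in the sector
            "share_of_sector_sites": n_sites / total_sector_sites,
            # Metric 2: share of only those sites that fall within a cluster
            "share_of_clustered_sites": n_sites / clustered_sites,
        }
        for cluster_id, n_sites in cluster_site_counts.items()
    }


# Illustrative sector: 500 sites in total, 200 of which fall within two clusters
metrics = concentration_metrics({"cluster_1": 120, "cluster_2": 80}, total_sector_sites=500)
```

In this illustration cluster_1 holds 24% of the sector's sites but 60% of its clustered sites, which shows why the two metrics can give quite different impressions of a cluster's importance.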
Turnover and employment metrics, as discussed in the section ‘Business location and financial information’, also help to establish the economic importance of the cluster.
Within sectors, in some edge cases, clusters do overlap. This is because of the buffer applied to each cluster shape boundary to smooth the edges, which slightly expands the cluster area, and not because there are the same business sites in more than one cluster within a sector.
This approach involves treating each sector separately. This has some drawbacks - in reality clusters will cut across related sectors, including up and down supply chains. It is anticipated that this will be addressed in future work.
Parameter tuning
HDBSCAN utilises several hyperparameters, which are configuration settings for the algorithm, controlling the formation of clusters. For more details, see the HDBSCAN documentation.
The aim of the clustering exercise was to produce clusters which reflect genuine benefits from agglomeration for businesses within the clusters. This means avoiding clusters which are too geographically large, as 2 sites on either side of the cluster would not benefit from being close to each other. At the same time, clusters should not be split up such that firms in 2 separate clusters likely benefit from co-location. In addition, clusters should not be so small that they omit firms benefitting from being close to the cluster.
To achieve this, hyperparameters were manually selected and tuned for each sector with the aim of producing clusters which reflect agglomeration benefits. For Industrial Strategy and DSIT emerging sectors, hyperparameter tuning was also carried out based on feedback from sector experts within government. Expert feedback was not used for RTIC, SIC, and Innovate UK clusters.
The hyperparameter minimum cluster size controls how conservative the clustering is by setting the minimum number of neighbouring points required for a point to be considered a ‘core point’ by the algorithm. This influences both the robustness to noise and the minimum cluster density. Minimum cluster size was initially set proportionate to the number of sites in the sector and then, where required, tuned (adjusted) based on feedback on the resulting clusters. In all sectors, the minimum cluster size was set to at least 10 to ensure clusters with small numbers of sites were not generated.
The hyperparameter cluster selection epsilon sets a distance threshold that allows clusters within this epsilon range to be merged – it controls when nearby clusters should be combined into one larger cluster. This parameter is set to 0 by default but increased above 0 in sectors with large numbers of small, spatially close clusters. This resulted in important improvements compared to the Innovation Clusters Map version 1 for some sectors. For example, in version 1 there were 7 distinct FinTech clusters within the boundaries of Greater London. Version 2 has a single FinTech cluster covering the London area, as all businesses in this area are likely to benefit from co-location.
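As a rough sketch of how starting values for these two hyperparameters might be derived before manual tuning: the floor of 10 and the default epsilon of 0 come from the text above, whereas the proportion of sites (site_fraction) and the merge distance are hypothetical values chosen for illustration. The dictionary keys mirror the min_cluster_size and cluster_selection_epsilon arguments of common HDBSCAN implementations; this document does not state which implementation was used.

```python
MIN_CLUSTER_SIZE_FLOOR = 10  # stated in the text: at least 10 in every sector


def initial_hyperparameters(n_sites, site_fraction=0.01,
                            many_small_close_clusters=False, merge_distance=0.0):
    """Suggest starting hyperparameters for one sector, to be tuned afterwards.

    site_fraction and merge_distance are hypothetical illustration values,
    not the values used to build the map.
    """
    # Minimum cluster size scales with the sector, subject to the floor of 10.
    min_cluster_size = max(MIN_CLUSTER_SIZE_FLOOR, round(n_sites * site_fraction))
    # Epsilon stays at 0 unless the sector has many small, spatially close clusters.
    epsilon = merge_distance if many_small_close_clusters else 0.0
    return {
        "min_cluster_size": min_cluster_size,
        "cluster_selection_epsilon": epsilon,
    }


large_sector = initial_hyperparameters(5000)
small_sector = initial_hyperparameters(300)
fragmented_sector = initial_hyperparameters(
    5000, many_small_close_clusters=True, merge_distance=0.05
)
```

Values produced this way would only be a starting point: as described above, the final settings were tuned manually and, for some sector types, adjusted following expert feedback.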
Research, Development and Innovation
Identifying Research, Development and Innovation active clusters
Clusters have been identified as Research, Development and Innovation active (RD&I active) based on evidence of innovative activity from several sources. The map by default only shows RD&I active clusters, reflecting the focus on innovative activity, but the filters can also be adjusted to show non-RD&I active clusters.
RD&I active clusters are identified based on RD&I indicators for businesses within the cluster. A business is assumed to be RD&I active if it meets any of the following criteria:
- applied for an Innovate UK grant between April 2015 and July 2025
- received a non-Innovate UK UKRI grant between January 2006 and June 2024
- has a positive ‘Innovation Score’ from The Data City, as extracted in May 2025 - see the ‘Introducing our company innovation measures’ blog
- is conducting Research and Development as identified by the Business Enterprise Research and Development survey between 2018 and 2023
A cluster is treated as RD&I active if it meets either of the following conditions:
- Threshold 1 - more than 20% of the sites in the cluster are businesses defined as RD&I active above; this ensures that clusters where a notable proportion of firms appear RD&I active are included
- Threshold 2 - more than 100 enterprises in the cluster have an indication of RD&I activity in any data source; this ensures that clusters which may contain a large amount of RD&I activity are included even if the proportion of firms which conduct RD&I is low
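The business-level criteria and the two cluster-level thresholds can be expressed as a short sketch. The class and field names are hypothetical; the 20% and 100-enterprise thresholds are those stated above, and both are applied as strict "more than" comparisons.

```python
from dataclasses import dataclass


@dataclass
class Business:
    # Hypothetical field names, one per criterion in the list above
    applied_for_innovate_uk_grant: bool = False
    received_other_ukri_grant: bool = False
    innovation_score: float = 0.0
    rd_active_in_berd_survey: bool = False


def is_rdi_active(business):
    """A business is RD&I active if it meets any one of the four criteria."""
    return (
        business.applied_for_innovate_uk_grant
        or business.received_other_ukri_grant
        or business.innovation_score > 0
        or business.rd_active_in_berd_survey
    )


def cluster_is_rdi_active(n_sites, n_rdi_active_sites, n_rdi_active_enterprises):
    """A cluster is RD&I active if it passes Threshold 1 or Threshold 2."""
    threshold_1 = n_sites > 0 and n_rdi_active_sites / n_sites > 0.20
    threshold_2 = n_rdi_active_enterprises > 100
    return threshold_1 or threshold_2


# A small cluster passing on share, and a large cluster passing on count only
small_cluster_active = cluster_is_rdi_active(50, 11, 9)
large_cluster_active = cluster_is_rdi_active(1000, 150, 120)
```

The second example has only 15% of its sites RD&I active, so it fails Threshold 1, but it is still treated as RD&I active because more than 100 enterprises show RD&I activity.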
Clusters in all sectors are checked for RD&I activity. This is a change from version 1 of the tool, where all clusters in emerging sectors were assumed to be RD&I active and only SIC-based clusters were tested for RD&I activity. For more details on the rationale for these thresholds and sensitivity testing, please refer to Annex D. For statistics on cluster innovation, please see the section ‘Summary statistics for clusters’.
RD&I metrics
A binary RD&I activity indicator hides substantial heterogeneity in the way businesses in clusters are engaged in research, development and innovation. Innovation-related cluster metrics are presented in the map and downloadable data to enable more granular analysis and insight.
The percentage of businesses which are innovation active
This captures the percentage of businesses within a cluster which are RD&I active according to the criteria set out in section ‘Identifying Research, Development and Innovation active clusters.’
The percentage of businesses which are engaged with Innovate UK or the rest of UKRI
This captures the percentage of businesses within a cluster which either received a UKRI grant between January 2006 and June 2024 or applied for an Innovate UK grant between April 2015 and the latest available data as of July 2025.
The percentage of businesses which have a published patent
This captures the percentage of businesses within a cluster which have a published patent filed between January 2013 and June 2024.
Cluster funding from Innovate UK
This captures funding for businesses in the cluster from Innovate UK between April 2020 and the latest available data as of July 2025. Note that this time period differs from the time period used to identify RD&I active clusters. This difference is in order to ensure funding estimates from Innovate UK and the rest of UKRI cover roughly equivalent periods.
Innovate UK funding is available at the Company Reference Number level. These data have been linked to the IDBR and disaggregated to the site level using a ratio based on site employment.
Because businesses can appear in more than one sector, cluster funding from Innovate UK should not be summed across sectors.
Cluster funding from the rest of UKRI
UK Research and Innovation (UKRI) is the parent organisation of Innovate UK. In addition to Innovate UK awards, businesses will also receive some funding as a part of wider UKRI awards.
Funding from the rest of UKRI (excluding Innovate UK) has been calculated for the period between April 2020 and the latest available data as of July 2025. Only funding for businesses is captured, and not funding for non-businesses in the area defined by the cluster. This is important because most non-Innovate UK UKRI funding will not go to businesses.
UKRI funding is available at the project level and not the participant level, and so funding must be split across project participants. The lead project participant typically receives more funding. Reflecting this, project funding has been disaggregated to participants utilising a ratio of project lead to non-project-lead funding calculated using Innovate UK data, such that project leads are assumed to receive more funding. Non-lead participants are assumed to receive an equal share of the remaining funding.
This method is the best available and is more accurate than splitting the grant equally across all project participants. However, it is an approximation and will result in some participants being assigned too much or too little estimated grant. In particular, in practice the project lead will sometimes receive all the funding, and non-leads will not receive any funding. In this circumstance the method will assign non-zero grants where no funding has been received, leading to an overestimate. This issue may be exacerbated by the fact that, where businesses participate in non-Innovate UK UKRI projects, they are usually not leads.
These data are then disaggregated to the site level using a ratio based on site employment.
Because businesses can appear in more than one sector, cluster funding from the rest of UKRI should not be summed across sectors.
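The split of project-level UKRI funding between the project lead and the other participants can be sketched as follows. The lead_share parameter stands in for the lead to non-lead ratio derived from Innovate UK data; its value here (0.5) is purely hypothetical, as is the handling of projects with no non-lead participants.

```python
def split_project_funding(project_funding, lead, non_leads, lead_share=0.5):
    """Estimate participant-level funding from a project-level total.

    lead_share is a hypothetical stand-in for the ratio calculated from
    Innovate UK data; non-leads share the remainder equally.
    """
    if not non_leads:
        # Assumption: a project with a single participant gives everything to the lead.
        return {lead: project_funding}
    allocation = {lead: project_funding * lead_share}
    remaining = project_funding - allocation[lead]
    for participant in non_leads:
        allocation[participant] = remaining / len(non_leads)
    return allocation


# Illustrative project: a university lead and three business participants
allocation = split_project_funding(
    900_000, lead="university", non_leads=["firm_a", "firm_b", "firm_c"]
)
```

The sketch also makes the overestimation caveat concrete: each business here is assigned 150,000 even if, in practice, the lead received the whole award.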
R&D Expenditure used to claim R&D tax credits
This captures the total qualifying expenditure by businesses within the cluster which was reported to HMRC as qualifying expenditure for the purposes of claiming R&D tax credits (‘R&D expenditure’). For more details on R&D tax credit information see the summary background and quality report for R&D tax credit data.
R&D expenditure data are submitted to HMRC for the entire business unit and not at the business location level – that is, HMRC do not receive information on which business location R&D expenditure takes place at. Because of this, where businesses have multiple locations and are not entirely contained within a cluster, estimates for the cluster will capture the R&D expenditure for the entire business and not expenditure specific to the cluster. It has not been possible to disaggregate from the business to the business location level based on employment as has been done for other estimates.
The total for the 2021-2022 financial year is included in the map tool as the latest available, and totals for 2020-2021 and 2019-2020 are included in the downloadable data. In contrast, clusters have been created using IDBR data from 2025, as outlined in the section ‘Business data used to create clusters’. Because 2025 data is used to assign businesses to clusters while the R&D expenditure data relates to earlier years, R&D expenditure data will not fully reflect changes to the business population. For example, a business may be assigned to a cluster based on 2025 data, but was not actually operating within that cluster in 2021 – in which case, R&D expenditure will be over-estimated as the expenditure of the business will be incorrectly included.
HMRC publishes statistics on R&D tax credits. These published statistics will be based on data which include claims received up to 31 May 2025. In contrast, the underlying data used for the map include claims received up to 31 January 2025. Because businesses may submit tax credit claims some time after the end of a financial year, this difference in timing may cause differences between the Innovation Clusters Map and the HMRC publication. Published data are also available to 2022-2023 based on estimation for claims not yet received, whereas estimates in this analysis run only to 2021-2022 as estimation for claims not yet received is not possible in the data used for the map.
Because businesses can appear in more than one sector, R&D expenditure used to claim tax credits should not be summed across sectors.
Identifying within-cluster collaboration
The clustering approach outlined in ‘How the clusters are created’ identifies clusters where businesses likely benefit from agglomeration effects due to their co-location. It does not, however, provide evidence of direct business-to-business collaboration. Therefore, to supplement the spatial clustering approach, network analysis was conducted on Innovate UK (IUK) data to identify evidence of internal (within-cluster) collaboration.
Applications for Innovate UK grants between July 2016 and January 2025 were used to assess collaboration. Approximately 53,000 organisations, including businesses, universities and other entities, applied for funding over this timeframe. If 2 organisations had jointly applied for a grant, they formed a ‘collaborative pair’. To focus on frequent collaborators, only organisations that had jointly applied twice with at least 2 others were considered. This left approximately 5,000 organisations with 38,000 collaborative pairs.
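The construction of collaborative pairs and the frequency filter can be sketched as follows. This is an illustration under stated assumptions, not the code used for the map: the function names and the toy applications are hypothetical, and the filter is read here as "keep organisations that have at least 2 partners with whom they jointly applied at least twice", which is only one possible reading of the rule described above.

```python
from collections import Counter
from itertools import combinations


def collaborative_pairs(applications):
    """Count joint applications for every unordered pair of organisations."""
    counts = Counter()
    for organisations in applications:
        for pair in combinations(sorted(set(organisations)), 2):
            counts[pair] += 1
    return counts


def frequent_collaborators(pair_counts, min_joint=2, min_partners=2):
    """Assumed reading of the filter: organisations with at least `min_partners`
    partners with whom they applied jointly at least `min_joint` times."""
    partners = {}
    for (a, b), n in pair_counts.items():
        if n >= min_joint:
            partners.setdefault(a, set()).add(b)
            partners.setdefault(b, set()).add(a)
    return {org for org, ps in partners.items() if len(ps) >= min_partners}


# Hypothetical applications, each listing the organisations that applied together.
applications = [{"A", "B", "C"}, {"A", "B"}, {"A", "C"}, {"C", "D"}]
pairs = collaborative_pairs(applications)
print(frequent_collaborators(pairs))  # only "A" has two repeat partners
```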
This network of collaborating organisations was partitioned into ‘collaboration communities’, groups of organisations that work closely together, using the Louvain algorithm. This approach partitions the network to maximise ‘modularity’. High modularity means having dense connections between nodes within communities but sparse connections across communities. The Louvain algorithm ensures that every organisation is assigned to a community, and no organisation can be in more than one community. The algorithm detected 147 collaboration communities, ranging in size from 3 to 95 organisations. The minimum of 3 was set when applying the Louvain algorithm, and no maximum was specified.
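To make ‘modularity’ concrete, the sketch below computes it for a given partition of a small unweighted network, using the standard formula: for each community, the fraction of edges that fall inside it minus the squared fraction of edge ends attached to it. The function name and toy network are hypothetical, and this is not the Louvain algorithm itself, which searches greedily for a partition with high modularity (libraries such as NetworkX provide a `louvain_communities` function for that step).

```python
def modularity(edges, community_of):
    """Modularity of a partition of an unweighted, undirected network.

    `edges` is a list of (node, node) pairs; `community_of` maps node -> community id.
    """
    m = len(edges)
    intra_edges = {}   # edges with both ends in the same community
    degree_sum = {}    # total degree of the nodes in each community
    for a, b in edges:
        ca, cb = community_of[a], community_of[b]
        degree_sum[ca] = degree_sum.get(ca, 0) + 1
        degree_sum[cb] = degree_sum.get(cb, 0) + 1
        if ca == cb:
            intra_edges[ca] = intra_edges.get(ca, 0) + 1
    return sum(
        intra_edges.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in degree_sum.items()
    )


# Two triangles joined by a single edge (C-D).
edges = [("A", "B"), ("B", "C"), ("A", "C"),
         ("D", "E"), ("E", "F"), ("D", "F"), ("C", "D")]
two_groups = {"A": 0, "B": 0, "C": 0, "D": 1, "E": 1, "F": 1}
one_group = {node: 0 for node in two_groups}

q_two = modularity(edges, two_groups)
q_one = modularity(edges, one_group)
print(q_two > q_one)  # splitting into the two triangles scores higher
```

In this toy network the two-triangle partition has higher modularity than placing every node in one community, which is the kind of split a modularity-maximising method would favour.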
Using this collaboration information, evidence of collaboration within clusters can be identified. A cluster is deemed to be ‘internally collaborating’ if it meets either of the following conditions:
- there are at least 5 organisations in the cluster that are all members of the same collaboration community
- there are at least 5 collaborative pairs within the cluster
Some collaboration communities are so large that it is possible for 5 or more organisations within them to inhabit the same cluster by sheer chance. To handle this, cases where organisations in a cluster were more geographically dispersed than the community average were removed, as this suggests the grouping was less cohesive and more likely to have occurred by chance.
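The two conditions can be expressed as a short check. This is a sketch with hypothetical names and toy data; it assumes each organisation has already been assigned to a cluster and to a collaboration community, and it does not implement the geographic dispersion filter described above.

```python
from collections import Counter


def is_internally_collaborating(cluster_orgs, community_of, pairs, threshold=5):
    """Return True if a cluster meets either collaboration condition.

    `cluster_orgs`: set of organisations located in the cluster.
    `community_of`: mapping organisation -> collaboration community id.
    `pairs`: iterable of (organisation, organisation) collaborative pairs.
    """
    # Condition 1: at least `threshold` organisations share one community.
    community_sizes = Counter(
        community_of[org] for org in cluster_orgs if org in community_of
    )
    condition_1 = any(n >= threshold for n in community_sizes.values())

    # Condition 2: at least `threshold` pairs have both members in the cluster.
    pairs_within = sum(1 for a, b in pairs if a in cluster_orgs and b in cluster_orgs)
    condition_2 = pairs_within >= threshold

    return condition_1 or condition_2


# Hypothetical cluster meeting condition 1 only: five members of community 0.
cluster_a = {"o1", "o2", "o3", "o4", "o5"}
communities_a = {org: 0 for org in cluster_a}
print(is_internally_collaborating(cluster_a, communities_a, pairs=[]))

# Hypothetical cluster meeting condition 2 only: four organisations, six pairs.
cluster_b = {"p1", "p2", "p3", "p4"}
communities_b = {"p1": 1, "p2": 2, "p3": 3, "p4": 4}
pairs_b = [("p1", "p2"), ("p1", "p3"), ("p1", "p4"),
           ("p2", "p3"), ("p2", "p4"), ("p3", "p4")]
print(is_internally_collaborating(cluster_b, communities_b, pairs_b))
```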
Of the 2,428 clusters, 2,020 have at least one organisation present in the filtered collaboration network. Of these clusters, 331 are internally collaborating. This means that 331 clusters met at least one of the 2 conditions stated above (152 clusters met the first condition and 321 met the second), and therefore showed evidence of within-cluster collaboration through joint application for Innovate UK grants.
Sectors with the strongest evidence of internal collaboration (over 60% of clusters are internally collaborating) include the Materials Innovation and Advanced Materials sectors. There are 16 sectors where over a quarter of the clusters are internally collaborating. Conversely, there are 34 sectors where no cluster is internally collaborating.
While the IUK data provide evidence of internal collaboration within clusters across several sectors, clusters that are not found to be internally collaborative via this method may still be internally collaborative. The data on IUK applications is only one source and does not capture all instances of collaboration.
Due to these limitations, a measure of collaboration was not included on the map. However, a binary flag indicating whether the cluster was classified as internally collaborative, based on the IUK data, has been included in the downloadable data.
In future work, these data on collaboration could be used to identify distributed collaboration communities, as demonstrated in the Innovation Clusters version 1 report, and also to analyse between-cluster collaboration. Additional data sources on collaboration, such as co-application for patents, could also be incorporated to strengthen the collaboration metric.
Research Infrastructure
In addition to clusters, the map can also be used to display information on universities, higher education institutions, and UKRI-funded Institutions operating in the R&D space.
Universities and higher education institutions have been classified into one of 3 types using groupings defined by the Office for Students Transparent Approach to Costing (TRAC). These types are:
- Research Intensive Universities (RIUs)
- Teaching Intensive Universities
- Specialist Music/Arts Institutions
Research Intensive Universities
Universities and higher education institutions with a medical school and research income of 20% or more of total income, or all other institutions with a research income of 15% or more of total income. This grouping corresponds with TRAC A and B.
Teaching Intensive Universities
Universities and higher education institutions with research income of between 5% and 15% of total income, or institutions with a research income of less than 5% of total income and total income greater than £150 million, or institutions with research income less than 5% of total income and total income less than or equal to £150 million. This grouping corresponds with TRAC C-E.
Specialist Music/Arts Institutions
All other Universities and higher education institutions. This grouping corresponds with TRAC F.
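The grouping rules above can be summarised as a small decision function. This is a sketch under assumptions rather than the official TRAC assignment: the function name, the `is_specialist_music_arts` flag (used here to stand in for "all other" institutions in group F) and the treatment of boundary values (5% counted as within the 5% to 15% band) are all assumptions made for illustration.

```python
def trac_group(research_share, total_income_gbp, has_medical_school,
               is_specialist_music_arts=False):
    """Assign a TRAC peer group letter from the rules described in the text.

    `research_share` is research income divided by total income (0.0 to 1.0).
    """
    if is_specialist_music_arts:  # assumption: group F is flagged separately
        return "F"
    if has_medical_school and research_share >= 0.20:
        return "A"
    if research_share >= 0.15:
        return "B"
    if research_share >= 0.05:
        return "C"
    # Research income below 5%: split on total income of £150 million.
    return "D" if total_income_gbp > 150_000_000 else "E"


# Mapping from TRAC group to the 3 types shown on the map.
MAP_TYPE = {
    "A": "Research Intensive Universities",
    "B": "Research Intensive Universities",
    "C": "Teaching Intensive Universities",
    "D": "Teaching Intensive Universities",
    "E": "Teaching Intensive Universities",
    "F": "Specialist Music/Arts Institutions",
}

# Hypothetical institution: no medical school, 10% research income.
group = trac_group(0.10, 200_000_000, has_medical_school=False)
print(group, MAP_TYPE[group])
```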
For each higher education institution, research income, teaching income, other income, and total expenditure for the 2023/2024 financial year are reported. The number of students enrolled by subject is also reported for: Arts and Humanities, Social Sciences, Natural Sciences, and Engineering Technology, all for the 2023/2024 academic year. Total student enrolment is also reported.
Innovation-related institutions which are not universities or higher education institutions, such as catapults or research units, are also included based on data provided by UKRI. These are included as Research and Development Facilities. The status of these institutions is reported on the map, for example “Embedded in an Independent Research Organisation”, “Legally independent of UK Research and Innovation (UKRI)”, or “Catapult”.
Summary statistics for clusters
Table 1 shows, for Industrial Strategy sectors and DSIT emerging sectors, the number of clusters and average size per cluster. There is considerable variation across sectors. Creative, Digital Economy, Digital and Technologies, and Professional and Business Services clusters contain large numbers of businesses on average, reflecting the expansive nature of these sectors. Clusters for more niche sectors contain substantially fewer businesses. For example, Quantum Technology clusters contain only 16 businesses on average.
Table 1: Number of clusters and average size per cluster
Sector | Number of clusters | Average businesses per cluster | Average sites per cluster |
---|---|---|---|
Advanced Connectivity | 27 | 33 | 114 |
Advanced Manufacturing | 34 | 324 | 347 |
Advanced Materials | 11 | 51 | 67 |
Artificial Intelligence | 16 | 271 | 449 |
Creative | 37 | 3,204 | 3,284 |
Digital Economy | 40 | 3,700 | 3,843 |
Digital and Technologies | 69 | 2,898 | 3,004 |
Engineering Biology Application | 11 | 48 | 55 |
Engineering Biology Supply Chain | 14 | 30 | 36 |
Financial Services | 86 | 430 | 459 |
Future Telecoms Supply Chain | 9 | 18 | 24 |
Life Sciences | 23 | 220 | 308 |
Materials Innovation | 30 | 73 | 169 |
Professional and Business Services | 56 | 2,570 | 2,705 |
Quantum Technology | 3 | 16 | 216 |
Robotics and Autonomous Systems | 41 | 34 | 40 |
Semiconductors | 22 | 27 | 34 |
Space Economy | 41 | 41 | 97 |
Table 2 shows, for Industrial Strategy sectors and DSIT emerging sectors, RD&I activity summaries averaged across clusters. DSIT emerging sectors have a very high percentage of RD&I activity.
Table 2: RD&I activity summaries averaged across clusters
Sector | Average cluster percentage of innovation active businesses | Average cluster percentage of businesses engaged with IUK or wider UKRI | Average cluster percentage of businesses with a patent |
---|---|---|---|
Advanced Connectivity | 97.0% | 85.2% | 65.8% |
Advanced Manufacturing | 31.4% | 13.5% | 8.2% |
Advanced Materials | 76.6% | 70.1% | 40.7% |
Artificial Intelligence | 91.5% | 82.5% | 42.6% |
Creative | 11.7% | 4.1% | 0.4% |
Digital Economy | 21.2% | 8.3% | 2.8% |
Digital and Technologies | 19.8% | 8.5% | 2.7% |
Engineering Biology Application | 90.4% | 69.9% | 16.4% |
Engineering Biology Supply Chain | 76.4% | 48.4% | 14.0% |
Financial Services | 12.9% | 4.7% | 0.3% |
Future Telecoms Supply Chain | 86.4% | 62.3% | 20.3% |
Life Sciences | 82.5% | 64.8% | 14.6% |
Materials Innovation | 92.6% | 87.4% | 56.7% |
Professional and Business Services | 12.6% | 6.7% | 1.2% |
Quantum Technology | 99.7% | 98.8% | 95.6% |
Robotics and Autonomous Systems | 61.4% | 46.1% | 13.0% |
Semiconductors | 89.3% | 77.2% | 36.6% |
Space Economy | 91.6% | 81.2% | 58.5% |
Future research
The Innovation Clusters Map has been generated using the best available data and methods. However, there are a number of areas that could be iterated on and improved in future publications. These include:
- improving estimation of how much of each business’ activity is relevant to each sector, reducing possible over-estimation of turnover and employment. It would similarly be beneficial to improve the identification of which business sites are relevant to each sector.
- better accounting for the fact that clusters cut across multiple sectors, including up and down supply chains
- improving measurement of within-cluster collaboration, utilising more sources of information than co-application for Innovate UK grants
- improving measurement of research, development and innovation, including benchmarking innovation measures against the UK Innovation Survey
- improving the information provided on how clusters are related to administrative boundaries, and enabling improved filtering and selection of clusters which overlap with specific areas
Further information
### Contact for comments or feedback
If you have any comments or feedback on this publication, please contact innovation.clusters@dsit.gov.uk.
Annex A – comparability and other resources
This research has generated clusters for a wide range of sectors across the whole UK using a quantitative data-driven clustering method.
Several sources offer insight into the spatial distribution and concentration of business activity. Similarly, several sources include statistics on emerging sectors, including government publications. Due to differing data, methods and goals, in some sectors the data used are not consistent with these other sources.
Industrial strategy clusters
The Department for Business and Trade (DBT) have separately published analysis on priority Industrial Strategy city regions and clusters, outlining how places were prioritised for the Industrial Strategy. The DBT method is based on a range of qualitative and quantitative sources, including version 1 of the Innovation Clusters Map. DBT clusters were identified using a data-driven approach supplemented by expert qualitative review, and then prioritised based on economic assessment, barriers to growth, and policy support.
In contrast, Innovation Clusters were created based on a purely data-driven approach focused on geographic business concentration only, with no prioritisation based on economic assessment or other factors, and including in scope a wide variety of sectors. Expert qualitative review was used only to inform parameter selection and not to identify additional clusters or directly amend the areas captured by clusters. Innovation Clusters are also focused on innovation, with clusters filtered to RD&I active clusters by default on the map.
Users should refer to DBT analysis to understand how places were prioritised for the Industrial Strategy. The Innovation Clusters Map can be used to provide additional insight, including insight into the spatial area captured by all clusters (including those not prioritised by Industrial Strategy analysis), analysis of Industrial Strategy clusters alongside a wide range of other sectors, and cluster metrics.
Space sector
In the space sector, the UK Space Ecosystem Cluster Directory outlines the UK’s space clusters. Space clusters are actively managed local hubs of space activity which, with support of UK Space Agency (UKSA) and partners, grow the space sector. In contrast, space sector clusters in the Innovation Clusters Map are entirely data-driven representations of areas with a high concentration of businesses in the space sector.
Advanced Connectivity Technologies (ACT) sector
The ACT sector has been included in the Innovation Clusters Map based on a 2023 list of businesses. For insights based on the latest data, users should refer to the ACT Market Scoping Analysis 2025 publication. It has not been possible to utilise the latest data in the development of the map due to timing constraints.
Life Sciences sector
The Life Sciences sector has been included on the Innovation Clusters Map based on the list of businesses used for the Biosciences and Health Technology Sector Statistics (BaHTSS) 2021 to 2022 publication. The next iteration of the BaHTSS publication, relating to the period 2023/24, has now been published, meaning that the life sciences cluster analysis included in the map will not reflect the latest company data.
Statistics based on Innovation Clusters will not be comparable with the BaHTSS publication due to differing methods and data sources. Users should refer to the BaHTSS publication for statistics on the Life Sciences sector, including employment and turnover. The Innovation Clusters Map can be used for additional insight into Life Sciences clusters.
AI sector
The AI sector has been included on the Innovation Clusters Map based on the list of businesses used for the 2024 AI sector study.
As outlined in the section ‘business data used to create clusters’, a key issue in sector measurement using business data is that a business identified as being a part of a sector may be only partially engaged in that sector. This issue is particularly acute in the AI sector, and means that using IDBR turnover and employment results in over-estimates. The AI sector study accounts for this by estimating AI-specific turnover and employment for each business in the list of businesses used for the study instead of using the IDBR.
Information from the AI sector study, rather than IDBR turnover and employment estimates, is used to calculate AI cluster estimates. This means that turnover and employment are calculated differently for the AI sector compared to other sectors, and care should be taken when making comparisons. IDBR local site information continues to be used to identify the business locations used to create clusters, as described in the section ‘business location and financial information’.
For all sectors, including AI, the map uses IDBR enterprise counts to create business counts, and IDBR local site information to create local site counts. In contrast, the AI sector study uses counts of Companies House registrations to count businesses and does not include counts of local sites. Business count information from the study is not used in the map in order to ensure location count information can be included on a consistent basis.
Digital and Technologies
The Digital and Technologies sector has been included based on the definition included in the Department for Business and Trade’s Industrial Strategy sector definitions list. The Digital and Technologies sector is defined based on SIC codes to maintain consistency with other Industrial Strategy sector definitions. This SIC-derived definition of Digital and Technologies is only a proxy and does not fully capture the detail and fast-changing nature of the sector, and the extent of diversified companies in the sector. This SIC-derived definition builds upon the SIC-based definition of the Digital Sector used in the Digital Sector Economic Estimates Series, but it is not limited to that definition.
The Digital and Technologies sector includes both dedicated companies – focused solely on digital and technology products and services – and diversified companies, which offer some digital and technology products and services but operate broader business models. This SIC-derived definition may not fully capture the extent of diversified companies in the sector. DSIT is exploring innovative approaches to more accurately define the sector.
For more details, see the technical annex to the Digital and Technologies sector plan.
Annex B – data source and reference period summary
Tables 3 to 5 list all data used as inputs to cluster creation and the identification of research institutions, alongside usage and reference dates. Data used for metrics, data used for RD&I activity filtering, and other data used are included in separate tables.
Note that in some cases there are differences between reference periods of similar data used for different purposes. For example, Innovate UK data used for total Innovate UK funding is from 2020 onwards, while Innovate UK data used for innovation filtering is from 2016 onwards. This is to ensure that the metric covers a similar number of years to the ‘rest of UKRI’ funding metric, while allowing innovation filtering to use relevant, but less recent, funding application data.
Table 3: Data used for cluster metrics
Metrics | Data used | Reference Period |
---|---|---|
Turnover and Employment | Inter-Departmental Business Register | 2025 Month 3 data used, latest available annual estimate |
Percentage of businesses which are engaged with IUK or the rest of UKRI | Innovate UK grant receipt data and UKRI grant receipt data | Innovate UK: April 2015 to latest available as of July 2025; Rest of UKRI: funded projects that began between January 2006 and June 2024 |
Percentage of businesses which have a published patent | Patent data from the Intellectual Property Office | January 2013 to June 2024 |
Innovate UK funding | Innovate UK grant receipt data | April 2020 to latest available as of July 2025 |
Rest of UKRI funding | UKRI business level innovation data | Funded projects that began between April 2020 and June 2024 |
HMRC tax credit data | R&D tax credit data provided at the cluster level by HMRC | 2021-2022 financial year data included in app, 2019-2020 and 2020-2021 financial year data included in downloadable data |
Table 4: Data used for RD&I active filtering
Data used | Reference period |
---|---|
Innovate UK grant application data (all applicants & therefore broader than grant receipt data) | April 2015 to latest available as of July 2025 |
UKRI business level innovation data | funded projects that began between January 2006 and June 2024 |
Business Enterprise Research and Development survey data | 2019-2023 inclusive |
The Data City Innovation scores | Latest data as of 28 May 2025 |
Table 5: Other data used
Data used | Use | Reference period |
---|---|---|
Inter-Departmental Business Register | Business site information | 2025 Month 3 |
The Data City Real-Time Industrial Classification data | Identifying emerging sectors - see Annex C | Latest data as of 28 May 2025 |
Glass AI / Perspective Economics sector list data | Identifying emerging sectors | Latest available as of March 2025 (see Annex C for sector-specific details) |
HESA institution data | Displaying Universities on map | Income: Financial year 2023/2024; Student enrolment: Academic year 2023/2024 |
UKRI institution data | Displaying Institutions on map | Latest available as of June 2025 |
Annex C – data sources used for each sector
As outlined in the section ‘Identifying which businesses are operating in which economic sectors’, there are 3 main data sources for how businesses are classified into economic sectors: SIC, sector lists, and RTICs.
DSIT emerging sectors are defined based on a variety of these sources:
The Data City’s Real-Time Industrial Classifications were used to define the following sectors: Advanced Materials, Engineering Biology Application, Engineering Biology Supply Chain, Quantum Technology, Robotics and Autonomous Systems, and Space Economy. Since these are DSIT emerging sectors, these RTICs have undergone extensive DSIT quality assurance, and are included in the map tool under the DSIT emerging sectors dropdown rather than the RTIC dropdown. Other RTICs, which are included under the RTICs dropdown, have not gone through DSIT quality assurance.
Lists created by Glass.AI and Perspective Economics were used to define the Artificial Intelligence and Semiconductors sectors.
A list created by Perspective Economics was used to define the Advanced Connectivity Technologies sector.
The NMIS Materials Innovation Company List, created by Perspective Economics, was used to define the Materials Innovation sector.
SIC was used to define the Digital Economy sector.
Industrial Strategy sectors are defined using SIC, with the exception of the Life Sciences sector. The Life Sciences sector was created in collaboration with Kepier & Company Ltd.
Annex D – sensitivity testing of RD&I activity thresholds
As outlined in the section ‘Identifying Research, Development and Innovation active clusters’, a cluster is treated as RD&I active if it meets either of the following conditions:
- Threshold 1 - more than 20% of the sites in the cluster are in businesses with at least one indication of innovation in any data source. This ensures that clusters where a notable fraction of firms appear RD&I active are included.
- Threshold 2 - more than 100 enterprises in the cluster have an indication of RD&I activity in any data source. This ensures that clusters which may contain a large amount of RD&I activity are included.
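The two thresholds combine as a simple either-or rule, sketched below. The function and argument names are hypothetical, and the sketch assumes the counts of innovation active sites and enterprises have already been derived for each cluster; both comparisons are strict, following the "more than" wording above.

```python
def is_rdi_active(total_sites, innovation_active_sites, innovation_active_enterprises,
                  min_site_share=0.20, min_enterprises=100):
    """Return True if a cluster meets threshold 1 or threshold 2."""
    site_share = innovation_active_sites / total_sites if total_sites else 0.0
    meets_threshold_1 = site_share > min_site_share
    meets_threshold_2 = innovation_active_enterprises > min_enterprises
    return meets_threshold_1 or meets_threshold_2


# Hypothetical small cluster: 30% of sites innovation active (threshold 1).
print(is_rdi_active(50, 15, 12))
# Hypothetical large cluster: only 15% of sites, but 120 enterprises (threshold 2).
print(is_rdi_active(1000, 150, 120))
# Hypothetical large cluster meeting neither threshold.
print(is_rdi_active(1000, 150, 80))
```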
These thresholds were chosen partly with reference to thresholds used in version 1 of the Innovation Clusters Map and partly based on sensitivity testing of threshold combinations and their effect on the resulting clusters. This annex summarises these sensitivity tests.
Table 6 shows the number of RD&I active clusters with different threshold combinations:
- no innovation filtering
- the filtering outlined in the section ‘Identifying Research, Development and Innovation active clusters’
- the filtering in the section ‘Identifying Research, Development and Innovation active clusters’ but with a higher minimum percentage of sites threshold of 50%
- the filtering in the section ‘Identifying Research, Development and Innovation active clusters’ but omitting threshold 2 (i.e. only basing the filtering on the percentage of sites which are innovation active)
Table 6: Number of RD&I active clusters with different threshold combinations
Threshold 1: minimum percentage sites innovation active | Threshold 2: minimum count enterprises innovation active | Number of RD&I active clusters |
---|---|---|
no filtering | no filtering | 2,428 |
20% | 100 | 1,716 |
50% | 100 | 1,200 |
20% | Not applied | 1,497 |
Increasing the minimum percentage of sites to 50% notably decreases the number of RD&I active clusters, from 1,716 to 1,200. A lower threshold of 20% has been retained as there are many ways businesses may be innovative which will not be reflected in these data. It is therefore better to err on the side of including clusters where there is some limited evidence of RD&I, on the basis that other RD&I is likely to be undetected.
In contrast, removing threshold 2 (the minimum count of enterprises which are innovation active) has only a relatively small effect, reducing the number of RD&I active clusters from 1,716 to 1,497. This reflects the fact that threshold 2 has been set to a high value. This is to ensure that large clusters in sectors with large numbers of sites are not included as innovative simply due to their size and despite a relatively low density of innovation activity.
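A sensitivity test of this kind amounts to recounting RD&I active clusters under each threshold combination. The sketch below uses four hypothetical clusters and hypothetical names, so its counts are purely illustrative and do not reproduce Table 6.

```python
# Hypothetical clusters: (share of sites innovation active,
#                         count of enterprises innovation active)
clusters = [(0.60, 40), (0.30, 12), (0.10, 250), (0.05, 20)]

# Threshold combinations mirroring the rows of Table 6 (None = not applied).
combinations_to_test = [
    ("no filtering", None, None),
    ("20% / 100", 0.20, 100),
    ("50% / 100", 0.50, 100),
    ("20% / not applied", 0.20, None),
]


def count_active(clusters, min_site_share, min_enterprises):
    """Count clusters meeting either applied threshold (all, if none applied)."""
    if min_site_share is None and min_enterprises is None:
        return len(clusters)
    active = 0
    for site_share, enterprise_count in clusters:
        meets_1 = min_site_share is not None and site_share > min_site_share
        meets_2 = min_enterprises is not None and enterprise_count > min_enterprises
        if meets_1 or meets_2:
            active += 1
    return active


results = {
    label: count_active(clusters, share, count)
    for label, share, count in combinations_to_test
}
for label, n in results.items():
    print(f"{label}: {n} RD&I active clusters")
```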