Official Statistics

Bioscience and health technology sector statistics 2023 to 2024: user guide

Published 2 October 2025

1. Introduction

1.1 Background

This user guide accompanies the 2023/2024 release of the bioscience and health technology sector statistics (BaHTSS). The BaHTSS publication series presents data and commentary on the UK life sciences sector and its subsectors. The life sciences sector estimates in this publication are based on a dataset of companies in the UK who operate within the biopharmaceutical or medical technology subsectors.

This accompanying user guide provides further details on how these statistics are derived and the underlying quality associated with the data and analysis used in the publication.

Please email analysis@officeforlifesciences.gov.uk for any feedback or questions on these statistics.

1.2 Changes from previous publications

The 2023/2024 report, published on 2 October 2025, is the latest edition in the annual BaHTSS publication series. There have been substantial methodological changes in data compilation for the 2023/2024 report compared to past publications in the BaHTSS series (relating to 2021/2022 and before).

Alongside providing detail on how the data for 2023/2024 was constructed, this guide outlines how the data construction has changed compared to past publications and the subsequent impact of these changes. 

1.3 Official Statistics designation

The BaHTSS publication series was previously designated as official statistics. The 2023/2024 report was redesignated as ‘official statistics in development’. More details on the change in badging and an accompanying plan for future developments can be found in section 4.

2. Statistical presentation and data collection for 2023/2024

2.1 Companies and businesses in the UK life sciences sector

2.1.1 Definition

The definition of the life sciences sector used in this publication covers businesses operating within the biopharmaceutical or medical technology subsectors in the UK. This definition includes companies involved in the discovery, development and marketing of new pharmaceuticals and medical devices, as well as companies operating in the associated service and supply chains.

The 2023/2024 BaHTSS publication collates data on:

  • companies: a company is a legal entity which may have a distinct identity from the broader organisation that owns or runs it. A company in the BaHTSS publication is defined as an entity registered with Companies House that has an assigned company registration number (CRN)
  • businesses: an entity that is the legal owner of one or more companies. Companies are considered part of the same business when they are part of the same enterprise group structure, as defined by the Office for National Statistics’ (ONS) Inter-Departmental Business Register (IDBR). Some businesses will consist of just one company - in these instances ‘business’ and ‘company’ effectively mean the same thing

For a company to be captured within the scope of the BaHTSS publication, each of the following criteria must be met:

  • it has a legal entity in the UK, not including any Crown Dependencies
  • it is registered with Companies House
  • it is not a university, publicly owned institution, an NHS institution, or a charity
  • it conducts a notable amount of activity that is focused on human health (not including the provision of healthcare, which is out of scope). Veterinary and industrial biotechnology businesses are not in scope
  • it must be registered in ONS’s IDBR. IDBR is a register of all businesses registered for value-added tax (VAT) and/or pay as you earn (PAYE). This means businesses that are not registered for PAYE or VAT, largely sole proprietorships and partnerships, are not included in the publication
  • it must be considered active. Enterprises in IDBR (and the companies that belong to them) are flagged as being ‘active’ if they have activity within the UK and there is no evidence that they have ceased trading.

If a company is considered in-scope of the BaHTSS life sciences sector definition, then its parent business will also be considered in-scope. It is therefore possible for a business within the scope of the publication to comprise companies both in and out of scope of the BaHTSS life sciences sector definition. Where this is the case, only the in-scope companies within the business are included in the publication data, and the business’ employment and turnover values are scaled down to account for the fact that only some of its companies are in scope (more details in section 2.6.2).

Figure 1 visualises the business structure as defined in the BaHTSS 2023/2024 publication for a hypothetical business with 3 companies.

Figure 1: Illustrative example of business structure in the BaHTSS publication

Note: businesses can have ownership over any number of companies.

2.1.2 Data collection and process

The BaHTSS publication aims to identify life sciences companies which meet the definition set out in section 2.1.1. For 2023/2024 data, two sources were used to identify companies:

  • an existing dataset of companies used in past BaHTSS publications. This dataset contained a list of identified in-scope companies that were operating as of 2021/2022. Further steps were undertaken to remove any of these companies no longer operating as of 2023/2024
  • descriptive text from company websites, which was obtained and analysed through a web-scraping approach and was used to identify additional companies. This is a new process introduced for the first time for the 2023/2024 publication. More information on the differences in data collection from past BaHTSS reports can be found in section 3.

Existing BaHTSS dataset of companies from 2021/2022

The Office for Life Sciences (OLS) holds a dataset of companies operating in the UK life sciences sector as of 2021/2022 from previous BaHTSS publications. This dataset of companies was compiled in 2023 through engagement with a cohort of data partners. Data partners are organisations with expertise in the life sciences sector who provide OLS with information on companies they know to be operating in the sector. The full list of data partners can be found in section 8.

This company list was then linked to the ONS IDBR dataset via company registration numbers (CRNs). Companies were then removed from the dataset if:

  • they weren’t present in the ONS IDBR dataset. Given that a number of key metrics in this publication are sourced from IDBR data, and given the high coverage of businesses included within IDBR, it has been decided that any companies that are not in IDBR should be excluded from OLS’s company list

  • they were not considered active in IDBR as of 2023/2024
  • they were designated as non-profits, government organisations or associated with local authorities in IDBR
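
The linking and removal rules above could be sketched as follows. This is a minimal illustration rather than the code used by OLS: the field names (`crn`, `active`, `legal_status`) and the status labels are assumptions made for the example, and real IDBR extracts are structured differently.

```python
# Hypothetical status labels; the actual IDBR coding is not given in this guide.
EXCLUDED_STATUSES = {"non-profit", "government", "local authority"}


def filter_existing_companies(company_crns, idbr_records):
    """Keep CRNs that are present in IDBR, active, and not an excluded type."""
    idbr_by_crn = {rec["crn"]: rec for rec in idbr_records}
    kept = []
    for crn in company_crns:
        rec = idbr_by_crn.get(crn)
        if rec is None:
            continue  # not present in the IDBR extract
        if not rec["active"]:
            continue  # not considered active in IDBR
        if rec["legal_status"] in EXCLUDED_STATUSES:
            continue  # non-profit, government or local authority body
        kept.append(crn)
    return kept


# Made-up records for illustration only.
idbr = [
    {"crn": "00000001", "active": True, "legal_status": "company"},
    {"crn": "00000002", "active": False, "legal_status": "company"},
    {"crn": "00000003", "active": True, "legal_status": "non-profit"},
]
existing = ["00000001", "00000002", "00000003", "00000004"]
print(filter_existing_companies(existing, idbr))
```

In this example only the first CRN is retained: the second is inactive, the third is a non-profit and the fourth has no IDBR record.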

Identifying new companies

To identify further companies, a new approach was introduced for the 2023/2024 BaHTSS publication, relying primarily on open web sources. This entailed identifying the textual content on company websites that is descriptive of that company’s activities, and using that to identify a list of companies that may be relevant to the sector. This process was conducted by a consortium of Technopolis and Glass.AI who were contracted by OLS. The process is summarised via the below steps:

  1. A series of inputs are used to train multiple models. These inputs include:
    • a subset of Glass.AI’s core dataset of companies. The full core dataset contains data crawled from the websites of companies from all sectors (not just life sciences), but this was filtered down to only include companies where initial evidence suggested there was a degree of textual relevance to the life sciences sector that could be investigated further through machine reading
    • companies included in previous BaHTSS publications and their associated websites
    • online directories of relevant life sciences bodies such as ABPI and their membership lists
    • companies identified through the data partners listed in section 8
  2. Entity recognition is used to identify websites that belong to companies and businesses and then key details are extracted into a dataset.

  3. This data is then fed into a series of models. These models generate a probability score for the likelihood that each company meets the inclusion criteria for life sciences (as set out in section 2.1.1). These models cover a variety of techniques including LLMs, embedding-based semantic modelling, neural networks, and regression. Some models are trained on company descriptions, whilst others might be trained on snippets of prioritised text from websites.

  4. An ensemble meta model then produces an overall probability score for each company website, with the cut-off for inclusion decided using a precision-recall curve.

  5. A sample of companies is outputted after step 4. These were then reviewed manually for a decision on whether they met the inclusion criteria in section 2.1.1. This manual review was then used as an input for further training of the models (as in step 1) and then steps 3 and 4 were repeated. This manual review process was repeated several times to further refine and train the models.
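
As an illustration of how a precision-recall curve can inform a score cut-off (step 4), the sketch below takes ensemble scores for a manually reviewed sample and picks the lowest cut-off that meets a target precision. This is an assumption about how such a cut-off might be chosen, not a description of the contractors' actual models. The function names, the scores and the 0.75 target are all hypothetical.

```python
def precision_recall_at(threshold, scores, labels):
    """Precision and recall when companies scoring >= threshold are included."""
    predicted = [s >= threshold for s in scores]
    tp = sum(p and l for p, l in zip(predicted, labels))
    fp = sum(p and not l for p, l in zip(predicted, labels))
    fn = sum((not p) and l for p, l in zip(predicted, labels))
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def choose_threshold(scores, labels, min_precision):
    """Lowest candidate cut-off whose precision meets the target.

    Recall can only fall as the cut-off rises, so the lowest qualifying
    cut-off also gives the highest recall among the qualifying ones.
    """
    for t in sorted(set(scores)):
        precision, _ = precision_recall_at(t, scores, labels)
        if precision >= min_precision:
            return t
    return None


# Made-up ensemble scores and manual review decisions (True = in scope).
scores = [0.2, 0.4, 0.6, 0.8, 0.9]
labels = [False, True, False, True, True]
print(choose_threshold(scores, labels, min_precision=0.75))
```

Each round of manual review (step 5) would add to the labelled sample, so the cut-off could be re-derived after each retraining cycle.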

An initial dataset of companies identified using the above process was sent to OLS for review. The following steps for cleaning the data were applied:

  • the company list was linked to the Office for National Statistics’ (ONS) IDBR via company registration numbers (CRNs). The same criteria used to remove companies from the 2021/2022 BaHTSS dataset were then applied
  • any duplicated companies that were already present in the OLS core dataset from past years were removed
  • to mitigate the possibility of the web-scraping process identifying non-life sciences companies, automated data cleansing processes were carried out and a Standard Industrial Classification (SIC) based filtering approach was applied. Only companies that met either of the below criteria were retained:
    • companies with a SIC of 21100, 21200, 26600, 32500, 46460, 72110, 74909 or 72190, or
    • companies with any other SIC, provided they are linked in IDBR to a business that has at least one company with a ‘core’ SIC of 21100, 21200, 26600, 32500, 46460 or 72110

The definition of the above SIC codes can be found on the ONS UK SIC Hierarchy page. As a result, the final dataset used in the 2023/2024 BaHTSS statistics combines the companies remaining after the above criteria are applied with the existing dataset of companies from the previous section.
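
The SIC-based filter described above can be sketched as below. The SIC lists are taken from this guide. The record layout (`crn`, `sic`, `business_id`) is an assumption, and the sketch only sees the companies passed to it, whereas the real check uses the full business structure held in IDBR.

```python
# SIC codes that qualify a company directly (from section 2.1.2).
DIRECT_SICS = {"21100", "21200", "26600", "32500", "46460", "72110", "74909", "72190"}
# 'Core' SIC codes that qualify every company linked to the same business.
CORE_SICS = {"21100", "21200", "26600", "32500", "46460", "72110"}


def apply_sic_filter(companies):
    """Return the CRNs retained by the SIC filter.

    companies: list of dicts with hypothetical keys 'crn', 'sic', 'business_id'.
    """
    core_businesses = {c["business_id"] for c in companies if c["sic"] in CORE_SICS}
    return [
        c["crn"]
        for c in companies
        if c["sic"] in DIRECT_SICS or c["business_id"] in core_businesses
    ]


# Made-up candidate companies for illustration only.
candidates = [
    {"crn": "A", "sic": "21100", "business_id": "B1"},  # direct and core SIC
    {"crn": "B", "sic": "62010", "business_id": "B1"},  # kept via business B1
    {"crn": "C", "sic": "62010", "business_id": "B2"},  # dropped
    {"crn": "D", "sic": "74909", "business_id": "B3"},  # direct SIC only
]
print(apply_sic_filter(candidates))
```

Note that 74909 and 72190 qualify a company directly but are not ‘core’ codes, so company D would not bring other companies in its business into scope.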

Determining corporate structures and assigning companies to businesses

In the IDBR dataset, the corporate structures of companies and businesses can be determined through the allocation of individual companies to enterprises, and the allocation of enterprises to enterprise groups. An enterprise may consist of one or more companies. All companies in IDBR are considered part of an enterprise, but only some of these enterprises are also part of a wider enterprise group.

The corporate structures from the IDBR dataset are used to determine the link between companies and businesses in the final list of companies produced from the steps above. This is assigned as follows:

  • any companies that are the only company within their business are effectively considered to be both a company and a business
  • any companies that are associated with each other (by being linked under the same business structure) are given the same unique business identifier and are considered a single business. The below criteria are used to define which companies are linked under the same business:
    • if a company is part of a wider enterprise group, all companies within the enterprise group structure are considered part of the same business
    • if a company is not part of a wider enterprise group, all companies within the same enterprise structure are considered part of the same business

Therefore, the term ‘business’ can refer to either an individual company (where that company is not part of a wider corporate structure), an enterprise, or an enterprise group.
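
The assignment rule above amounts to using the enterprise group as the business where one exists, and the enterprise otherwise. A minimal sketch is shown below. The field names `enterprise` and `enterprise_group` are illustrative and do not reflect the actual IDBR variable names.

```python
def business_id(company):
    """Enterprise group reference if one exists, else the enterprise reference."""
    group = company.get("enterprise_group")
    if group is not None:
        return ("group", group)
    return ("enterprise", company["enterprise"])


def group_into_businesses(companies):
    """Map each business identifier to the list of CRNs that belong to it."""
    businesses = {}
    for c in companies:
        businesses.setdefault(business_id(c), []).append(c["crn"])
    return businesses


# Made-up records for illustration only.
records = [
    {"crn": "C1", "enterprise": "E1", "enterprise_group": "G1"},
    {"crn": "C2", "enterprise": "E2", "enterprise_group": "G1"},
    {"crn": "C3", "enterprise": "E3", "enterprise_group": None},
    {"crn": "C4", "enterprise": "E3", "enterprise_group": None},
    {"crn": "C5", "enterprise": "E4", "enterprise_group": None},
]
print(group_into_businesses(records))
```

Here C1 and C2 form one business through their shared enterprise group, C3 and C4 form another through their shared enterprise, and C5 is both a company and a business.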

2.2 Subsectors

2.2.1 Definition

In this publication, the UK life sciences industry consists of businesses operating in the following 2 subsectors (focusing on human health):

  • biopharmaceuticals: this subsector includes:
    • core companies that develop and/or produce their own pharmaceutical products. This includes companies working on small molecules, vaccines and advanced therapy medicinal products (ATMPs)
    • service and supply companies that offer goods and services to core biopharmaceutical companies including, for example, contract research and manufacturing organisations (CRMOs), and suppliers of consumables and reagents for research and development (R&D) facilities
  • medical technology: this subsector includes:
    • core companies that develop and produce medical technology products, ranging from single-use consumables to complex hospital equipment, including digital health products
    • service and supply companies that offer services to core medical technology companies including, for example, CRMOs, and suppliers of consumables and reagents for R&D facilities

2.2.2 Data collection and process

Each company included in the dataset is assigned to either the biopharmaceutical or medical technology subsector, based on the following process:

  1. Textual information on open web sources (primarily company websites) and existing subsector classifications from past BaHTSS datasets were used as inputs to train a series of models
  2. These models used techniques such as embedding-based semantic modelling and neural networks to assign a subsector classification
  3. A sample of companies that had been assigned subsector classifications were outputted after step 2. These were then reviewed manually to check the accuracy of the classifications and to provide corrections where appropriate. The results of this manual review were then used to further train the classification models (as in step 1) and then steps 2 and 3 were repeated. This manual review process was repeated several times to further refine and train the models

In some cases, a company may conduct both pharmaceutical and medical technology activities. When this occurs, a subsector is designated by language models and associated scoring, driven by weight of evidence. With this in place, a determination is made, and the company is assigned its primary activity on this basis.

Some companies may also carry out non-life sciences activity in addition to their life sciences activity. In these situations, all employment and turnover for the company has been included in the dataset. Future publications will consider whether life sciences activity can be further disaggregated from non-life sciences activity within companies; more details on this can be found in section 4.

2.3 Manufacturing and research and development (R&D) activity

2.3.1 Definition

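The 2023/2024 BaHTSS publication includes information on companies which have reported either manufacturing or R&D as their primary business activity. In these statistics, manufacturing and R&D companies are identified according to their SIC and are defined as follows:

  • manufacturing: companies whose primary SIC code falls within section C (manufacturing)
  • R&D: companies whose primary SIC code falls within division 72 (scientific research and development)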

Companies are required to select one primary SIC code that best describes their main activity. Therefore, in this publication, companies which primarily manufacture are mutually exclusive to companies which have R&D as their primary activity. Other companies within the BaHTSS dataset that do not have a primary SIC code within section C or division 72 are not tagged as having manufacturing or R&D activity.
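
A minimal sketch of this tagging rule is shown below. It assumes 5-digit UK SIC 2007 codes held as strings, and relies on section C (manufacturing) covering divisions 10 to 33 in that classification. The function name and the tag labels are illustrative.

```python
def activity_tag(primary_sic):
    """Tag a company from its primary 5-digit SIC code.

    The first two digits of a SIC code give its division. Section C
    (manufacturing) spans divisions 10 to 33; division 72 is scientific
    research and development.
    """
    division = int(primary_sic[:2])
    if 10 <= division <= 33:
        return "manufacturing"
    if division == 72:
        return "R&D"
    return None  # not tagged as manufacturing or R&D


for sic in ["21100", "72110", "46460"]:
    print(sic, activity_tag(sic))
```

Because each company has a single primary SIC code, a company can receive at most one of the two tags.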

2.3.2 Data collection and process

SIC information is primarily sourced from Moody’s FAME dataset. This collates information on SICs reported by companies to Companies House. This is joined to the BaHTSS dataset via CRN. When company SIC codes are not available in Moody’s FAME dataset, IDBR SICs are used instead. The SIC allocation in IDBR is done at an enterprise level (multiple CRNs can belong to one enterprise) and therefore generalises multiple companies in a single SIC. The IDBR SICs are only used in a small, non-impactful number of cases.

2.4 Small and medium-sized enterprises (SMEs)

2.4.1 Definition

The 2023/2024 statistics report on the number of businesses which are classed as ‘small and medium-sized enterprises (SMEs)’. This is defined according to the European Union standard definition of small and medium-sized enterprises.

2.4.2 Data collection and process

SME status is sourced from Moody’s FAME dataset and linked to the dataset at a company level through CRNs. For any CRN in the BaHTSS dataset that does not return a match in FAME, the SME status is set to ‘unknown’.
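
The matching rule above can be illustrated with a simple lookup that falls back to ‘unknown’. The status labels and the shape of the FAME extract (a mapping from CRN to status) are assumptions made for the example.

```python
def sme_status(crns, fame_sme_by_crn):
    """Look up SME status by CRN, defaulting to 'unknown' where FAME has no match."""
    return {crn: fame_sme_by_crn.get(crn, "unknown") for crn in crns}


# Made-up FAME extract; labels are illustrative.
fame = {"00000001": "SME", "00000002": "non-SME"}
print(sme_status(["00000001", "00000002", "00000003"], fame))
```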

2.5 Company ownership

2.5.1 Definition

The 2023/2024 statistics report on whether companies are owned by a UK or overseas based business. This is determined using the ‘Global Ultimate Owner’ allocation from Moody’s FAME database. In FAME, the Global Ultimate Owner is identified by searching for the shareholder with the highest direct or total percentage of ownership for each company. If that shareholder is independent, it is defined as the Ultimate Owner for the subject company. If the shareholder is not independent, the process is repeated until an Ultimate Owner is found.
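
The repeated search described above can be sketched as a walk up a chain of largest shareholders. This is only an illustration of the general idea. FAME's actual independence test and ownership thresholds are not described in this guide, so both inputs here (a mapping to each entity's largest shareholder, and a set of entities treated as independent) are assumptions.

```python
def global_ultimate_owner(company, largest_shareholder, independent):
    """Follow the largest shareholder upwards until an independent one is found.

    largest_shareholder: dict mapping an entity to its largest shareholder.
    independent: set of entities treated as independent.
    Returns None if the chain ends or loops before an independent owner is reached.
    """
    seen = {company}
    current = company
    while True:
        owner = largest_shareholder.get(current)
        if owner is None or owner in seen:
            return None
        if owner in independent:
            return owner
        seen.add(owner)
        current = owner


# Made-up ownership chain for illustration only.
shareholders = {"UK Sub Ltd": "Holding BV", "Holding BV": "Parent Inc"}
print(global_ultimate_owner("UK Sub Ltd", shareholders, {"Parent Inc"}))
```

The country of the entity returned would then determine whether the company is reported as UK or overseas owned.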

2.5.2 Data collection and process

Company ownership status is sourced from Moody’s FAME dataset and linked to the dataset at a company level through CRNs. For any CRN in the BaHTSS dataset that does not return a match in FAME, the company ownership status is set to ‘unknown’.

2.6 Employment and turnover

2.6.1 Definition

The BaHTSS publication reports on employment and turnover for the life sciences sector at a national level and at further granularities. The source used for employment and turnover in the 2023/2024 statistics is ONS’s IDBR. Employment and turnover are defined as follows:

  • employment: people working for a business, including owners and partners
  • turnover: the income received by a business from the ‘sales of goods and or services charged to third parties’. This excludes value-added tax (VAT)

2.6.2 Data collection and process

The constructed dataset from the previous steps uses employment and turnover information obtained through data matching with IDBR by CRN. IDBR turnover and employment data is available at an ‘enterprise’ level. An enterprise can consist of more than one company. To assign employment and turnover at a company level for each CRN, it is assumed that companies have a uniform share of the enterprise’s employment and turnover, and so the enterprise-level figures are divided equally by the number of companies in the enterprise.
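
The equal-share assumption can be illustrated as follows. The data structures and figures are made up for the example. The company-to-enterprise mapping is assumed to list every company in the enterprise, not only those in scope, so that the divisor is the full company count.

```python
from collections import Counter


def apportion(enterprises, company_to_enterprise):
    """Split each enterprise's employment and turnover equally across its companies."""
    n_companies = Counter(company_to_enterprise.values())
    shares = {}
    for crn, ent in company_to_enterprise.items():
        totals = enterprises[ent]
        n = n_companies[ent]
        shares[crn] = {
            "employment": totals["employment"] / n,
            "turnover": totals["turnover"] / n,
        }
    return shares


# Made-up enterprise with 3 companies, of which only C1 and C2 are in scope.
enterprises = {"E1": {"employment": 300, "turnover": 90.0}}
mapping = {"C1": "E1", "C2": "E1", "C3": "E1"}
shares = apportion(enterprises, mapping)

in_scope = ["C1", "C2"]
print(sum(shares[c]["employment"] for c in in_scope))
print(sum(shares[c]["turnover"] for c in in_scope))
```

In this example each company is allocated 100 employees and 30.0 of turnover, so a business with two of the three companies in scope would be reported with 200 employees and 60.0 of turnover.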

Turnover values collected for past years (2021/2022 and earlier) have been put in 2023/2024 prices, using GDP deflators to account for inflation across the years.
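
Converting to a base year's prices with a deflator index follows the standard calculation: the nominal value is multiplied by the ratio of the base-year deflator to the deflator for the year in question. The sketch below uses made-up index values, not published GDP deflators.

```python
def to_base_year_prices(nominal, deflator_year, deflator_base):
    """Express a nominal value in base-year prices using a deflator index."""
    return nominal * deflator_base / deflator_year


# Made-up index values for illustration only (base year 2023/2024 = 100).
deflators = {"2021/2022": 90.0, "2023/2024": 100.0}
turnover_2021_22 = 45.0  # nominal turnover, arbitrary units

print(to_base_year_prices(turnover_2021_22, deflators["2021/2022"], deflators["2023/2024"]))
```

With these illustrative values, turnover of 45.0 in 2021/2022 prices becomes 50.0 in 2023/2024 prices.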

2.7 Time period

Company employment and turnover figures within IDBR are mostly updated on an annual basis and are sourced from a range of datasets. Due to the variety of sources used, employment and turnover figures are not updated at a consistent point in the annual cycle. For this publication, a snapshot of IDBR data was taken that most closely relates to the period 2023/2024. Therefore, some figures included in this publication may relate to a different year.

Past BaHTSS reports have published estimates of the sector for previous financial years. Section 3 contains more details on the comparability of past figures to the statistics relating to 2023/2024 and the differences in how these were constructed.

2.8 Geographical areas

The BaHTSS report presents geographical estimates at the following levels:

  • International Territorial Level 1 (ITL 1) regions
  • local authority districts (LADs)

The postcode for each site within the BaHTSS data is matched to an ITL 1 region and LAD using the ONS’s National Statistics Postcode Lookup (NSPL). This allows consistency throughout all layers of geography used.

The BaHTSS report uses the most appropriate NSPL available from ONS at the time of publication. For the 2023 to 2024 report, the LAD boundaries as of May 2025 were used, which means boundary changes that happened after this date are not reflected in this report.
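
The postcode matching step can be sketched as a lookup on a normalised postcode. The lookup structure and field names below are assumptions for illustration and do not reflect the actual column names in the NSPL.

```python
def normalise_postcode(postcode):
    """Remove all whitespace and upper-case, so 'sw1a 2aa' matches 'SW1A2AA'."""
    return "".join(postcode.split()).upper()


def assign_geography(postcode, lookup):
    """Return the LAD and ITL 1 region for a postcode, or None if unmatched."""
    return lookup.get(normalise_postcode(postcode))


# Single-entry illustrative lookup keyed by normalised postcode.
lookup = {"SW1A2AA": {"lad": "Westminster", "itl1": "London"}}

print(assign_geography("sw1a 2aa", lookup))
print(assign_geography("ZZ99 9ZZ", lookup))
```

Taking both geographies from the same lookup row is what keeps the LAD and ITL 1 breakdowns consistent with each other.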

The 2023/2024 report uses companies’ registered addresses from Companies House to determine which geographical area they belong to. Previous publications used the physical location of the site the company operates from. This has been changed because it has not been possible to include sites as the lowest level of reporting for this year’s statistics. More information on the methodological changes is available in section 3. Further work is to be done by OLS to reintroduce site level reporting. More details on the development work around this is outlined in section 4.

2.9 Rounding and disclosure control

Outputs for the publication are assessed to ensure that employment and turnover values for individual businesses are not disclosed. When a figure is derived from 5 or fewer businesses or companies, these figures are suppressed. Totals are also rounded where appropriate to prevent suppressed values being revealed through cross-tabulation between different data tables.
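
The primary suppression rule (figures derived from 5 or fewer businesses or companies are suppressed) can be sketched as below. The table layout is an assumption, and the sketch does not cover the rounding of totals used to protect suppressed values across tables.

```python
SUPPRESSION_THRESHOLD = 5  # figures based on this many units or fewer are suppressed


def suppress_small_cells(table):
    """Replace values derived from too few businesses with None.

    table: dict mapping a cell label to (number of businesses, value).
    """
    return {
        label: (value if count > SUPPRESSION_THRESHOLD else None)
        for label, (count, value) in table.items()
    }


# Made-up table cells for illustration only.
cells = {"Area A": (12, 3400), "Area B": (5, 210), "Area C": (6, 150)}
print(suppress_small_cells(cells))
```

In this example Area B is suppressed because it is based on exactly 5 businesses, while Area C, based on 6, is retained.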

The BaHTSS reports apply rounding to quoted numbers to ensure coherence and that the main results of the findings can be interpreted easily.

3. Impact of 2023/2024 methodology changes and comparability to past estimates

3.1 Comparability to past BaHTSS reports

The methodological changes implemented for the 2023/2024 data have had an impact on the timeseries of key metrics. Please note that:

  • this report covers data up to 2023/2024 and is not directly comparable to past estimates of the life sciences sector relating to 2021/2022 and before. Data for 2022/2023 is not available due to the time taken to implement the new methodology
  • estimates of the sector between 2008/2009 and 2021/2022 are still referenced in this report where the definition of the metric remains unchanged. This is to allow users to view data on estimates for past years, but it should be noted these figures are not directly comparable and the scale of change in the sector cannot be inferred between 2023/2024 and earlier time periods

3.2 Methodology changes in 2023/2024

The methodology used to compile the 2023/2024 statistics, as outlined in section 2, has changed substantially compared to the methods used for past BaHTSS reports to produce estimates of the sector between 2008/2009 and 2021/2022. The key changes in methodology and reporting are listed below.

3.2.1 Businesses, companies and sites

Previous BaHTSS publications reported on sites as the lowest sub-unit of businesses in the publication. Sites are defined as the physical locations that companies operate from, with each company having one or more sites. Site-level data is not available for the 2023/2024 statistics and the lowest unit used is companies (as defined in section 2.1.1).

The 2023/2024 statistics use the registered company address as the basis for geographical statistics. This means that employment and turnover are allocated to the location where the company is registered, rather than the physical sites that companies operate from.

The 2023/2024 statistics allocate companies to a business structure based on the enterprise group structure used in IDBR (as outlined in section 2.1.1). Previous statistics allocated companies to a business structure based on Moody’s FAME database.

This change in the 2023/2024 statistics has resulted in the following impact on comparability to past years:

  • businesses: the source used to map companies to their parent businesses has changed, so figures for businesses in the 2023/2024 report are not directly comparable to past estimates of businesses from 2021/2022 and before. Whilst figures aren’t directly comparable, past estimates of business counts for 2008/2009 to 2021/2022 are available in the accompanying report and data tables to allow users to reference estimates from previous years. The scale of change in the business estimates between 2023/2024 and earlier time periods cannot be determined
  • companies: each company may operate out of multiple sites, but site-level data is not available for the 2023/2024 publication. Site-level data was available for past estimates of the sector for 2021/2022 and earlier, but the lowest unit (in terms of corporate structures) used in this publication was companies. The count of companies in this year’s publication is not comparable to the site counts from previous BaHTSS reports
  • geographical breakdowns: as breakdowns by LAD and ITL 1 region for 2023/2024 are based on the registered addresses of companies, no comparisons can be made between the 2023/2024 geographical data and the estimates from past publications (which are based on site addresses instead of registered addresses). Geographical analysis based on registered address is published for the first time in the 2023/2024 statistics

OLS are considering how to re-establish the reporting on sites for future publications in accordance with the new methodology. More details on this development work are outlined in section 4.

3.2.2 Subsectors

The 2023/2024 publication allocates companies to either a biopharmaceutical or medical technology subsector classification. In the 2023/2024 dataset a new automated approach (as outlined in section 2.2) assigned these classifications to new companies found through web scraping and to existing companies from past versions of the BaHTSS datasets. This means that subsector classifications used for past publications may have been revised and updated for some companies in the dataset.

For past publications relating to the period between 2008/2009 and 2021/2022, sites were manually assigned a segment as well as a subsector based on our previous contractor’s expertise and desk research on company activity.

For the 2023/2024 dataset, the only subsector classification that is reported on in this publication is the binary split between biopharmaceutical and medical technology companies. These subsectors are no longer broken down further into the ‘core’ and ‘service and supply’ subsectors, or the more granular segments that were reported on in previous publications. The former segmentation scheme is outlined in section 11 of the 2021/2022 background quality and user guide.

The new approach to assigning subsectors has not had a notable impact on the share of activity belonging to each subsector. The shares of employment and turnover belonging to the medical technology and biopharmaceutical subsectors remained broadly consistent between the estimates in the 2021/2022 report and those in the 2023/2024 report.

However, due to the change in approach for assigning subsectors, the estimates for businesses, employment and turnover by subsector in the 2023/2024 report are not directly comparable to past estimates. Whilst figures aren’t directly comparable, past estimates of breakdowns by subsector for 2008/2009 to 2021/2022 are available in the accompanying report and data tables to allow users to reference estimates from past years. The scale of change in the subsector estimates between 2023/2024 and earlier time periods cannot be determined.

These more granular classifications are not available for the 2023/2024 report due to further work needing to be undertaken on how to accurately classify company activity. OLS are exploring how best to reintroduce this to future publications and to review the previously used segments. More details on this development work are outlined in section 4.

3.2.3 Manufacturing and R&D activity

The 2023/2024 BaHTSS publication includes information on companies whose primary activity is either manufacturing or R&D (as determined by their SIC code). For previous BaHTSS publications relating to 2021/2022 and before, different metrics and definitions were used to identify companies manufacturing and conducting R&D.

Under the previous methodology, companies were classified as conducting manufacturing or R&D activity if there was any evidence of that company engaging in manufacturing or R&D, regardless of whether that was its primary activity. These classifications were determined manually for each company based on sectoral knowledge, and the manufacturing and R&D activity categories were not mutually exclusive.

As the reporting metrics for manufacturing and R&D are substantially different in 2023/2024 compared to past years, past data is not referenced in this publication and a comparison between past estimates and estimates for 2023/2024 cannot be made.

3.2.4 SMEs

The 2023/2024 statistics report on whether companies are SMEs or not. This uses the European Union standard definition of small and medium-sized enterprises and is sourced from Moody’s FAME database at a company level. The previous BaHTSS publications used the same definition, but the information was sourced from multiple different sources and SME status was allocated at a site level. In the 2023/2024 data SME status was determined at a company level and therefore the 2023/2024 statistics on SMEs are not comparable to past years. The 2023/2024 statistics report on SME status at a company level for the first time.

Unlike in previous BaHTSS publications, the 2023/2024 statistics do not provide an additional breakdown on SMEs and whether they are micro, small or medium companies. This data was not available for the 2023/2024 statistics, but OLS will consider how this can be reintroduced for future publications in the BaHTSS series.

3.2.5 Employment and turnover

The source of employment and turnover has been changed for the 2023/2024 BaHTSS report compared to past publications. The employment and turnover source for the 2023/2024 data is ONS’s IDBR dataset. Previous BaHTSS publications for 2021/2022 and before used Moody’s FAME database for these metrics instead. The source has been changed to improve the quality and reduce the manual adjustments needed from FAME data. More details on the quality changes can be found in section 4.

Given the change in source, estimates for turnover and employment for 2023/2024 are not directly comparable to estimates for previous years. Whilst figures aren’t directly comparable, past estimates of turnover and employment for 2008/2009 to 2021/2022 are available in the accompanying report and data tables to allow users to reference estimates from previous years. The scale of change in the employment and turnover estimates between 2023/2024 and earlier time periods cannot be determined.

4. Compliance with the Code of Practice for Statistics

4.1 Designation

The BaHTSS publication series was previously labelled as official statistics. The 2023/2024 report has been designated as ‘official statistics in development’.

This is defined by the UK Statistics Authority as a ‘subset of official statistics that are undergoing a development; they may be new or existing statistics, and will be tested with users, in line with the standards of trustworthiness, quality, and value in the Code of Practice for Statistics.’

This change is being implemented whilst OLS:

  • carry out further quality improvements to the new methodology
  • consider users’ feedback on the new approach
  • explore ways to reintroduce some breakdowns of the life sciences sector which have been temporarily removed due to the change in methodology

4.2 Quality

This section outlines where the BaHTSS 2023/2024 complies with the quality pillar requirements of the UK Statistics Authority’s (UKSA) Code of Practice for Statistics and highlights where there are planned future improvements.

4.2.1 Identifying life sciences companies

The estimates in this year’s BaHTSS publication are based on the 2023/2024 dataset (the methodology for constructing this dataset is outlined in section 2). Whilst the dataset holds information on an extensive number of life sciences companies, the dataset does not contain the full population of life sciences companies operating in the UK and therefore any figures in this report should be treated as an estimate. Whilst the BaHTSS series provides the OLS’s best estimate of the size and composition of the sector, limited information is known about its coverage and what the true population of the life sciences sector is in the UK.

Whilst limited information is known about the proportion of the life science sector population that is not included in the BaHTSS dataset, extensive quality assurance is undertaken to ensure records in the dataset meet the inclusion criteria set out in section 2.1. These steps include:

  • removal of non-active companies using IDBR
  • validation, using IDBR, that records found are legitimate companies with a UK presence
  • manual review of the companies with the highest impact on national figures. This review checks whether they meet the BaHTSS definition criteria set out in section 2.1
  • checking for implausible values and missing data
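
The last of these checks can be illustrated with a minimal Python sketch. It is not the OLS implementation: the record fields, the flag wording and the turnover-per-employee threshold are all hypothetical and chosen only to show the general shape of a missing-data and plausibility check.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CompanyRecord:
    # Hypothetical record layout; the real dataset fields are not described here.
    name: str
    employment: Optional[int]
    turnover_gbp: Optional[float]


# Hypothetical plausibility threshold, for illustration only.
MAX_TURNOVER_PER_EMPLOYEE_GBP = 5_000_000


def quality_flags(record: CompanyRecord) -> List[str]:
    """Return a list of issues that would prompt a manual look at the record."""
    flags = []
    if record.employment is None:
        flags.append("missing employment")
    elif record.employment < 0:
        flags.append("negative employment")
    if record.turnover_gbp is None:
        flags.append("missing turnover")
    elif record.turnover_gbp < 0:
        flags.append("negative turnover")
    # The ratio check only applies when both values are present and employment is positive.
    if (
        record.employment is not None
        and record.turnover_gbp is not None
        and record.employment > 0
        and record.turnover_gbp / record.employment > MAX_TURNOVER_PER_EMPLOYEE_GBP
    ):
        flags.append("turnover per employee above threshold")
    return flags
```

A record with no flags would pass this step unchanged. A flagged record would be referred for review rather than removed automatically.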

In 2023/2024 a web scraping approach was introduced for the first time to identify new companies in the sector, as outlined in section 2.1.2. This approach:

  • identified a wider range of companies compared to past manual approaches
  • associated each company with a website so further research can be conducted on individual companies
  • allowed automated techniques to be refined through small-scale sample reviews

An initial manual validation was undertaken on a subset of records found through the new web scraping approach. These checks revealed that there was a notable number of ‘false positives’ (companies that were identified as belonging to the life sciences sector by the models but would not meet the publication inclusion criteria set out in section 2.1.1).

To separate life sciences companies from non-life sciences companies, the SIC approach outlined in section 2.1.2 was used. After this filtering was implemented, a further manual review was conducted, and an acceptable level of accuracy was found. This manual review covered the companies that accounted for the largest share of employment and turnover.

Any companies found to not be in-scope through the manual review were removed from the dataset. One of the limitations of this SIC approach is that some life sciences companies found through web crawling use SIC codes that are not included in the list of SICs outlined in section 2.1. This means that some life sciences companies using SICs that are less obviously relevant to the sector will be excluded from the dataset.
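
The SIC-based filter and its limitation can be sketched in Python as follows. This is an illustration under stated assumptions, not the production process: the SIC codes shown are examples from UK SIC 2007 and are not the list used in the publication, the company names are invented, and each company is assumed to carry a single SIC code.

```python
# Example UK SIC 2007 codes only. The list used for the statistics is the one
# set out in section 2.1.2 and may differ from this one.
EXAMPLE_IN_SCOPE_SIC_CODES = {
    "21100",  # Manufacture of basic pharmaceutical products
    "21200",  # Manufacture of pharmaceutical preparations
    "26600",  # Manufacture of irradiation, electromedical and electrotherapeutic equipment
    "32500",  # Manufacture of medical and dental instruments and supplies
    "72110",  # Research and experimental development on biotechnology
}


def split_by_sic(candidates, in_scope=EXAMPLE_IN_SCOPE_SIC_CODES):
    """Split web-crawled candidates into those kept and those dropped by SIC code."""
    kept, dropped = [], []
    for company in candidates:
        if company["sic_code"] in in_scope:
            kept.append(company)
        else:
            dropped.append(company)
    return kept, dropped


# Invented candidate records, for illustration only.
candidates = [
    {"name": "Example Pharma Ltd", "sic_code": "21200"},
    {"name": "Example Devices Ltd", "sic_code": "32500"},
    {"name": "Example Software Ltd", "sic_code": "62012"},
]

kept, dropped = split_by_sic(candidates)
```

In this sketch the third company is dropped because its code (62012, business and domestic software development) is not on the example list. If that company were in fact a life sciences business, it would be excluded in exactly the way described above.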

A subset of companies in the core dataset from past years were also manually reviewed to verify they still met the scope criteria set out in section 2.1. This review found that an acceptable proportion of companies still met this definition.

4.2.2 Employment and turnover estimates

Employment and turnover values are taken from ONS’s IDBR dataset for the 2023/2024 report. IDBR covers all businesses registered for VAT and PAYE.

To be consistent with ONS’s view of the UK business environment and other government analysis that uses IDBR, companies that aren’t included in IDBR also aren’t included in the BaHTSS publication.

Whilst IDBR contains a degree of imputation and estimation for some companies’ turnover and employment, the move to using IDBR is an improvement in quality compared to sources used in previous years. The advantages to using IDBR include:

  • a higher quality source for turnover and employment figures. ONS take steps to disaggregate UK activity from international activity based on surveying companies in the register. This previously had to be estimated and manually adjusted from past sources
  • improved methods for linking companies to their overarching business through the use of corporate structures within ONS’s IDBR data. This provides consistency with other statistics produced across government

4.2.3 Subsectors

Companies are assigned either a biopharmaceutical or medical technology subsector classification as outlined in section 2.2. The new methodology using web-scraping (as covered in section 2.2) was tested by assessing the share of companies in each subsector compared to data reported on in past publications. Quality assurance was also carried out through manual reviews (the results of which were fed back into the models as training data).

4.2.4 Future developments in quality

Whilst the methodology changes in 2023/2024 have resulted in some improvements in terms of quality, steps will be taken by OLS to further improve the quality of the estimates used in the BaHTSS publication for future releases. OLS will focus on developing the following areas with the aim of further compliance with the Code of Practice and the aim of returning to an official statistics designation:

  • further refinement of the web crawling-based approach and improving the accuracy for identifying life sciences companies. This year’s approach resulted in a notable number of companies identified through web crawling that did not meet the inclusion criteria set out in section 2.1.1. This was mitigated through the use of filtering by SIC codes (as outlined in section 2.1.2) and the validation of this approach was done via a manual review. Further work will be done to see how the use of manual methods can be further reduced where appropriate, and how the technical process could be improved
  • considering whether coverage of life sciences companies in the dataset could be increased by identifying omitted life sciences companies within the same IDBR corporate structures as in-scope companies
  • considering how companies identified through web crawling can be more accurately assessed to determine if they meet the life sciences definition criteria set out in section 2.1.1. The 2023/2024 statistics used SIC codes to do this, but OLS will consider if there are ways to capture a wider range of companies using SICs not listed in section 2.1.1.

4.3 Value

This section outlines where the BaHTSS 2023/2024 complies with the value pillar requirements of the UK Statistics Authority’s (UKSA) Code of Practice for Statistics and highlights where there are planned future improvements.

4.3.1 Value of BaHTSS 2023/2024 statistics

These statistics are produced due to the need for a comprehensive view of the life sciences industry in the UK that is not provided by other published data sources. SIC codes are used to classify businesses by industry in other official statistics. This classification system has categories for businesses whose primary activity is the manufacture of pharmaceuticals, manufacture of types of medical equipment, and those whose primary activity is biotechnology R&D.

However, the categories within the SIC system do not align with the full breadth of the life sciences industry, meaning that it is not possible to achieve complete data coverage of the life sciences industry using SIC codes. SIC codes are used in the BaHTSS report to support the removal of non-life sciences companies identified through web-crawling and to identify companies with primary activities of manufacturing and R&D. Despite this, the BaHTSS publication still reflects a bespoke definition of the sector that is more comprehensive than any definition that could be achieved with SIC codes alone.

The bespoke methods used in this publication to identify and classify UK life sciences businesses therefore mean these statistics fulfil a previously unmet need in providing a more complete view of the life sciences sector.

4.3.2 Future developments in value

The methodology changes introduced for the 2023/2024 statistics have resulted in certain breakdowns no longer being available. OLS will focus on developing the following areas with the aim of further compliance with the Code of Practice and the aim of returning to an official statistics designation:

  • re-introduce more granular level reporting that is not available in the 2023/2024 report. This includes site level reporting and more granular subsector analysis beyond the biopharmaceutical and medical technology subsectors. OLS will:
    • consider using further information on local units available in ONS’s IDBR dataset
    • explore the use of new data sources that can be used to identify companies’ sub-activities
  • complete further work to introduce more metrics into the BaHTSS series that are known needs of users. OLS will consider what sources are available that could allow reporting on additional variables
  • consider whether other existing sources can be used to bring in subclassifications that are more harmonised and consistent with other government sources

4.4 Trustworthiness

This section outlines where the BaHTSS 2023/2024 complies with the trustworthiness pillar requirements of the UK Statistics Authority’s (UKSA) Code of Practice for Statistics and confirms there are no required future improvements.

4.4.1 Trustworthiness of BaHTSS 2023/2024 statistics

These statistics are produced and published with the aim of ensuring that an accurate and objective view of the life sciences sector in the UK can be conveyed to the public. All material is produced in line with principles in the Code of Practice for Statistics on impartiality.

Decisions on how these statistics are compiled and presented are made by analysts in OLS in line with guidance from the Heads of Profession for Statistics in the Department for Science, Innovation and Technology (DSIT), the Department of Health and Social Care (DHSC) and the Department for Business and Trade (DBT).

Access to the BaHTSS publication ahead of release is restricted to those involved in the production of these statistics. The circulation of the statistics 24 hours ahead of their release is restricted to the minimum necessary number of eligible recipients. The job titles of these individuals are outlined in the pre-release access list, which can be found on the accompanying 2023/2024 statistics landing page. 

The processes and data used to compile the statistics, along with an assessment of the quality and limitations of these processes and data, are outlined in this document.

New companies and their subsector classifications were identified by a contractor on behalf of OLS. All further dataset construction, quality assurance and statistics production was conducted by OLS professional analysts in line with the requirements set out in the Code of Practice for Statistics.

All professionals involved in the creation, publication and storage of this dataset are well-versed in data protection and operate in compliance with data protection legislation. In these published statistics, OLS comply with the strict disclosure control rules required by ONS due to the use of their IDBR dataset.

4.4.2 Future developments in trustworthiness

The 2023/2024 BaHTSS statistics comply with the trustworthiness pillar requirements set out in the Code of Practice for Statistics, and processes relating to governance have not changed from past publications.

5. Revision policy

It has not been possible to backdate or revise previous estimates of the life sciences sector (relating to 2021/2022 and earlier) using the new methodology. The 2023/2024 data collection relies on identifying a list of companies that are in-scope at a single point in time based on descriptive text on their websites, and it would not be possible to replicate this process for earlier time periods. This additionally means estimates for 2023/2024 are not directly comparable to past estimates.

The new methodology will be continued for future reports with additional developments in quality and coverage as outlined in section 4. This means future reports will reintroduce a time series and any figures from 2023/2024 onwards will be backdated with any developments, where possible.

Turnover values relating to 2021/2022 and earlier have been adjusted into 2023/2024 prices using GDP deflators.
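
As a rough illustration of this kind of adjustment, the Python sketch below rescales a nominal value by the ratio of two deflator index values. The function name and the index and turnover figures are hypothetical and are not taken from the published GDP deflator series or from the BaHTSS data tables.

```python
def to_target_year_prices(nominal_value, deflator_source_year, deflator_target_year):
    """Express a nominal value in target-year prices using deflator index values."""
    if deflator_source_year <= 0:
        raise ValueError("deflator for the source year must be positive")
    return nominal_value * deflator_target_year / deflator_source_year


# Hypothetical index values, for illustration only (target year set to 100).
example_deflators = {"2021/2022": 90.0, "2023/2024": 100.0}

# Hypothetical nominal turnover for 2021/2022, in £ million.
nominal_turnover = 180.0

turnover_in_2023_24_prices = to_target_year_prices(
    nominal_turnover,
    example_deflators["2021/2022"],
    example_deflators["2023/2024"],
)
```

With these made-up figures, £180 million in 2021/2022 prices would be expressed as £200 million in 2023/2024 prices, because the index is assumed to have risen from 90 to 100.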

Any revisions in future will be in line with the DSIT’s statistics revision policy and highlighted clearly in the reports.

6. Timeliness and punctuality

This publication is usually released on an annual basis, covering activity relating up to and including the end of the latest available financial year. A publication for data relating to 2022/2023 was not produced due to the extensive changes in methodology that have been applied. Data for 2023/2024 was first published in October 2025. OLS will consider the most appropriate publication schedule going forward to enable the statistics to enter the public domain in a timely manner.

7. Cost and burden

The BaHTSS data collection process for publications relating to 2021/2022 and earlier was heavily reliant on manual input, knowledge and processing. The move to the new approaches highlighted in section 2 aims to reduce the number of manual steps required. This has already reduced manual effort, including through:

  • the use of SICs to identify manufacturing and R&D activity. This was previously assessed manually on a company-by-company basis (as described in section 2.3)
  • identification of a larger net of companies for inclusion through automated means. This was then filtered down through small-scale sample reviews and SIC-based filtering (as described in section 2.1)

This process will continue to be developed as part of the aim to improve quality and coverage, and OLS will continue to explore ways of making the statistics production process more efficient.

The data partners listed in section 8 continued to provide data on companies entering and leaving the industry for the 2023/2024 collection. Data providers keep a record of these changes as part of their usual business cycle and provide a summary as part of the BaHTSS data collection each year. OLS will review annually whether this contribution can be reduced or alleviated through introduction of more automation in the collection processes.

8. Data partner acknowledgement

OLS gratefully acknowledge the contribution of the following regional and national organisations, which acted as data partners for the BaHTSS publication:

  • Association of British Healthcare Industries (ABHI)
  • Association of the British Pharmaceutical Industry (ABPI)
  • AXREM
  • BioIndustry Association (BIA)
  • BioNow
  • BioPartner UK
  • Biosciences Knowledge Transfer Network (KTN)
  • British Healthcare Trade Association (BHTA)
  • British In Vitro Diagnostics Association (BIVDA)
  • HealthTech and Medicines Knowledge Transfer Network (KTN)
  • Innovate UK
  • Invest Northern Ireland
  • MedCity
  • Medicines Discovery Catapult
  • Medilink East Midlands
  • Medilink North of England
  • Medilink South West
  • Medilink West Midlands
  • MediWales
  • Medicines and Healthcare products Regulatory Agency (MHRA)
  • OBN
  • One Nucleus
  • Scottish Enterprise
  • South East Health Technologies Alliance (SEHTA)
  • TechUK
  • Welsh Government