MHRA guidance on the use of real-world data in clinical studies to support regulatory decisions

Published 16 December 2021

1. Scope

This document provides an introduction to the MHRA’s real-world data (RWD) guideline series, and points to consider when evaluating whether a RWD source is of sufficient quality for the intended use.

Sponsors interested in the use of RWD in their development programmes are encouraged to engage with the MHRA for further advice on specific proposals.

2. Introduction to MHRA RWD guidelines

There are vast amounts of data being collected on patients, for example, in electronic health records (EHR), and disease and patient registries. Such data are commonly called RWD, reflecting that they are collected while patients go about their regular lives, as opposed to being specifically collected in a clinical study. When such data are analysed, the information produced may be referred to as real-world evidence (RWE).

For the purposes of this series of guidelines, RWD are defined as data relating to patient health status or delivery of health care collected outside of a clinical study. Sources of RWD include electronic health records (EHR), defined as structured, digital collections of patient-level medical data; primary and secondary care records; disease registries; and administrative data on births and deaths. Other sources of RWD include patient-reported outcomes (PRO) data and data collected through wearable devices, specialised/secure websites, or tablets.

The MHRA is producing a series of guidelines to provide general points to consider for sponsors planning to conduct clinical research using RWD to support regulatory decision making.

The guidelines provide information to aid the design of studies aiming to provide evidence suitable for supporting regulatory decisions. This guidance is applicable to studies regardless of geographical location.

Guidance is also given on requirements for gaining approval for studies to be run in the UK.

While extensively used for monitoring the performance of medicines and devices after regulatory approval, RWD have traditionally been used much less frequently to demonstrate the safety and efficacy of an intervention.

Use of such data sources for this purpose in clinical studies has the potential to increase the speed and reduce the cost of development programmes. Effective medicines could be approved more quickly, and programmes previously thought to be unfeasible could become feasible, with the consequent benefit to public health. Evidence derived from RWD may also be more representative of the true effects of a treatment in the community and more generalisable than data from the standardised setting of a traditional clinical trial.

When planning a study using RWD it is important to demonstrate that the data source is of sufficient quality for the intended use. The advantages of using RWD quoted above are of no relevance if the data source is of poor quality, as reliable decisions cannot be made based upon an unreliable data source.

Furthermore, running a study using RWD does not negate the importance of general principles relating to the strength of evidence produced by a study. For example, evidence from randomised comparisons of treatments is more reliable than non-randomised evidence and blinding to treatment allocation reduces bias. Such principles remain applicable for studies using RWD. The limitations/strengths of different types of studies will be addressed in more detail in guidelines related to those designs.

There is a growing level of activity and planning around the use of RWD in this arena globally[footnote 1] [footnote 2] [footnote 3] [footnote 4]. The MHRA has expertise in using RWD for pharmacovigilance based on UK primary care data, for example data supplied via the Clinical Practice Research Datalink (CPRD), and EHR data are becoming more readily available from health systems, including NHS Digital. Capabilities for delivering EHR-based studies of comparative effectiveness have also been developed, enabling greater insight into the uses of RWD.

Further to the guidelines, the MHRA encourages sponsors to explore the opportunities presented by the utilisation of RWD and applicants with proposals are invited to discuss these via MHRA scientific advice meetings.

3. Data Quality

The quality of the source data should be understood including its accuracy, validity, variability, reliability and provenance. Care should be taken to understand the origin of the source database along with any transformation or manipulation that may have occurred during the processing of the database. Users must be able to define the provenance of the source data, explain the mechanisms used to link data points, manage discrepancies, and describe any limitations or considerations associated with the data.

The RWD (whether from a database or other data sources) being used must be demonstrated to be of sufficient quality for the intended use. Published guidelines for good database selection from other fields, including pharmacoepidemiology, are applicable. Areas of consideration prior to submitting the study protocol include:

  • Will the database be used as the source population for recruitment, or be used to supplement other data sources? Will it include an appropriate population in terms of size, coverage and representativeness? It is recommended that statistical power calculations are used to assess whether the potential number of patients would enable a clinically important treatment effect to be detected.
  • Are important baseline characteristics measured regularly and likely to be up to date?
  • Will the database capture the interventions, outcomes and other study variables in sufficient detail, consistently and without bias, with the level of frequency and completeness that is required for the study? If not, how will data from the database be combined with additional data sources for the study? It is recommended that an observational feasibility study is conducted by the sponsor to assess the recent capture of study variables prior to undertaking a RWD study.
  • How will changes in data collection during the study period be handled, both for individual patients (e.g., those who move between healthcare professionals) and for the whole population (e.g., changes in coding systems)?
  • How will any potential interoperability issues between health systems in the devolved nations of the UK, and internationally, be addressed? It is recommended that methods to cross reference and manage the data across different centres and countries, including common data models, are explored.
  • Is the time between the occurrence of events and availability of the data to the study team suitable for the usage of the data in the study? How soon will data be able to be analysed after last patient, last visit? If used for monitoring adverse events, what impact would the availability of data have on the suitability of the database?
  • What method will be used to link the database to additional data sources for the study? Is reliable linkage likely to be possible for the study population? Have the proposed linkage methods been validated previously?
  • Which privacy and security policies apply to the use of the database? What restrictions apply on the transfer, storage, use, publication and retention of the data?
  • What data quality checks are undertaken by the data controller? Which additional study-specific checks will be needed?
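
The power calculation recommended in the first point above can be illustrated with a minimal sketch. It assumes a two-arm comparison of a binary outcome and uses the standard normal-approximation formula for two proportions. The function names, the event rates and the expected uptake figure are hypothetical and are not taken from this guidance; a real study would base the calculation on its own design and statistical advice.

```python
from math import ceil, sqrt
from statistics import NormalDist


def patients_per_arm(p_control: float, p_treated: float,
                     alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate patients needed per arm to detect a difference between
    two proportions with a two-sided test (normal approximation)."""
    if p_control == p_treated:
        raise ValueError("the two proportions must differ")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    z_beta = standard_normal.inv_cdf(power)
    p_bar = (p_control + p_treated) / 2
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p_control * (1 - p_control)
                        + p_treated * (1 - p_treated))
    ) ** 2
    return ceil(numerator / (p_control - p_treated) ** 2)


def database_is_large_enough(eligible_patients: int, needed_per_arm: int,
                             expected_uptake: float) -> bool:
    """Compare the patients likely to enter the study (eligible patients
    multiplied by an assumed uptake fraction) with the total needed."""
    return eligible_patients * expected_uptake >= 2 * needed_per_arm


# Hypothetical inputs: event rate of 30% on control and 20% on treatment.
needed = patients_per_arm(0.30, 0.20)
print(needed, database_is_large_enough(10_000, needed, expected_uptake=0.10))
```

A calculation of this kind only addresses size. Coverage and representativeness of the database population still need to be assessed separately.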

The responsibilities associated with the collection, maintenance and onward use of RWD should be clearly defined and understood by all parties. Ideally this should include those generating the source data, but it is recognised that this is likely to be difficult due to the nature of some records such as those generated as part of routine clinical care. Processes should exist for the resolution of discrepancies and communication of issues identified by later processing.

Sponsors should include in the study protocol a description of the tools and methods for selection, extraction, transfer, and handling of data and how they have been or will be validated. It is essential that processes are established to ensure the integrity of the data from acquisition through to archiving and sufficient detail captured to allow for the verification of these activities. It is expected that the validity of the RWD that are intended to be used in the study is formally documented and approved by the sponsor before the study protocol is published or submitted to the MHRA.
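
One way to support verification of data integrity between acquisition and archiving is to record a checksum for each file when it is received and to compare it again at later stages. The following is a minimal sketch of that idea only, assuming the data arrive as files in a folder; the function names are hypothetical and this guidance does not prescribe any particular tool or method.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder: Path) -> dict[str, str]:
    """Map each file name in the folder to its checksum."""
    return {p.name: sha256_of(p)
            for p in sorted(folder.iterdir()) if p.is_file()}


def changed_files(before: dict[str, str], after: dict[str, str]) -> list[str]:
    """List files that were added, removed or altered between manifests."""
    names = sorted(set(before) | set(after))
    return [name for name in names if before.get(name) != after.get(name)]
```

A manifest built at acquisition and stored with the study records could then be compared with one built before analysis or archiving, and any file listed by changed_files would need to be explained.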

Acceptable processes would need to be in place for any additional data collected outside of the main source database.

‘Digital health technologies’ can be used to gather health related data. They include sensors, wearables and other technologies, such as ingestible devices and implantables. Such devices and apps might be used to collect RWD as part of routine clinical care for patient reported outcomes or home-based measurements. Examples include completing an ‘activities of daily living’ questionnaire online before attending an appointment, or using an oxygen saturation probe to monitor a patient with congenital cardiac disease awaiting intervention. Other sources of RWD might utilise devices that capture indicators of function (for example daily step count). These might be used as supportive evidence of safety or efficacy of a treatment. It is important that both the device and any tool (such as a questionnaire) are suitably validated for the measurements required. The relevance, objectivity, and practicality of measurements should be considered, taking into account the disease, age, and potential functional abilities of the user. Some devices may be regulated as a medical device and advice can be sought from MHRA when required.

General Principles:

  • As for all studies, data quality assurance and appropriate data management oversight are essential.
  • Data quality processes must be detailed in the study protocol and these must be appropriately validated to ensure that any process is fit for its intended use.
  • Data quality checks should be conducted to ensure all data values are recorded, handled, and stored in a way that allows their accurate reporting, interpretation and verification.
  • Data should meet the agreed specifications and requirements defined by the sponsor for the study, so that any received data contain only the expected data files and all data elements are structured correctly as per the agreed specification.
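
The last two principles above can be illustrated with a minimal sketch of an automated check on a received data delivery. It assumes the delivery consists of CSV files and that the agreed specification lists the expected file names and column headers. The specification shown, the file names and the rule that every value must be non-empty are all hypothetical choices made for illustration.

```python
import csv
import io

# Hypothetical agreed specification: expected files and their columns.
SPEC = {
    "patients.csv": ["patient_id", "birth_year", "sex"],
    "events.csv": ["patient_id", "event_date", "code"],
}


def check_delivery(files: dict[str, str],
                   spec: dict[str, list[str]]) -> list[str]:
    """Check a delivery (file name mapped to CSV text) against the
    specification and return a list of problems found."""
    problems = []
    for name in sorted(set(files) - set(spec)):
        problems.append(f"unexpected file: {name}")
    for name in sorted(set(spec) - set(files)):
        problems.append(f"missing file: {name}")
    for name in sorted(set(files) & set(spec)):
        rows = list(csv.reader(io.StringIO(files[name])))
        header = rows[0] if rows else []
        if header != spec[name]:
            problems.append(f"{name}: header does not match specification")
            continue
        # Data rows start on line 2, after the header.
        for line_no, row in enumerate(rows[1:], start=2):
            if len(row) != len(header):
                problems.append(f"{name}: line {line_no} has {len(row)} fields")
            elif any(value.strip() == "" for value in row):
                problems.append(f"{name}: line {line_no} has an empty value")
    return problems
```

An empty list indicates that the delivery matches the specification under these assumptions. Any reported problem would be followed up through the discrepancy resolution process agreed between the parties.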


The majority of MHRA Good Clinical Practice (GCP) inspections are carried out under the risk-based compliance programme. These can be either systems-based or study specific. Inspectors work in conjunction with assessors to identify studies for inspection based on a wide variety of factors, including areas of interest (e.g., ePROs or novel interventions) and regulatory triggers.

Inspection of the systems and processes used for the oversight of RWD suppliers and the subsequent onward management of RWD may also occur in conjunction with an inspection of a sponsor or Contract Research Organisation (CRO). Areas of particular interest for review would include randomisation methods, data management, Investigational Medicinal Product (IMP) management, safety reporting, sponsor oversight and the use of externally provided databases to collect study data. Processes used to assure the integrity of the reported data are a common focus throughout any inspection of a licensing application.

4. Advice

If advice beyond what is contained in the guidelines would be helpful, please request a scientific advice meeting to discuss any plans. The meeting can include representatives with expertise in licensing, clinical trial approval, post-licensing studies, paediatric studies, medical devices, inspection and Electronic Healthcare Records as applicable.

  1. FDA. (2018) Framework for FDA’s Real-World Evidence Program

  2. HMA/EMA. (2019) HMA-EMA Joint Big Data Taskforce Phase II report: Evolving Data-Driven Regulation

  3. Innovative Medicines Initiative. (2018) GetReal Initiative

  4. The Academy of Medical Sciences. (2018) Next Steps for Using Real World Evidence