Care Quality Commission: Relativity Active Learning

Active Learning is a technology-assisted review tool that helps the Care Quality Commission (CQC) quickly organise its data and predict which documents are most likely to be relevant to reviewers, the legal team and, in turn, inquiries.

1. Summary

1 - Name

Relativity Active Learning

2 - Description

Active Learning is a technology-assisted review tool that helps the Care Quality Commission (CQC) quickly organise its data and predict which documents are most likely to be relevant to reviewers, the legal team and, in turn, inquiries. By using Active Learning, CQC reduces the total time taken to review documents, as the tool learns from previous manual decisions to rank the unreviewed documents in order of relevance. This helps CQC staff identify the more relevant documents faster. The documents in question can vary in nature but will have been promoted, as part of our discovery exercise, as documents that may be within scope of the terms of reference of an inquiry.

3 - Website URL

https://help.relativity.com/Server2025/Content/Relativity/Active_Learning/Active_Learning.htm

4 - Contact email

inquiriesandinvestigations@cqc.org.uk

Tier 2 - Owner and Responsibility

1.1 - Organisation or department

Care Quality Commission

1.2 - Team

Inquiries and Investigations team - Legal Services

1.3 - Senior responsible owner

Deputy Director

1.4 - Third party involvement

Yes

1.4.1 - Third party

KL Discovery Limited (KLD)

1.4.2 - Companies House Number

03159322

1.4.3 - Third party role

CQC has procured the tool from KLD, who have helped CQC to deploy it into use.

1.4.4 - Procurement procedure type

Existing arrangement as part of a call-off agreement.

1.4.5 - Third party data access terms

Data can only be accessed by other third parties, such as legal counsel, with CQC approval.

Tier 2 - Description and Rationale

2.1 - Detailed description

The tool works by learning from previously coded documents that were manually labelled by humans (relevant/not relevant), and uses this information to determine the likelihood that each document in the set under review is relevant. It assigns each document both a relevance percentage and a prediction score, and prioritises documents for human review accordingly. Its purpose is to streamline the review process by focussing human review effort on the key documents in a systematic way. It does this continuously, constantly learning and working in the background to integrate ongoing human decisions. Its users will be the legal review team within the inquiries team. Its main limitation is that a set number of documents must first be reviewed manually before it can work effectively. The tool will not be applied or used to determine the content of a document or to decide what the summary of any document may be.
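The train-then-rank loop described above can be sketched in a few lines. This is an illustrative sketch only, not Relativity's implementation: scikit-learn's LinearSVC (a wrapper around LIBLINEAR, the library the vendor's model is stated to be based on) stands in for the vendor's classifier, and all document text and labels below are invented for illustration.

```python
# Illustrative sketch only (not CQC's or Relativity's code): rank unreviewed
# documents using a classifier trained on human "relevant"/"not relevant" coding.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC  # LinearSVC wraps LIBLINEAR

# Hypothetical seed set coded manually by reviewers (1 = relevant, 0 = not).
coded_docs = [
    "inspection report raised safeguarding concerns at the trust",
    "catering invoice for the staff canteen",
    "witness statement describing the incident timeline",
    "annual leave rota for the admin team",
]
coded_labels = [1, 0, 1, 0]

# Unreviewed documents awaiting prioritisation.
unreviewed = [
    "email chain discussing the safeguarding incident",
    "printer toner purchase order",
]

vectoriser = TfidfVectorizer()
model = LinearSVC().fit(vectoriser.fit_transform(coded_docs), coded_labels)

# Distance from the decision boundary serves as a prediction score;
# higher-scoring documents are served to human reviewers first.
scores = model.decision_function(vectoriser.transform(unreviewed))
ranked = sorted(zip(unreviewed, scores), key=lambda pair: pair[1], reverse=True)
for doc, score in ranked:
    print(f"{score:+.2f}  {doc}")
```

In a real active learning workflow this loop repeats: each new batch of human coding decisions is added to the training set and the model is refit, so the ranking of the remaining unreviewed documents keeps improving.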

2.2 - Benefits

As the tool ‘learns’, the ranking can be used to recommend the documents most likely to be relevant for prioritised review next. It thus enables reviewers to locate and review relevant documents earlier in the process than they otherwise would have done. Its benefits are therefore considerable: productivity gains, value for money in terms of human labour cost, a more efficient process, easier compliance with deadlines, mitigation of reputational damage through review, and a better end product (witness statement). The tool will therefore save the public and taxpayers money, both in terms of associated costs and in terms of the costs of inquiries themselves, by ensuring that inquiries are only provided with the most relevant documents.

2.3 - Previous process

Prior to this model being implemented, staff had to manually go through documents in an unsorted order meaning much time was wasted reading documentation that was not relevant.

2.4 - Alternatives considered

No other technological alternatives exist: the majority of off-the-shelf tools of this kind operate in the same way. The only alternative would be to process the documentation manually, which would lead to the issues described under the previous process and a loss of the benefits discussed above.

Tier 2 - Deployment Context

3.1 - Integration into broader operational process

Public inquiries are official investigations, established by a UK or devolved government minister but conducted by an independent body, to examine matters of “public concern” about a particular event or set of events. There are two main types of public inquiry: statutory and non-statutory. Inquiries have addressed topics such as transport accidents, fires, the mismanagement of pension funds, self-inflicted deaths in custody and outbreaks of disease. Statutory inquiries, which are the most common, are usually established under the Inquiries Act 2005, which grants special powers to compel testimony and the release of other forms of evidence. Public inquiries address three key questions:

What happened? Why did it happen and who is to blame? What can be done to prevent this happening again? To answer these three questions, inquiries start by collecting evidence, analysing documents and examining witness testimonies. The remit of a public inquiry is set out in its terms of reference (TORs). These are specific instructions outlining the questions that the inquiry should address and the types of information and feedback that the government wants. CQC will use this tool within the Inquiries and Investigations team to assist with its response to an inquiry request. The tool will be used as part of the multidisciplinary approach and the range of tools the team uses to respond to inquiry requests in the form of a Rule 9, which may request the exhibiting of documents and/or the submission of witness statements. It will be used at the start of the process to help the team comb more efficiently through the large body of documents that have been discovered, which will be reviewed and used to form the narrative of any response shared with the inquiry. Use of the tool stops once information is submitted, until any further requests.
Relativity Active Learning is an officially developed but optional utility provided by the software vendor for the Relativity platform. The utility is available in any Relativity environment, but is not applied to any given matter by default.

The uses for active learning are varied, as are the output values. More can be learned by reviewing the vendor’s documentation: https://help.relativity.com/Server2024/Content/Relativity/Active_Learning/Active_Learning.htm

3.2 - Human review

As the Active Learning utility is a user-driven process, human review of the tool’s output is set by users. Fidelity testing is routinely performed by the software vendor on the Relativity platform, annually and upon user request. CQC will be involved in monitoring the use and output of the tool. The CQC inquiries team will decide what documents are selected, read and reviewed. The tool will help streamline the process by advising and supporting the human in deciding what to read next (the order), but ultimately any decision on which documents are exhibited and submitted to an inquiry is taken solely by a human on the team.

3.3 - Frequency and scale of usage

The use of the Active Learning utility varies from project to project. The workflow can support only a handful of users at a time, or hundreds if so elected. It is used electively, so the frequency with which it is used and the number of people who interact with it vary commensurately. Because of its utility, however, it will likely be used in all current and future inquiries, with more than 20 people currently benefitting from its output.

3.4 - Required training

This will be a new tool for the majority of the team, so training will be provided both on the tool itself and on the machine learning element. Once it is in live use, end user training for the platform will be provided to the team to ensure that users understand the role of the AI tool and how to ensure it is effective. This will be provided as part of the general onboarding to the Relativity platform.

3.5 - Appeals and review

No, this is an internal process. If there is a concern regarding data output accuracy, it will be raised by CQC and addressed directly by the KLD and Relativity support teams.

Tier 2 - Tool Specification

4.1.1 - System architecture

Relativity Active Learning runs on a localised platform utilising both Java and PostgreSQL, located on the same logical machine. For Relativity Server at KLDiscovery, this logical machine is always located in the same data centre, domain and security environment as all other Relativity platform elements.

Data output by this logical machine is transferred via internal API for hosting and storage on the Relativity platform using a standard T-SQL database also located in a secure data centre, domain, and security environment controlled by KLDiscovery.

More specific configuration information can be found here: https://help.relativity.com/Server2024/Content/Installing_and_Upgrading/Relativity_Upgrade/Upgrading_or_installing_Relativity_Analytics.htm

4.1.2 - System-level input

CQC documents that have been provided and require reviewing.

4.1.3 - System-level output

Relativity Active Learning Score and percentage of relevance.

4.1.4 - Maintenance

Regular system maintenance occurs monthly, performing security reviews and updates as needed. Relativity platform maintenance and upgrades typically take place biannually, and normally include updates to the Relativity Analytics/Active Learning utility.

4.1.5 - Models

Support Vector Machine (SVM) model.

Tier 2 - Model Specification

4.2.1. - Model name

Support Vector Machine (SVM) model, a modified version of LIBLINEAR.

4.2.2 - Model version

24.0.375.2

4.2.3 - Model task

It classifies documents as either “Relevant” or “Not Relevant”, based on learning acquired from reviewer coding, and prioritises those documents.

4.2.4 - Model input

Documents coded by human reviewers.

4.2.5 - Model output

Serves and prioritises the documents most likely to be relevant.

4.2.6 - Model architecture

Binary supervised machine learning classifier.
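A binary SVM classifier produces a signed margin (distance from the decision boundary) rather than a percentage, so the "percentage of relevance" in the system-level output implies some mapping from margin to percentage. The sketch below shows one common technique for this, Platt-style logistic calibration; the function name and coefficients are assumptions for illustration only, not Relativity's actual scoring method.

```python
# Illustrative only: map a binary SVM's signed margin to a 0-100% "relevance"
# figure using a logistic (Platt-style) squashing function. The coefficients
# a and b are invented here; in practice they would be fitted to held-out data.
import math

def margin_to_relevance_pct(margin: float, a: float = -1.5, b: float = 0.0) -> float:
    """Squash a signed decision-function margin into a percentage."""
    probability = 1.0 / (1.0 + math.exp(a * margin + b))
    return round(100.0 * probability, 1)

print(margin_to_relevance_pct(2.0))   # well inside the "relevant" side
print(margin_to_relevance_pct(0.0))   # on the decision boundary: 50%
print(margin_to_relevance_pct(-2.0))  # well inside the "not relevant" side
```

The monotonic mapping preserves the ranking: a document further on the "relevant" side of the boundary always receives a higher percentage, which is what the prioritised review queue relies on.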

4.2.7 - Model performance

https://help.relativity.com/Server2024/Content/Relativity/Active_Learning/Active_Learning_performance_baselines.htm

4.2.8 - Datasets and their purposes

CQC documents, deemed possibly relevant based on the inquiry's terms of reference, that are classified by CQC employees.

Tier 2 - Operational Data Specification

4.4.1 - Data sources

From CQC (SharePoint, Y drives, etc.). Data will be shared with KLD and Relativity through SFTP, Azure Blob Storage or Egress.

4.4.2 - Sensitive attributes

Yes, the data may contain patient data. Data can only be accessed following an approval process; access is given via two-factor authentication (2FA) and is subject to strict security assurance processes.

4.4.3 - Data processing methods

N/A

4.4.4 - Data access and storage

Yes, the data may contain patient data. Data will be stored for a period after an inquiry is completed (which some inquiries will define) for legislative purposes. Data can only be accessed following an approval process; access is given via two-factor authentication and is subject to strict security assurance processes.

4.4.5 - Data sharing agreements

Data will be shared with the inquiry as per our legal obligations under the Inquiries Act 2005. Each inquiry will have its own data protection agreement/protocol.

Tier 2 - Risks, Mitigations and Impact Assessments

5.1 - Impact assessments

We have undertaken a number of assessments, which have been signed off and agreed, including a security-by-design assessment, an architecture assessment, and contracts with security and data schedules. We have signed off one DPIA and are in the process of finalising the final DPIA.

5.2 - Risks and mitigations

The tool is unlikely to affect individuals or groups, but it does carry the risk that it assigns incorrect coding to a document which may be important to CQC or the inquiry. This will be mitigated by training the tool on human decisions first; by building in a quality control (QC) process that involves human review following machine review; and by having the machine learn constantly through ongoing tagging and review so that its performance improves.

Updates to this page

Published 30 March 2026