Salix Finance: Fraud Risk Assessment Reviewer

A tool to review grant applicant fraud risk assessments against a defined list of criteria, to support a determination of their robustness

1. Summary

1 - Name

Social Housing Decarbonisation Fund Wave 2 fraud risk assessment reviewer

2 - Description

The Social Housing Decarbonisation Fund Wave 2 fraud risk assessment reviewer is a Microsoft Copilot agent instructed to review fraud risk assessments and produce an output specifying whether each criterion on a checklist of risks and controls has been met. To achieve this, the tool searches for key words and phrases associated with satisfying each checklist criterion. It’s being used to ensure a more time-efficient and objective review of fraud risk assessments.

3 - Website URL

N/A

4 - Contact email

salixci@salixfinance.co.uk

Tier 2 - Owner and Responsibility

1.1 - Organisation or department

Salix Finance

1.2 - Team

Salix Finance housing

1.3 - Senior responsible owner

Director of housing

1.4 - Third party involvement

Yes

1.4.1 - Third party

PriceWaterhouseCoopers LLP (PwC)

1.4.2 - Companies House Number

OC309347

1.4.3 - Third party role

The third party owns the licence for Microsoft Copilot, on which the tool is built. The third party developed the concept and use case, then built, tested and deployed the tool with oversight from Salix Finance at a standing monthly continuous improvement meeting.

1.4.4 - Procurement procedure type

PwC was appointed under Crown Commercial Service contract PS22265.

1.4.5 - Third party data access terms

PwC holds a licence for the Microsoft Copilot instance on which the tool is built. As a result, it controls which individuals have access to the tool. Access is only provided to those who have been onboarded to the Social Housing Decarbonisation Fund Wave 2 project.

Tier 2 - Description and Rationale

2.1 - Detailed description

The fraud risk review Copilot agent is a Microsoft 365-based tool designed to support reviewing fraud risk assessments against a standardised checklist. It operates within the Microsoft Copilot framework and uses a rule-based logic engine to deliver consistent, rationale-backed outputs.

The agent is built using Microsoft’s orchestration layer, which integrates large language models (LLMs) with data sources provided by the user, in this instance uploaded Excel documents containing a fraud risk assessment.

The agent is provided with a defined set of instructions that correspond to each item in the fraud risk checklist. These instructions are embedded as system prompts and guide the agent to:

  • evaluate each checklist criterion individually

  • for each criterion, generate both a binary outcome (satisfied / not satisfied) and two short bullet points, one explaining the rationale and one citing supporting evidence from the document.
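The key-word-and-phrase matching described in the summary can be sketched in Python. This is a hypothetical illustration only: the `CriterionResult` fields, the `evaluate_criterion` function and the keyword logic are assumptions for clarity, not the agent’s actual implementation.

```python
from dataclasses import dataclass

# Hypothetical sketch of the per-criterion output described above.
# Field names and the keyword-matching logic are illustrative
# assumptions, not the agent's real implementation.

@dataclass
class CriterionResult:
    criterion: str
    satisfied: bool   # binary outcome: satisfied / not satisfied
    rationale: str    # short bullet explaining the judgement
    evidence: str     # short bullet citing text from the document

def evaluate_criterion(criterion: str, keywords: list[str], text: str) -> CriterionResult:
    """Mark a criterion satisfied if any associated keyword appears in the text."""
    matches = [kw for kw in keywords if kw.lower() in text.lower()]
    if matches:
        return CriterionResult(
            criterion,
            True,
            f"Assessment addresses {criterion!r} via {len(matches)} keyword match(es).",
            f"Matched terms: {', '.join(matches)}",
        )
    return CriterionResult(
        criterion,
        False,
        f"No language addressing {criterion!r} was found.",
        "No matching terms.",
    )
```

In the real agent these instructions are carried as system prompts rather than application code, but the sketch shows the shape of the binary-plus-two-bullets output each criterion receives.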

The agent uses deterministic logic rather than probabilistic modelling, ensuring that outputs are traceable and repeatable. It doesn’t learn or adapt over time, which supports auditability and compliance with internal governance standards.

2.2 - Benefits

The benefits of this tool are a reduction of manual effort and increased consistency of output compared to a human-performed task.

2.3 - Previous process

Before the deployment of the tool, the review of fraud risk assessments was conducted manually by risk and compliance professionals. Reviewers would read through each assessment document and cross-reference it against a checklist of criteria, typically maintained in a separate document or spreadsheet.

2.4 - Alternatives considered

Manual review process: continuing with the legacy approach where reviewers manually assessed fraud risk documents against the checklist.

Trade-offs: while this method allowed for nuanced human judgement, it was time-consuming, inconsistent across reviewers and lacked a standardised output format. It also posed challenges for auditability and scalability.

Quantexa platform: explored for its advanced entity resolution and network analytics capabilities.

Trade-offs: while technically robust, the cost to deploy and customise Quantexa for checklist-based document review was prohibitive. Its core functionality wasn’t well-suited to the structured, criteria-based nature of fraud risk assessment reviews.

OpenAI custom GPT model: considered for building a bespoke GPT-based reviewer.

Trade-offs: although capable of sophisticated language understanding, the cost of deployment and hosting wasn’t viable. Additionally, concerns around data residency and the risk of sensitive information being processed in the United States made it unsuitable for internal use.

Tier 2 - Deployment Context

3.1 - Integration into broader operational process

The fraud risk review Copilot agent doesn’t automate decisions. Instead, it provides a structured review output for each fraud risk assessment, indicating whether each checklist criterion has been satisfied. For every item, the agent generates both a binary outcome (satisfied / not satisfied) and two short bullet points, one explaining the rationale and one citing supporting evidence from the document.

Each output is reviewed by a human analyst, who performs quality control (QC) checks to ensure the rationale and evidence are appropriate and that the checklist has been applied correctly. Analysts may amend or override the agent’s assessment where necessary, and their final judgement is recorded in the firm’s risk documentation or reporting templates.

This integration ensures that the tool enhances consistency and efficiency while maintaining human oversight and accountability throughout the review process.

3.2 - Human review

Human quality control is performed on every output generated by the fraud risk review Copilot agent. Quality checkers review the rationale and evidence for each checklist item and can override the tool’s assessment where necessary. This ensures that final decisions still reflect human judgement.

3.3 - Frequency and scale of usage

The tool is designed for ease of use, with embedded instructions that ensure consistent execution of all review tasks. Prior to initial deployment, each user participates in an onboarding session led by the tool’s architect, covering functionality, scope and expected usage within the operational workflow.

It’s forecast that the tool will be deployed to review an estimated 77 cases between September 2025 and the end of the project in April 2026.

Note: this is a forecast figure and the number of cases is likely to fluctuate, as grant recipients have the ability to shorten or extend their expected completion dates.

3.4 - Required training

The tool is straightforward to use because all instructions are pre-built, meaning users are only required to input an Excel document. There is no requirement for users to provide a ‘prompt’, as this is all built into the fraud risk review Copilot agent’s instructions.

All users will have received generic AI training so are aware of hallucination risks associated with AI usage.

An onboarding session is provided to each user by the tool’s architecture team before they use it.

The tool’s architecture team is easily accessible to users, as are internal Microsoft Copilot subject matter experts, should issues arise or ‘in the moment’ support be required.

3.5 - Appeals and review

N/A - members of the public do not interact with the tool.

However, users of the tool can provide feedback and/or report functionality issues through the following route:

  • in the first instance, users report issues and provide feedback to the development team

  • the development team can escalate unresolvable issues to internal Microsoft Copilot subject matter experts

  • internal subject matter experts, where needed, can escalate issues and provide feedback to Microsoft.

Tier 2 - Tool Specification

4.1.1 - System architecture

The tool is built on Microsoft’s Copilot architecture, which integrates large language models (LLMs) with enterprise-grade data security and orchestration capabilities. It operates within the Microsoft 365 ecosystem and is designed to support structured document review workflows.

Key technical features:

  • Microsoft Copilot integration: the tool doesn’t connect to enterprise content repositories such as SharePoint or OneDrive. Instead, it processes user-uploaded Excel documents directly.

  • Instruction-driven review logic: the agent is configured with a fixed set of checklist-based instructions. These guide the model to evaluate each row or section of the Excel document against defined fraud risk criteria.
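One plausible way of embedding checklist-based instructions as a fixed system prompt is sketched below. The criteria wording and the prompt text are illustrative assumptions, not the agent’s actual configuration.

```python
# Hypothetical sketch of checklist items embedded as a fixed system
# prompt. The criteria and wording are illustrative assumptions, not
# the agent's actual instruction set.

CHECKLIST = [
    "Fraud risks are identified for each stage of the grant lifecycle",
    "Each identified risk has at least one documented control",
    "Controls have named owners",
]

SYSTEM_PROMPT = (
    "You are a fraud risk assessment reviewer. For each criterion below, "
    "state 'satisfied' or 'not satisfied', then give two short bullet "
    "points: one explaining the rationale and one citing supporting "
    "evidence quoted from the uploaded document.\n\nCriteria:\n"
    + "\n".join(f"{i}. {c}" for i, c in enumerate(CHECKLIST, start=1))
)
```

Because the instructions are fixed in this way, every review run applies the same criteria in the same order, which is what gives the tool its consistency advantage over ad hoc prompting.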

4.1.2 - System-level input

Microsoft Excel documents containing completed fraud risk assessments.

4.1.3 - System-level output

For each criterion on the checklist, it produces a binary assessment (satisfied / not satisfied) and two short bullet points containing the rationale and evidence for the output.
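A minimal sketch of how that system-level output might be rendered as text, assuming a simple list-of-dicts input; the layout is an assumption for illustration, not the tool’s exact report template.

```python
# Illustrative rendering of the binary-plus-two-bullets output format
# described above. The input shape and layout are assumptions, not the
# tool's actual report template.

def render_report(results: list[dict]) -> str:
    """Render per-criterion results as a readable checklist report."""
    lines = []
    for r in results:
        outcome = "Satisfied" if r["satisfied"] else "Not satisfied"
        lines.append(f"{r['criterion']}: {outcome}")
        lines.append(f"  - Rationale: {r['rationale']}")
        lines.append(f"  - Evidence: {r['evidence']}")
    return "\n".join(lines)
```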

4.1.4 - Maintenance

The tool is scheduled to be reviewed on a quarterly basis with ad hoc reviews able to be performed based on real time user feedback.

4.1.5 - Models

The fraud risk review Copilot agent uses a single large language model (LLM) built into Microsoft’s Copilot framework. It follows a fixed set of instructions to review uploaded Excel documents against a checklist. No other models — such as machine learning, statistical or predictive models — are used.

Tier 2 - Model Specification

4.2.1 - Model name

Microsoft Copilot

4.2.2 - Model version

The fraud risk review Copilot agent will always use the latest version of GPT available within the user’s licence. The agent is cloud-based and runs on Microsoft’s Azure infrastructure; as Microsoft updates the underlying artificial intelligence (AI) models within the user’s licence, those improvements are automatically reflected in the agent, with no manual updates needed.

The Copilot agent connects to Microsoft’s Azure OpenAI service via secure application programming interfaces (APIs). This allows it to call the most up-to-date version of GPT available, ensuring it always benefits from the latest improvements in language understanding, reasoning and performance.

4.2.7 - Model performance

Performance was tested as follows to ensure the tool works and is accurate:

Three sets of dummy data were used to initially build and test the tool.

The tool was then tested against ten data sets previously completed by human review to ensure more widespread functionality. It was recorded that the tool was performing at an average first-time pass rate of 88%, compared to the human-led review first-time pass rate of 91%.

The tool was then piloted on ‘live’ cases for a three-month period within a closely monitored, controlled environment to ensure functionality on a wider scale.

4.2.8 - Datasets and their purposes

Dummy data sets

Tier 2 - Development Data Specification

4.3.1 - Development data description

Three sets of dummy data fraud risk assessments were used when developing the tool.

4.3.2 - Data modality

Tabular.

4.3.3 - Data quantities

Three sets of dummy data.

4.3.4 - Sensitive attributes

N/A

4.3.5 - Data completeness and representativeness

N/A

4.3.6 - Data cleaning

A review was performed to ensure the dummy data contained no reference to personal identifiers.

4.3.7 - Data collection

N/A

4.3.8 - Data access and storage

Dummy data was produced by PwC and is stored on PwC systems in accordance with its data handling policies.

4.3.9 - Data sharing agreements

N/A

Tier 2 - Operational Data Specification

4.4.1 - Data sources

The grant recipient provides its fraud risk assessment in the form of a Microsoft Excel document - this is input into the tool by the user.

Tier 2 - Risks, Mitigations and Impact Assessments

5.1 - Impact assessments

PwC consulted internally with the resources available and it was determined that a data protection impact assessment (DPIA) wasn’t required, as the tool would not be processing any personal data.

5.2 - Risks and mitigations

The risk of Artificial Intelligence (AI) hallucination of outputs is mitigated by human quality check of all outputs.

The risk of machine-generated outputs being recorded as final outputs is mitigated by human quality check of all outputs.

Updates to this page

Published 3 October 2025