BDUK: Intervention Area Design Algorithms

The model is a rule-based geospatial classification system that analyses Unique Property Reference Numbers, subsidy data, and commercial plans, applying sequential filtering algorithms and geospatial decision rules to determine gigabit intervention eligibility.

Tier 1 Information

1 - Name

Intervention Area Design Algorithms

2 - Description

The algorithmic tool processes and analyses large datasets, particularly geospatial data, to determine the priority and eligibility of premises for gigabit subsidy. The algorithm steps support decisions on which Unique Property Reference Numbers (UPRNs) should be moved into initial or deferred scope when designing an intervention area.

3 - Website URL

N/A

4 - Contact email

arthur.robijns@bduk.gov.uk

Tier 2 - Owner and Responsibility

1.1 - Organisation or department

Building Digital UK (BDUK)

1.2 - Team

Analysis & Evaluation

1.3 - Senior responsible owner

Head of Strategy & Analysis

1.4 - External supplier involvement

No

Tier 2 - Description and Rationale

2.1 - Detailed description

The algorithmic tool operates through a structured, sequential process to refine premises eligibility for gigabit subsidy. It begins with preprocessing, where UPRNs in voucher priority areas are deferred if vouchers are likely to deliver better value. Premises already subsidised or under construction are removed, as are those excluded by AddressBase filters, such as war memorials. Additional commercial plan data from market engagement is also assessed to refine scope. Next, the algorithm descopes premises classified as White or Under Review (UR) if they share an exact location with Black or Grey premises. RPlink analysis then descopes White or UR premises on road segments where over 25% of properties are Black or Grey, and defers White premises on road segments where over 25% are UR. For Built-up Areas (BUA), if over 50% of premises are Grey or Black, all White and UR premises are removed. If over 50% of premises in a BUA are UR, Grey, or Black in total, White premises are deferred. These steps are applied sequentially, ensuring a refined and data-driven approach to prioritising gigabit subsidy allocation.
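The three geospatial steps described above can be illustrated with a minimal Python sketch. This is not BDUK's implementation: the field names (pr_class, xy, road_segment, bua), the scope labels, and the function design_scope are hypothetical. The sketch assumes that shares are computed from the original Public Review classes, that White premises start in initial scope and UR premises in deferred scope, and that a premise descoped by one step is not brought back by a later one.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional, Tuple

COVERED = {"Grey", "Black"}  # classes treated as commercially covered


@dataclass
class Premise:
    uprn: int
    pr_class: str               # "White", "UR", "Grey" or "Black"
    xy: Tuple[float, float]     # coordinates used for the exact-location check
    road_segment: str           # hypothetical RPlink road segment id
    bua: Optional[str] = None   # hypothetical Built-up Area id, None if outside one
    scope: str = ""             # set by design_scope: "initial", "deferred", "descoped"


def groups(premises, key):
    """Group premises by key(p), skipping premises whose key is None."""
    out = defaultdict(list)
    for p in premises:
        if key(p) is not None:
            out[key(p)].append(p)
    return out.values()


def share(group, classes):
    """Fraction of a group whose Public Review class is in `classes`."""
    return sum(p.pr_class in classes for p in group) / len(group)


def descope(group):
    """Move every White or UR premise in the group out of scope."""
    for p in group:
        if p.pr_class in {"White", "UR"}:
            p.scope = "descoped"


def defer(group):
    """Move White premises that are still in initial scope to deferred scope."""
    for p in group:
        if p.pr_class == "White" and p.scope == "initial":
            p.scope = "deferred"


def design_scope(premises):
    # Assumed starting point: White -> initial, UR -> deferred, Grey/Black -> descoped.
    for p in premises:
        p.scope = {"White": "initial", "UR": "deferred"}.get(p.pr_class, "descoped")

    # Step 1: White/UR premises sharing an exact location with Grey/Black are descoped.
    for g in groups(premises, lambda p: p.xy):
        if share(g, COVERED) > 0:
            descope(g)

    # Step 2: road segments, 25% thresholds.
    for g in groups(premises, lambda p: p.road_segment):
        if share(g, COVERED) > 0.25:
            descope(g)
        elif share(g, {"UR"}) > 0.25:
            defer(g)

    # Step 3: Built-up Areas, 50% thresholds.
    for g in groups(premises, lambda p: p.bua):
        if share(g, COVERED) > 0.50:
            descope(g)
        elif share(g, COVERED | {"UR"}) > 0.50:
            defer(g)

    return {p.uprn: p.scope for p in premises}
```

Because each step only ever moves a premise to a lower scope, the order of steps matters mainly for auditing which rule first changed a premise, which a production system would probably want to record.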

2.2 - Scope

The algorithmic tool is designed to enhance data processing, analysis, and decision-making by identifying coverage gaps, supporting intervention choices, engaging suppliers, and managing subsidy control. It integrates multiple data sources, applies geospatial algorithms, and provides visualisations to facilitate informed decisions. The tool is intended for analysing gigabit coverage gaps, aiding GIS contract delivery, and supporting local area reviews. However, it is not designed for real-time data processing, non-geospatial analysis, or standalone decision-making. This ensures clarity on its scope and purpose while preventing misconceptions about its capabilities.

2.3 - Benefit

The algorithmic tool enhances data processing accuracy, decision-making efficiency, and programme delivery. It automates data cleansing, filtering, and integration, ensuring reliable and up-to-date information. By providing geospatial analysis and visualisations, it supports evidence-based decisions for contract management and intervention strategies. The tool also optimises programme delivery by identifying coverage gaps, aiding supplier engagement, and ensuring subsidy control. Additionally, it improves information accessibility for governance, enhances transparency, and standardises decision-making for consistency. Its use is justified by its ability to streamline processes, improve data accuracy, and support effective programme execution.

2.4 - Previous process

Before deploying the algorithmic tool, the decision-making process involved Public Review (PR) data processing to establish a legal basis for intervention. However, raw PR data required cleaning to ensure that premises were eligible for gigabit upgrades, had not already received subsidies, and were not in Voucher Priority Areas. Basic checks included verifying Unique Property Reference Numbers (UPRNs) against AddressBase Premium, an Ordnance Survey database categorising premises. More advanced validation was needed to resolve anomalies in PR results, as categorisations could appear inconsistent, such as neighbouring properties being classified differently (e.g., one semi-detached house marked White and the other Grey). This manual and structured review process ensured more accurate intervention area mapping before the algorithmic tool automated these tasks.

2.5 - Alternatives considered

During the selection process, an alternative model called Network Graph Diffusion was considered. It uses a machine learning diffusion algorithm to eliminate the “noise”. However, the model is still under development and has not yet gone through the performance tests required before deployment.

Tier 2 - Decision making Process

3.1 - Process integration

The algorithmic tool is fully integrated into the decision-making process by automating data collection, analysis, and classification to determine gigabit subsidy eligibility. It influences decisions by removing manual inconsistencies, ensuring objectivity, and improving efficiency in identifying intervention areas. The tool processes large datasets, applies geospatial algorithms, and presents insights through visualisations, dashboards, and reports. The wider decision-making process begins with data collection and preprocessing, where Public Review, commercial plans, and Ordnance Survey data are cleaned and integrated. The tool then applies classification rules to assess premises eligibility before engaging with stakeholders such as suppliers and local authorities to validate findings. Decision-makers use the tool’s outputs to guide contract procurement and policy enforcement, ensuring interventions are effectively targeted. The process is iterative, with ongoing reviews and updates refining intervention areas as new data becomes available.

3.2 - Provided information

The tool provides decision-makers with detailed geospatial, subsidy, and market data to support evidence-based decisions. It analyses UPRNs, premises classifications, and road segment density while incorporating supplier commercial plans and intervention feasibility assessments. Public Review data is processed to establish the legal basis for intervention, and the tool refines classifications to improve accuracy. The results are presented through interactive maps in platforms like Tableau and Looker, detailed reports in Excel, structured datasets in BigQuery, and quality assurance dashboards for data verification. This ensures that stakeholders have access to clear and comprehensive information to support their decisions.

3.3 - Frequency and scale of usage

The tool runs daily through LiveView to check for updates related to alternate interventions such as Vouchers, Superfast, LFFN, or other programmes. Major updates occur three times per year following the Open Market Review (OMR), which incorporates the latest supplier data on commercial plans. Additionally, the Local Area Review (LAR) process allows for further refinements based on local feedback. The tool supports multiple Gigabit Infrastructure Subsidy (GIS) procurements, voucher schemes, and funding allocations across different regions, ensuring decisions reflect the most current data.

3.4 - Human decisions and review

While the tool automates much of the data analysis and classification, human review remains an essential part of the process. Stakeholders, including local authorities and suppliers, review classifications to ensure accuracy and account for real-world considerations. Analysts manually refine intervention areas based on feedback, particularly in cases where the algorithm produces anomalies or when commercial plans require further validation. Additionally, exception handling mechanisms are in place to allow for case-by-case reviews where premises classifications may be disputed or require further assessment.

3.5 - Required training

Personnel involved in deploying or using the tool must undergo appropriate training to ensure effective operation and compliance. Data analysts and GIS specialists receive operational training on processing and interpreting outputs. Decision-makers undergo training on using visualisations and reports to inform subsidy allocation.

3.6 - Appeals and review

Several mechanisms are in place to allow for review and appeal of decisions influenced by the tool. Suppliers and local authorities can challenge the classifications and submit additional commercial plans for reassessment. The PR and OMR processes allow businesses to provide corrections during designated review periods. Governance mechanisms enable policymakers to refine intervention strategies based on stakeholder feedback and evolving market conditions. Additionally, quality assurance dashboards and manual review processes help ensure transparency and data integrity, allowing for continuous improvement in decision-making.

Tier 2 - Tool Specification

4.1.1 - System architecture

The algorithmic tool is built on a data processing pipeline that integrates multiple datasets, applies geospatial and rule-based algorithms, and outputs refined UPRN classifications. The core components include BigQuery for data storage and transformation, geospatial processing algorithms for classification and filtering, and visualisation tools such as Tableau and Looker for decision support. The system processes inputs from Public Review data, Open Market Review (OMR) updates, AddressBase, Ordnance Survey datasets, and supplier commercial plans. The tool runs daily via LiveView for minor updates and is refreshed three times per year following OMR changes.

4.1.2 - Phase

Production

4.1.3 - Maintenance

The tool undergoes daily maintenance checks via LiveView, which ensures updates to alternate interventions such as Superfast, LFFN, or other programmes are incorporated. Major updates occur three times per year following the Open Market Review (OMR) to integrate new supplier commercial plans. Additionally, Local Area Reviews (LARs) provide further refinements based on regional stakeholder input. The tool is monitored through quality assurance dashboards and manual review processes, with ongoing technical maintenance handled by data engineers and analysts responsible for ensuring performance, accuracy, and compliance with programme requirements.

4.1.4 - Models

The tool primarily relies on rule-based algorithms and geospatial analysis models rather than machine learning. It applies sequential classification rules to assess premises eligibility based on factors such as location, subsidy history, road segment density, and built-up area classification. Specific models include:

  • Geospatial Filtering Algorithms: Identify coverage gaps and classify premises based on proximity to commercial interventions.
  • RPlink Model: Evaluates premises on road segments to determine eligibility based on the proportion of Grey, Black, and Under Review premises.
  • Built-up Area (BUA) Model: Analyses areas with high-density premises to determine intervention eligibility based on the percentage of existing coverage.
  • AddressBase Premium Filters: Categorises premises to remove ineligible sites (e.g., war memorials, under-construction buildings).
  • Decision Support Visualisation Models: Generate interactive maps and dashboards for various teams.

These models work sequentially, ensuring that each UPRN is classified accurately while allowing for manual validation and stakeholder input where necessary.

Tier 2 - Model Specification

4.2.1 - Model name

Intervention Area Design Algorithms

4.2.2 - Model version

2.0.0

4.2.3 - Model task

The algorithms support decisions on which UPRNs should be moved into initial and deferred scope. The algorithm steps are applied sequentially; before they run, UPRNs covered by other BDUK interventions or products are removed, as are UPRNs that do not require gigabit subsidy.

4.2.4 - Model input

The model processes geospatial, market, and subsidy-related data to classify premises for gigabit intervention. Inputs include:

  • Unique Property Reference Numbers (UPRNs) from AddressBase and Ordnance Survey.

  • Public Review (PR) classifications defining legal intervention areas.

  • Open Market Review (OMR) data, updated three times per year, reflecting supplier commercial plans.

  • Road segment classifications from RPlink, identifying clusters of eligible premises.

  • Built-up Area (BUA) classifications to assess intervention viability in high-density locations.

4.2.5 - Model output

The model produces classification decisions regarding premises’ eligibility for intervention - White (eligible), Grey/Black (covered by commercial plans), Under Review, or Deferred Scope.

4.2.6 - Model architecture

The tool is rule-based and does not use machine learning. It operates through a sequential filtering algorithm, applying geospatial and predefined rules to classify premises.

4.2.7 - Model performance

Since this is a rule-based system, performance is evaluated by:

  • Classification accuracy, verified through Quality Assurance dashboards and stakeholder validation.

  • Consistency over time, ensuring stable decision-making across multiple data updates.

  • Computational efficiency, measured in query execution times and scalability for large data processing.

  • Fairness considerations, where Local Area Reviews (LARs) and OMR collections help validate classifications against real-world conditions.
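As an illustration of the "consistency over time" criterion above, the following minimal Python sketch counts how many UPRNs change scope between two runs. The function name scope_changes and the scope labels are hypothetical, and this record does not state how BDUK actually measures consistency.

```python
from collections import Counter


def scope_changes(previous, current):
    """Count (old scope, new scope) transitions for UPRNs present in both runs.

    `previous` and `current` map UPRN -> scope label, for example
    "initial", "deferred" or "descoped" (hypothetical labels).
    """
    return Counter(
        (previous[uprn], current[uprn])
        for uprn in previous.keys() & current.keys()
        if previous[uprn] != current[uprn]
    )
```

A large count for a single transition after a data refresh would be a prompt for manual review rather than proof of an error.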

4.2.8 - Datasets

The model relies on multiple datasets for classification:

  • ONS UPRN Directory (ONSUD): Allocates addresses to a range of geographies using the grid reference of the UPRN.

  • AddressBase Premium (Ordnance Survey): Provides UPRNs and property classifications.

  • Public Review Data: Establishes legal eligibility for intervention.

  • Open Market Review (OMR) Data: Updates supplier commercial plans three times per year.

  • Geospatial data (x and y coordinates): Groups premises into a single location.

  • RPlink Road Segment Data: Groups premises into geospatial clusters.

  • Built-up Area (BUA) Data: Determines eligibility based on density and infrastructure coverage.

4.2.9 - Dataset purposes

Since the tool is rules-based rather than trained on datasets, datasets are used directly in processing rather than for training, validation, or testing.

Tier 2 - Data Specification

4.3.1 - Source data name

  • ONS UPRN Directory (ONSUD)
  • AddressBase Premium (Ordnance Survey)
  • Public Review Data
  • Open Market Review (OMR) Data
  • Built-up Area (BUA) Data
  • Road link data

BDUK’s datasets:

  • Double subsidy data
  • Supplier and local authority feedback (changelog)

4.3.2 - Data modality

Geospatial data

4.3.3 - Data description

  • ONS UPRN Directory (ONSUD): Allocates addresses to a range of geographies using the grid reference of the UPRN.

  • AddressBase Premium (Ordnance Survey): Provides UPRNs and property classifications.

  • Public Review Data: Establishes legal eligibility for intervention.

  • Open Market Review (OMR) Data: Updates supplier commercial plans three times per year.

  • Geospatial data (x and y coordinates): Groups premises into a single location.

  • RPlink Road Segment Data: Groups premises into geospatial clusters.

  • Built-up Area (BUA) Data: Determines eligibility based on density and infrastructure coverage.

4.3.4 - Data quantities

The tool processes large-scale geospatial and subsidy-related datasets, including millions of Unique Property Reference Numbers (UPRNs) from AddressBase and Ordnance Survey. The dataset size varies but typically includes thousands of premises per review cycle. Since this is a rule-based system, no traditional training, validation, or testing splits are required.

4.3.5 - Sensitive attributes

The tool does not process personal or sensitive data. It relies on geospatial, market, and infrastructure datasets such as UPRNs, commercial build plans, and public review classifications. No personal data, protected characteristics, or proxy variables are included.

4.3.6 - Data completeness and representativeness

The data is comprehensive for its purpose, but gaps may exist due to delays in supplier updates, ONS data revisions, or inaccuracies in Public Review classifications. Built-up area classifications and road segment data are periodically refreshed to maintain accuracy. Local authority and supplier engagement helps validate any discrepancies.

4.3.7 - Source data URL

4.3.8 - Data collection

The data used by the tool comes from multiple sources, each maintained by different organisations and updated at different intervals. The Open Market Review (OMR) is conducted three times per year, gathering the latest supplier commercial plans and intervention updates. AddressBase Premium is maintained by Ordnance Survey (OS) and is regularly updated to ensure accurate property classifications and UPRN assignments. Built-up Area (BUA) classifications are maintained by the Office for National Statistics (ONS), with updates incorporated as new housing developments are recorded. Additionally, the Unique Property Reference Number (UPRN) Directory, known as ONSUD (ONS UPRN Directory), is also maintained by ONS and is regularly updated to provide authoritative address matching and geospatial referencing.

4.3.9 - Data cleaning

The tool applies several preprocessing steps to ensure data quality and relevance. It removes premises that have already received subsidies, such as through vouchers. It also applies AddressBase Premium filters to exclude premises that do not require intervention (e.g., war memorials, under-construction buildings). Further cleaning involves validating UPRNs, ensuring alignment with commercial build plans, and filtering out anomalies in Public Review classifications. Road segment data and built-up area classifications are also refined to improve accuracy.
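The cleaning steps above can be sketched as a single pass over the Public Review UPRNs. This is a minimal Python illustration under stated assumptions: the function clean, the class labels in EXCLUDED_CLASSES, and the input shapes are hypothetical, and the real AddressBase Premium classification codes are not given in this record.

```python
# Hypothetical labels standing in for the AddressBase Premium classes
# that the filters exclude (e.g. war memorials, under-construction buildings).
EXCLUDED_CLASSES = {"war_memorial", "under_construction"}


def clean(pr_uprns, addressbase, subsidised):
    """Split Public Review UPRNs into kept UPRNs and removed UPRNs with a reason.

    pr_uprns    -- iterable of UPRNs from the Public Review
    addressbase -- dict mapping UPRN -> AddressBase class label
    subsidised  -- set of UPRNs that already received a subsidy
    """
    kept, removed = [], {}
    for uprn in pr_uprns:
        if uprn not in addressbase:
            removed[uprn] = "not in AddressBase"          # UPRN validation
        elif uprn in subsidised:
            removed[uprn] = "already subsidised"          # double-subsidy check
        elif addressbase[uprn] in EXCLUDED_CLASSES:
            removed[uprn] = "excluded AddressBase class"  # AddressBase filter
        else:
            kept.append(uprn)
    return kept, removed
```

Recording a reason for each removal, rather than silently dropping the UPRN, keeps the output usable for the quality assurance and stakeholder review described elsewhere in this record.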

4.3.10 - Data sharing agreements

Data sharing follows government policies and supplier agreements. Public Review and Open Market Review (OMR) data are collected under specific legal frameworks, and data from commercial providers is shared under confidentiality agreements. Local authorities, suppliers, and programme teams have limited access to specific datasets for validation and decision-making. Any public-facing data is aggregated and anonymised to prevent exposure of commercially sensitive or location-specific details.

4.3.11 - Data access and storage

Access to the data is strictly controlled. Programme teams, government analysts, and approved stakeholders (e.g., local authorities and suppliers) have access under defined roles and permissions. Data is stored securely within government-managed cloud environments, such as BigQuery, with access restricted based on organisational requirements. Data retention policies follow regulatory requirements, ensuring that outdated or superseded data is removed after its operational relevance expires.

Tier 2 - Risks, Mitigations and Impact Assessments

5.1 - Impact assessment

The algorithm is a part of the eligibility mapping for BDUK’s GIS programme, and as such is included in the governance, risk management & impact assessments carried out regularly as part of the programme’s delivery. This includes regular risk committees, reviews, and broader oversight on the programme.

A DPIA has not been carried out specifically for the algorithms as they do not process any personal data.

5.2 - Risks and mitigations

The blanket coverage of built-up areas may descope some premises which would be appropriate for procurement, particularly if the winning bidder has extensive coverage in the area already. Some built-up areas or roads are descoped on the basis of future commercial build plans which the local body may feel are unrealistic.

In some areas, new houses have been built and not yet associated with the relevant Built-up Area (these were updated by the Office for National Statistics in October 2022). We are dependent on ONS data - which is good from the perspective of objectivity and independence, but does mean that errors could affect our work.

If required to mitigate these risks, we can override the algorithms and return premises to the categorisation assigned at Public Review, but only do so if there is a very strong reason - otherwise we risk skewing procurements away from the hardest to reach premises. Note that we cannot “upwards adjust” the Public Review categorisation - Black/Grey = always out of scope, UR = at most Deferred Scope.

Reasons we may override the algorithms are: premises cannot currently access superfast speeds or have historically been overlooked by interventions, or there is specific local knowledge of delivery issues which make inclusion in the procurement scope the best value option for delivery.
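The override constraint described above (no "upwards adjustment" beyond the Public Review categorisation) can be expressed as a ceiling on scope. The following is a minimal Python sketch; the names apply_override, PR_CEILING, and the scope labels are hypothetical and are used only for illustration.

```python
# Scope labels ordered from least to most included (hypothetical names).
SCOPE_RANK = {"descoped": 0, "deferred": 1, "initial": 2}

# Highest scope each Public Review class may reach:
# Black/Grey are always out of scope, UR is at most Deferred Scope.
PR_CEILING = {
    "White": "initial",
    "UR": "deferred",
    "Grey": "descoped",
    "Black": "descoped",
}


def apply_override(pr_class, algorithm_scope, override):
    """Return the final scope for one premise.

    With override=True the premise returns to the scope implied by its
    Public Review class. Otherwise the algorithm's result stands, capped
    defensively at the same ceiling.
    """
    ceiling = PR_CEILING[pr_class]
    if override:
        return ceiling
    return min(algorithm_scope, ceiling, key=SCOPE_RANK.__getitem__)
```

For example, overriding a descoped UR premise returns it to deferred scope at most, while overriding a Grey or Black premise leaves it out of scope.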

Updates to this page

Published 15 September 2025