MHCLG: Safer Greener Buildings Building Image Recognition Tool

Ascertains an accurate estimate of residential building height and an understanding of the external wall system materials used.

Tier 1 Information

1 - Name

Safer Greener Buildings Building Image Recognition Tool

2 - Description

MHCLG has developed the Safer Greener Buildings Image Recognition Tool to ascertain an accurate estimate of building height and an understanding of the external wall system materials used on residential buildings. There is no reliable source of data on building heights in England or on the types of external wall systems fitted to them. Ordnance Survey (OS) maps have proven to be unreliable, and we estimate that there are circa 75k residential mid rise buildings in England out of a pot of circa 500k OS buildings. The image recognition tool is fed buildings that could potentially be mid rise based on the OS data. The algorithm then finds the building, rectifies the image and identifies windows in vertical planes to count the number of storeys, allowing the department to ascertain an accurate estimate of building height. The next stage of the process is to ascertain the external wall system materials. The algorithm has been trained to recognise systems based on training data from MHCLG high rise data collections.

3 - Website URL

N/A

4 - Contact email

chiefdataoffice@levellingup.gov.uk

Tier 2 - Owner and Responsibility

1.1 - Organisation or department

Ministry of Housing, Communities and Local Government (MHCLG)

1.2 - Team

Safer and Greener Buildings Analysis

1.3 - Senior responsible owner

Director - Remediation Policy

1.4 - External supplier involvement

Yes

1.4.1 - External supplier

Urban Tide

1.4.2 - Companies House Number

09222463

1.4.3 - External supplier role

The supplier’s contractual role is to develop, run and host the algorithms. They will package the algorithms as a product and document the tool ready for handover to MHCLG ahead of the end of the contract.

1.4.4 - Procurement procedure type

The tool was procured via a call-off contract, which is a contract between a supplier and buyer for the provision of services, goods or works. The contract was let using the Dynamic Purchasing System (DPS), a type of commercial agreement available through Crown Commercial Service (CCS) that allows suppliers to join at any time during the agreement’s life.

1.4.5 - Data access terms

The supplier has limited access to specific data while the contract is ongoing, with a requirement to delete the data and confirm deletion once the contract finishes.

Tier 2 - Description and Rationale

2.1 - Detailed description

The first action the algorithm undertakes is to build an image pipeline from images of the buildings in the Ordnance Survey pot of potential buildings. This process involves improving the quality of images by removing potential obstacles, stretching the image to account for the angle of the camera, and so on. Next the tool identifies windows on all horizontal and vertical planes. This ensures that if a window on a vertical plane is missed by the algorithm, it can still be captured on a horizontal plane, allowing the correct number of storeys to be counted. Following this, the probability of each external wall system material is estimated using training data provided by MHCLG on different external wall systems from our various data collections.

2.2 - Scope

The algorithm has been designed to ascertain an accurate estimate of building height and an understanding of the external wall system materials used on residential buildings, to support policy engagement for potential schemes offered by the department.

2.3 - Benefit

The key benefit is to have a tool that can quickly identify the height and external wall materials of buildings for use by the department to identify buildings that may require further assessment for proactive pull in to one of the Government’s remediation schemes. The justification for the use of the tool is that the schemes are open to all buildings within height scope and there are no disadvantages to other buildings which the tool fails to identify as they still have the same access to the schemes and funds as buildings identified by the tool.

2.4 - Previous process

There is no pre-existing data set outlining an accurate estimate of building height and the external wall system materials used on residential buildings. MHCLG tested the viability of an AI solution via discovery, proof of concept and pilot programmes. The results at each stage showed the benefits and feasibility of the project and allowed the department to gain approval from the Shadow Remediation Board (chaired by the Safer Greener Buildings Director General) to move to the next stage of the work.

2.5 - Alternatives considered

During the discovery exercise to establish a tool that would aid in this process, MHCLG considered satellite imaging and other mapping tools. MHCLG has also reviewed options to use other sources of information, such as information from building surveyors, Fire and Rescue Services, Cladding Action Groups and Local Authorities. These sources are used alongside the algorithm, but due to the large number of buildings in scope and the time it takes to identify them manually, they will not cover all buildings in England at the required pace. The methods therefore support each other, allowing MHCLG and our stakeholders to build a comprehensive data set as quickly as possible.

Tier 2 - Decision making Process

3.1 - Process integration

Results from the tool are provided to the Department and its partners. They manually review each building identified by the algorithm and decide whether or not a further assessment is necessary.

3.2 - Provided information

The algorithm provides the address, Unique Property Reference Number (UPRN), longitude and latitude of each building, which are sourced from Ordnance Survey, together with the estimated building height and external wall system materials used. In the future we will work with the delivery partners to see whether they would find the front end User Interface useful, as it allows easier filtering and searching of the buildings and output data.
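The output fields listed above could be represented as a simple record. The following is a minimal sketch in Python, not the tool's actual schema: the class name, field names and helper functions are all hypothetical, and the 5 to 7 storey band is taken from the model task description (4.2.3).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BuildingResult:
    """One row of tool output. Field names are illustrative assumptions."""
    uprn: str                         # Unique Property Reference Number (from OS)
    address: str                      # from OS
    longitude: float                  # from OS
    latitude: float                   # from OS
    estimated_storeys: Optional[int]  # None when no usable image was available
    wall_materials: List[str]         # predicted external wall system materials


def in_mid_rise_scope(result: BuildingResult, low: int = 5, high: int = 7) -> bool:
    """True when the storey estimate falls inside the mid rise band."""
    if result.estimated_storeys is None:
        return False
    return low <= result.estimated_storeys <= high


def for_manual_review(results: List[BuildingResult]) -> List[BuildingResult]:
    """Select the buildings that would be passed on for manual review."""
    return [r for r in results if in_mid_rise_scope(r)]
```

Keeping the storey estimate optional makes it explicit that some buildings have no usable image, rather than silently recording a zero.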

3.3 - Frequency and scale of usage

More than 4,500 buildings have been provided to our delivery partner as part of the Pilot period, and they have manually reviewed more than 2,000. MHCLG plans to run the algorithm on the whole of England and provide the results to our delivery partner to direct engagement. Following this initial run on all buildings in England, we will carry out further development of the algorithm and rerun it to achieve better results. These improved results will then be provided to our delivery partners to complete engagement with relevant stakeholders not contacted in the first round.

3.4 - Human decisions and review

Each building is manually reviewed by our delivery partner prior to making a decision on whether or not a further assessment is necessary.

3.5 - Required training

Currently the algorithm is being run and managed by our supplier, so no training is required. Following the end of the contract the algorithm will be packaged and provided to us. At this point training and skills will be required to deploy it in our environment, but it is not yet clear what these skills are.

3.6 - Appeals and review

As use of the algorithm will not disadvantage anyone, appeals are not relevant in this case. If building owners are unhappy about being contacted, they will be able to follow our delivery partner’s complaints processes. If the source of the complaint is attributable to the algorithm, this would be fed back to MHCLG, who would review the complaint and take action if required.

Tier 2 - Tool Specification

4.1.1 - System architecture

The development of the system relies on the robust and secure uSmart environment, which utilises the AWS stack and serves as the foundation for all processes involved. Using uSmart and AWS ensures a highly reliable and well-documented development environment. Our teams and processes strictly adhere to the latest ISO 27001 standards and Cyber Essentials Plus requirements to maintain the highest security standards, safeguarding against potential technology-related risks.

The underlying algorithms powering the system are not only based on extensive academic research but have also undergone thorough scrutiny through peer-reviewed publications. This commitment to academic rigour ensures that the algorithms are grounded in scientific knowledge and have been subjected to external validation and assessment.

Throughout the development cycle, a continuous improvement approach was adopted. This means that as the system evolved, each component underwent development and refinement, benefiting from iterative feedback loops and incorporating lessons learned from previous stages. This iterative approach reduced the risk of overlooking critical technological aspects and allowed for gradually enhancing the overall solution.

To ensure transparency and ease of collaboration, all the steps involved in the development process are fully documented. This documentation is a valuable resource that can be shared with new project teams, enabling them to familiarise themselves with the intricacies of the multi-stage process quickly. By facilitating knowledge transfer, the system can be further explored, refined, and developed by future teams.

The system mitigates potential technology-related risks by incorporating a robust technology development approach, leveraging secure infrastructure, adhering to established standards, and basing the algorithms on solid academic research. The continuous improvement cycle and comprehensive documentation further enhance the system’s resilience and enable the seamless integration of new teams into the development process.

4.1.2 - Phase

Beta/Pilot

4.1.3 - Maintenance

It is in pilot so is being reviewed and improved.

4.1.4 - Models

Image enhancement and recognition models

Tier 2 - Model Specification

4.2.1 - Model name

Safer Greener Buildings Building Image Recognition Tool

4.2.2 - Model version

V12

4.2.3 - Model task

MHCLG has developed the Safer Greener Buildings Image Recognition Tool to ascertain an accurate estimate of building height and an understanding of the external wall system materials used on residential buildings. There is no reliable source of data on building heights in England or on the types of external wall systems fitted to them. Ordnance Survey (OS) maps have proven to be unreliable, and we estimate that there are circa 75k residential mid rise buildings in England out of a pot of circa 500k OS buildings that could potentially be mid rise. The image recognition tool is fed buildings that could potentially be mid rise based on the OS data. The algorithm then finds the building, rectifies the image and identifies windows in vertical planes to count the number of storeys. The number of storeys is then used as a proxy for height, allowing us to identify the mid rise (5 to 7 storey) building stock. The next stage of the process is to ascertain the external wall system materials used on residential buildings. The algorithm has been trained to recognise external wall system materials based on training data from MHCLG high rise data collections.

4.2.4 - Model input

Potential mid rise buildings from OS data, Google Street View images and MHCLG External Wall System training data from our funds and data collections.

4.2.5 - Model output

The tool enabled the department to build a data set outlining an accurate estimate of residential building heights and the external wall system materials used.

4.2.6 - Model architecture

The algorithm uses an image recognition model.
The storey counting algorithm is a rule-based pipeline incorporating the following elements:
Image rectification (partially adapted from [1]) based on features from Line Segment Detector [2] to correct for image skew.
A Mask-R-CNN model [3] with an additional flood-filling step to identify and mask the building facade within the image.
A CNN model (fine-tuned from [4]) for identifying windows.
Agglomerative clustering for grouping windows into building storeys.
[1] https://github.com/chsasank/Image-Rectification
[2] http://www.ipol.im/pub/art/2012/gjmr-lsd/article.pdf
[3] https://github.com/matterport/Mask_RCNN
[4] https://github.com/lck1201/win_det_heatmaps/
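The final element, grouping detected windows into storeys, can be illustrated with a small example. This is a minimal sketch only: the record does not state which features, linkage criterion or stopping rule the tool uses, so the choice of window centre heights as the single feature, single-linkage merging and the max_gap threshold are all assumptions, as are the function names.

```python
from typing import List


def cluster_storeys(window_centres_y: List[float], max_gap: float) -> List[List[float]]:
    """Group window centre heights (in pixels) into storeys.

    Single-linkage agglomerative clustering in one dimension: start with one
    cluster per window and keep merging the two closest clusters until the
    closest remaining pair is more than max_gap apart.
    """
    clusters = [[y] for y in sorted(window_centres_y)]
    while len(clusters) > 1:
        # In one dimension the closest pair of clusters is always adjacent
        # once sorted, so only neighbouring clusters need to be compared.
        gaps = [clusters[i + 1][0] - clusters[i][-1] for i in range(len(clusters) - 1)]
        i = min(range(len(gaps)), key=gaps.__getitem__)
        if gaps[i] > max_gap:
            break
        # Merge cluster i with its neighbour; the merged list stays sorted.
        clusters[i:i + 2] = [clusters[i] + clusters[i + 1]]
    return clusters


def count_storeys(window_centres_y: List[float], max_gap: float) -> int:
    """Number of storeys implied by the window clusters."""
    return len(cluster_storeys(window_centres_y, max_gap))


# Hypothetical window centres from a rectified facade image (pixel rows).
example_centres = [102, 98, 105, 201, 198, 300, 304, 297, 402]
example_storeys = count_storeys(example_centres, max_gap=30)
```

With these made-up values the windows fall into four well-separated rows, so example_storeys is 4. In practice the threshold would need tuning to the image scale, which the record does not describe.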

4.2.7 - Model performance

Overall, the model performed acceptably for its purpose. Floor counting accuracy on a sample of 1000 buildings in Bristol was 60% compared to OS data which was 44% accurate. When looking at edge cases (modern buildings where the algorithm performed less well) the accuracy rate dropped to 46% for calculating the exact number of floors, however for in scope buildings (5-7 storeys) the accuracy rate was 86%. This is because when the algorithm is incorrect it tends to be 1 storey out, therefore when looking at a 5-7 storey range as opposed to the exact number of storeys the overall accuracy rate is much higher.
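The effect described above, where errors of one storey hurt exact accuracy much more than accuracy over the 5 to 7 storey range, can be shown with a toy calculation. This is a sketch under stated assumptions: the record does not define how the in scope accuracy was computed, so treating it as agreement on band membership is one possible reading, and the sample pairs below are invented for illustration and are not the Bristol data.

```python
from typing import List, Tuple

# (predicted storeys, true storeys) pairs: invented illustrative values.
toy_pairs: List[Tuple[int, int]] = [
    (5, 5), (6, 5), (6, 6), (7, 6), (5, 6),
    (7, 7), (8, 7), (4, 5), (3, 3), (9, 9),
]


def exact_accuracy(pairs: List[Tuple[int, int]]) -> float:
    """Share of buildings whose predicted storey count equals the true count."""
    return sum(pred == true for pred, true in pairs) / len(pairs)


def band_accuracy(pairs: List[Tuple[int, int]], low: int = 5, high: int = 7) -> float:
    """Share of buildings where prediction and truth agree on whether the
    building falls inside the low-to-high storey band."""
    def in_band(storeys: int) -> bool:
        return low <= storeys <= high

    return sum(in_band(pred) == in_band(true) for pred, true in pairs) / len(pairs)


toy_exact = exact_accuracy(toy_pairs)  # 5 of 10 exact matches
toy_band = band_accuracy(toy_pairs)    # 8 of 10 agree on band membership
```

In this toy sample half of the predictions are one storey out, yet only the two that cross a band boundary count as errors under the band measure.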

4.2.8 - Datasets

OS data, Google Street View images, MHCLG data collections of external wall systems

4.2.9 - Dataset purposes

OS data is used to feed the model with potential buildings in scope. Google Street View images are used to count the number of storeys and to identify the external wall system materials used. MHCLG data collections of external wall systems are used to train the algorithms.

Tier 2 - Data Specification

4.3.1 - Source data name

Ordnance Survey mapping data and Google Street View

4.3.2 - Data modality

Geospatial data

4.3.3 - Data description

Building storey height and images of external building walls

4.3.4 - Data quantities

Data from around 15,000 buildings was used for training, with further images then collected from the Street View API.

4.3.5 - Sensitive attributes

The data used is about buildings and not address level so no personal data is processed in the tool. Addresses to contact responsible individuals for follow up action are outside the scope of the tool.

4.3.6 - Data completeness and representativeness

OS building height data isn’t accurate but the data is representative of the target population. Some Street View images are of poor quality or are unavailable, so the data is not complete, but it is complemented by other methods to identify buildings in scope.

4.3.7 - Source data URL

Source data from Ordnance Survey is not openly accessible but used within government under the Public Sector Geospatial Agreement. https://www.ordnancesurvey.co.uk/customers/public-sector/public-sector-geospatial-agreement
The Street View API is supplied under licence by a third-party reseller.

4.3.8 - Data collection

Images of the buildings fed into the algorithm from OS data are collected using the Street View API. This has been approved by the Google Street View approved reseller, which has confirmed that we are complying with Google’s T&Cs.

4.3.9 - Data cleaning

Prior to collecting the data, the Street View image is manipulated using the API to provide a better quality image. The image is then processed by removing trees and similar obstructions, changing the orientation of the image to account for camera tilt, and changing the shading and colouring of the image to help improve the algorithm results.
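The record does not say which API settings are adjusted before an image is collected. As a minimal sketch, assuming the reseller's service accepts the same request parameters as Google's public Street View Static API (location, heading, pitch, fov, size and key), a request URL could be built as follows. The function name, default values and placeholder key are hypothetical.

```python
from urllib.parse import urlencode

# Endpoint of Google's public Street View Static API. Whether the tool calls
# this endpoint directly or goes through a reseller service is not stated.
STREET_VIEW_ENDPOINT = "https://maps.googleapis.com/maps/api/streetview"


def street_view_url(lat: float, lng: float, heading: float,
                    pitch: float = 0.0, fov: int = 90,
                    size: str = "640x640", api_key: str = "YOUR_API_KEY") -> str:
    """Build a request URL for one image of a building.

    heading points the camera towards the facade, pitch tilts it up or down,
    and fov controls how much of the building fits in the frame.
    """
    params = {
        "location": f"{lat},{lng}",
        "heading": heading,
        "pitch": pitch,
        "fov": fov,
        "size": size,
        "key": api_key,
    }
    return f"{STREET_VIEW_ENDPOINT}?{urlencode(params)}"


# Hypothetical coordinates; no request is sent, only the URL is constructed.
example_url = street_view_url(51.4545, -2.5879, heading=90, pitch=10)
```

Raising the pitch or narrowing the field of view is one way a request might be tuned so that upper storeys are visible, though the adjustments actually used by the tool are not documented here.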

4.3.10 - Data sharing agreements

Data sharing as per the contract between MHCLG and the contractor. Further data sharing is in place between MHCLG and delivery partners who will act upon the results.

4.3.11 - Data access and storage

The contractor, MHCLG analysts and the scheme delivery partner have access to the data. The supplier is contractually obligated to delete the data following the end of the contract. The supplier is also contractually obligated to arrange for secure storage of the data for use by MHCLG, in line with the obligations under the call-off contract. MHCLG and delivery partners are expected to hold data for the life of the building safety remediation programme, after which it will be managed in line with departmental data retention policies.

Tier 2 - Risks, Mitigations and Impact Assessments

5.1 - Impact assessment

An algorithmic impact assessment took place in summer 2023 and the tool was deemed to be low risk. It is low risk because, in terms of impact on policy, manual checks are carried out before any decision is made, and being selected by the algorithm does not disadvantage other in-scope buildings that the algorithm did not identify and that are therefore not included in the built data set. The tool builds a data set of accurate estimates of residential building height and external wall system materials used. This data set is used by delivery partners to ascertain whether any further assessment of the building is required. Any building owner can apply for government remediation schemes regardless of being selected by this tool; the tool is a way to proactively engage with building owners.

5.2 - Risks and mitigations

The algorithm is used as one method to identify mid-rise buildings to support engagement with building owners. The main risks identified are costs (both of development and manual building checking).

Updates to this page

Published 28 April 2025