Research and analysis

Better Outcomes through Linked Data: Local authority homelessness and rough sleeping data linking

Published 11 December 2025

Applies to England

Foreword

The Ministry of Housing, Communities and Local Government (MHCLG) is committed to following an evidence-informed approach to reducing homelessness and rough sleeping.

This study presents analysis of linked data on people sleeping rough and people in the statutory homelessness system in Cheshire West and Chester. The research was conducted by Cheshire West and Chester as a pilot study as part of the Better Outcomes through Linked Data (BOLD) programme led by the Ministry of Justice. In this research, the Homelessness Case Level Information Collection (H-CLIC) has been linked to data held by rough sleeping teams at a local level, with the aim of establishing the possibilities and insights local linking could create in homelessness and rough sleeping. Cheshire West and Chester plan to publish a full report of the findings.

Linked datasets have the potential to enhance existing data, providing a richer source of information and more context on the ongoing issues faced by vulnerable people aided by government departments. This evidence base is essential to help MHCLG and local government make data-driven decisions to improve the lives of those affected by homelessness and rough sleeping.

I would like to thank our collaborators at Cheshire West and Chester for conducting the research, and the BOLD Homelessness Pilot team for summarising the findings here.

This work has given MHCLG the opportunity to collaborate on exciting areas of research and improve our evidence base, for which I am especially grateful.

Stephen Aldridge

Director for Analysis and Data & Chief Economist

Ministry of Housing, Communities and Local Government

Executive summary

This report summarises the findings from analysis of linked statutory homelessness and rough sleeping data in one local authority. The analysis was conducted by the Ministry of Housing, Communities and Local Government (MHCLG) in partnership with Cheshire West and Chester Council (CWAC) as part of the Better Outcomes through Linked Data (BOLD) programme. For more information on BOLD, see Ministry of Justice: Better Outcomes through Linked Data (BOLD).

Background

The BOLD programme provided a grant to enable Cheshire West and Chester Council to link their homelessness and rough sleeping data and analyse the resulting dataset.

The aim was to explore user journeys through statutory homelessness and rough sleeping so more targeted support could be provided, especially for those who are not accessing support to which they are entitled. Additional aims were to identify whether any particular groups of people were being systematically excluded from data collections, and to understand the feasibility of running similar data linking projects at a local level in the future.

Headline findings

Most people sleeping rough in Cheshire West and Chester in 2023 (85%) were found in the statutory homelessness dataset in the same year. Most (78%) people in both datasets had approached Housing Options before being seen sleeping rough in 2023.

People with support needs were generally less likely to approach Housing Options or make a homelessness application than those without reported support needs. However, this was not the case if they had an offending history, which may be because there is a duty upon prison and probation services to refer people at risk of homelessness to local authorities for support.

The vast majority (94%) of people sleeping rough who made a homelessness application were found to either not be in priority need or be intentionally homeless, and therefore not entitled to ongoing main duty homelessness support.

Linking these datasets produced useful insights but, due to data quality challenges and inflexible data management systems, the linking process was labour-intensive and time-consuming.

Please note findings may not be generalisable as they come from one local authority only. In addition, analysis of subgroups, such as people with support needs, had small sample sizes and should be treated with caution.

1. Background

The Ministry of Housing, Communities and Local Government (MHCLG), previously the Department for Levelling Up, Housing and Communities (DLUHC), partnered with Cheshire West and Chester Council (CWAC) to link data on people sleeping rough and people in the statutory homelessness system at a local level. This work was undertaken as part of the Better Outcomes through Linked Data (BOLD) programme.

The BOLD programme is led by the Ministry of Justice in partnership with MHCLG, the Department of Health and Social Care, the Welsh Government and Public Health Wales. It was created to demonstrate how people with complex needs can be better supported by linking and improving the government data held on them in a safe and secure way. BOLD has initially focused on four pilot areas: reducing homelessness, substance misuse, re-offending and supporting victims of crime.

The BOLD programme provided a grant to enable Cheshire West and Chester Council to link their homelessness and rough sleeping data and analyse the resulting dataset.

The aim was to explore user journeys through statutory homelessness and rough sleeping so more targeted support could be provided, especially for those who are not accessing support to which they are entitled. Additional aims were to identify whether any particular groups of people were being systematically excluded from data collections, and to understand the feasibility of running similar data linking projects at a local level in the future.

2. Methodology

The homelessness dataset at Cheshire West and Chester is held within a system called Housing Jigsaw MRI (referred to as “the case management system”) which is used by the Council’s Housing Options team to record housing advice and homelessness activities under Part 7 of the Housing Act 1996.

The rough sleeping dataset is an Excel spreadsheet completed on a daily basis by the commissioned outreach team to record information about people sleeping rough. The spreadsheet was introduced to help complete and submit the Rough Sleeping Monthly Survey.[footnote 1]

The case management system uses two unique identifiers to identify people and cases, namely Customer ID and Case ID. The people in the rough sleeping dataset are identified by their names and date of birth. 

To match records from both sources, exact matches were first paired on first name, last name and date of birth. This yielded a match for 230 people. Because names and dates of birth were recorded inconsistently between the two datasets, fuzzy matching (a method for linking data which involves identifying similar, but not identical, elements in datasets) was then employed to improve the matching rate. The linking method had a default probability threshold of 70%, meaning matches would be made if two records were at least 70% likely to be a true match. This was reduced to 50%, with additional manual review. Due to the nature of the datasets, the data cleaning and manual review were time-consuming and labour-intensive.
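The two-stage approach described above can be illustrated with a minimal sketch. It is not the council's linking tool, which this summary does not name. It assumes Python's standard-library `difflib` string similarity as a rough stand-in for a match probability, and the field names (`first_name`, `last_name`, `dob`) and the `link` function are hypothetical.

```python
from difflib import SequenceMatcher


def normalise(value: str) -> str:
    """Lower-case and collapse whitespace so trivial differences do not block a match."""
    return " ".join(value.lower().split())


def record_key(record: dict) -> str:
    """Combine the three linking fields (hypothetical names) into one comparable string."""
    return "|".join(normalise(record[f]) for f in ("first_name", "last_name", "dob"))


def link(outreach: list[dict], cases: list[dict], threshold: float = 0.7):
    """Return (exact, candidates).

    exact: pairs whose normalised name and date of birth are identical.
    candidates: (outreach record, best case record, score) where the
    similarity score is at least `threshold`. The score is a string
    similarity ratio, an assumed proxy for a match probability.
    """
    by_key = {record_key(c): c for c in cases}
    exact, candidates = [], []
    for person in outreach:
        key = record_key(person)
        if key in by_key:
            exact.append((person, by_key[key]))
            continue
        # No exact match: score against every case record and keep the best.
        best_score, best_case = max(
            ((SequenceMatcher(None, key, k).ratio(), c) for k, c in by_key.items()),
            key=lambda pair: pair[0],
            default=(0.0, None),
        )
        if best_case is not None and best_score >= threshold:
            candidates.append((person, best_case, round(best_score, 2)))
    return exact, candidates
```

Lowering `threshold` (for example from 0.7 to 0.5) returns more candidate pairs, which is why a lower setting would normally be paired with manual review of the candidates before they are accepted.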

After exact and fuzzy matching were applied, 370 single people out of the 437 in the rough sleeping dataset were identified as having a positive match in the case management system. These are referred to as being recorded in both datasets.

The main finding of the data linking process was that there are limitations when linking one complex system with another. Common issues were highlighted around inconsistent data attributed to manual inputting, such as duplicate persons, misspelt names and empty fields. Data cleansing, merging and formatting enabled the data to be integrated into one source.

3. Findings

Of the 1,353 people included in this analysis, 1,286 had made a homelessness application and 437 were recorded in the rough sleeping data set from daily outreach patrols.

370 people (85% of people sleeping rough) were both recorded in the rough sleeping data set and had approached Housing Options for homelessness support (either housing advice or a full homelessness application).

67 people (15%) were recorded in the rough sleeping data set but were not recorded in the homelessness data set within the 12 months. 288 people (78%) of the 370 found in both datasets had approached Housing Options before being recorded as sleeping rough in 2023.
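The headline percentages above follow directly from the reported counts. The short check below reproduces them; the variable names and the `pct` helper are illustrative only.

```python
# Counts as reported in this section.
rough_sleeping_total = 437   # people in the rough sleeping dataset
in_both = 370                # also found in the homelessness dataset
rough_sleeping_only = 67     # not found in the homelessness dataset
approached_first = 288       # approached Housing Options before being seen sleeping rough


def pct(part: int, whole: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


# The two groups should account for everyone in the rough sleeping dataset.
assert in_both + rough_sleeping_only == rough_sleeping_total

print(pct(in_both, rough_sleeping_total))              # 85
print(pct(rough_sleeping_only, rough_sleeping_total))  # 15
print(pct(approached_first, in_both))                  # 78
```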

70% of single people who made a homelessness application were male, and 82% of those who both slept rough and made a homelessness application were male.

Of the 370 people in both data sets, 252 people (68%) made a homelessness application (they met the reason to believe threshold and had a needs and circumstance assessment taken). 118 people (32%) did not make a homelessness application but were given housing advice.

People with support needs were much less likely to approach Housing Options or make a homelessness application, with 80% (54 of 67) of those who did not make an application having one or more support needs compared with 49% (181 of 370) of people who did make an application.

This is a consistent pattern across support needs except for people with an offending history, which may be because there is a duty upon prison and probation services to refer people at risk of homelessness to local authorities for support. 24% of people sleeping rough who made a homelessness application had an offending history, compared with <15% of people sleeping rough who did not make an application. Further detail on support needs is presented in Table 1.

Table 1: People sleeping rough were less likely to make a homelessness application if they had one or more support needs

| Support needs | People sleeping rough who did not make a homelessness application (67) | People sleeping rough who did make a homelessness application (370) |
| --- | --- | --- |
| No support needs | 13 (20%) | 189 (51%) |
| One or more | 54 (80%) | 181 (49%) |
| Mental health | 50 (75%) | 115 (31%) |
| Offending history | <10 (<15%) | 90 (24%) |
| Physical ill health | 19 (28%) | 66 (18%) |
| Substance dependency | 27 (40%) | 62 (17%) |
| Alcohol dependency | <20 (<30%)* | 39 (11%) |
| Learning disability | Not recorded in the rough sleeping dataset | 15 (4%) |

*Additional numbers were suppressed in this table to minimise the risk of disclosure.

A table of the number and percentage of people sleeping rough who did and did not make homelessness applications in 2023 in Cheshire West and Chester, by support need.

Of the 252 people sleeping rough who made a full homelessness application:

  • 117 people were not found to be in priority need. Of these, 70 people reappeared in the homelessness dataset and were seen by outreach on average another 10 times.
  • 22 people were found to be in priority need and not intentionally homeless. Of these, 13 people reappeared in the homelessness dataset and were seen by outreach on average another 4 times.
  • 28 people were found to be in priority need but intentionally homeless. Of these, 16 people reappeared in the homelessness dataset and were seen by outreach on average another 16 times.
  • 15 people had their applications withdrawn (no decision made). Of these, 12 people reappeared in the homelessness dataset and were seen by outreach on average another 2 times.

Separate to the 370 people recorded in both datasets, a further 120 people who approached Housing Options for assistance stated they were sleeping rough but were not recorded in the rough sleeping dataset during the 12 months. The demographics of this cohort are included in Table 2.

Table 2: Demographic statistics for respondents who reported to Housing Options that they were sleeping rough but were not recorded in the rough sleeping dataset during the 12 months (n=120)


| Category | Slept rough but not recorded in the rough sleeping dataset (%) |
| --- | --- |
| All respondents | n=120 |
| Gender* | |
| Men | 75 |
| Women | 25 |
| Other | - |
| Nationality | |
| UK | 37 |
| EU/EEA | <5 |
| Non-EU/EEA | <5 |
| Non-response | <65 |
| Age | |
| 18-30 | 36 |
| 31-40 | 33 |
| 41-50 | 17 |
| 51+ | 14 |
| Homeless assessment | |
| Owed a prevention duty | <5 |
| Owed a relief duty | 33 |
| Not owed a duty | <65 |

*Gender outputs have been rounded to the nearest 10.
**Due to the small sample size some data has been suppressed for privacy reasons.

Cheshire West and Chester plan to publish a full report of the findings.

4. Limitations

There were several limitations to this project. This report summarises data from one local authority and may not be relevant to other areas. Care must be taken when drawing conclusions from the data, especially when looking at smaller groups such as those who were recorded as sleeping rough but did not make a homelessness application. Findings should be regarded as indicative.

This report only covers people recorded in one or both of the datasets in 2023. This is because the council’s IT system for recording H-CLIC changed in December 2022, with the case management system replacing Locata. Individuals who appeared in the homelessness data before January 2023, but not since then, were not included.

The case management system is complex, with a large number of reporting fields, some of which are incorrectly populated or left incomplete at the triage stage. This can lead to incorrect information within the dataset.

Within both datasets there are common issues such as names being misspelt, or middle names used as forenames in one dataset but not the other. Dates of birth sometimes appeared in the US format (month before day). Both issues required manual matching and amending.
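One way to reduce the manual effort on dates is to flag only the values that are genuinely ambiguous. The sketch below is an assumption about how this could be done, not a description of the council's process. It assumes dates of birth are held as slash-separated text, and the `parse_dob` function is hypothetical.

```python
from datetime import date, datetime


def parse_dob(text: str) -> tuple[date, bool]:
    """Parse a slash-separated date of birth, preferring UK day/month/year.

    Returns (date, ambiguous). `ambiguous` is True when the text is also a
    valid US month/day/year date naming a different day, so the record
    should be checked by a person rather than accepted automatically.
    """
    def attempt(fmt: str):
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            return None

    uk = attempt("%d/%m/%Y")
    us = attempt("%m/%d/%Y")
    if uk is None and us is None:
        raise ValueError(f"unrecognised date of birth: {text!r}")
    if uk is None:
        # Only valid as month/day/year, e.g. 12/25/1980.
        return us, False
    return uk, us is not None and us != uk
```

Under these assumptions, `parse_dob("12/25/1980")` can be corrected automatically because it is only valid in the US order, whereas `parse_dob("03/04/1980")` is returned with `ambiguous` set to True because it could mean 3 April or 4 March.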

There were also instances of duplication through multiple case IDs in the case management system. While there were 381 unique customer IDs in the case management data for people who were also identified in the rough sleeping dataset, manual checks revealed that 11 of these were likely to be duplicates, as they were recorded with the same demographics, contact details and National Insurance numbers.
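A check of this kind can be partly automated by grouping customer records that share identifying details. The following is a minimal sketch under stated assumptions: the field names (`customer_id`, `ni_number`, `dob`) and the `likely_duplicates` function are hypothetical, and grouping on National Insurance number plus date of birth is only one possible rule. Groups it returns would still need manual confirmation.

```python
from collections import defaultdict


def likely_duplicates(customers: list[dict]) -> list[list[str]]:
    """Group customer IDs that share a National Insurance number and date of birth.

    Records with no National Insurance number are skipped because they
    cannot be compared on this rule.
    """
    groups = defaultdict(list)
    for customer in customers:
        # Normalise spacing and case so "ab 12 34 56 c" equals "AB123456C".
        ni = (customer.get("ni_number") or "").replace(" ", "").upper()
        if not ni:
            continue
        groups[(ni, customer.get("dob"))].append(customer["customer_id"])
    # Only groups with more than one customer ID are possible duplicates.
    return [ids for ids in groups.values() if len(ids) > 1]
```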

The inherent limitations of the data must also be considered. The hidden nature of rough sleeping, especially in certain demographics such as women, means there will be individuals sleeping rough who do not interact with outreach teams. Others may interact but not engage or provide personal details. Similarly, for legal reasons or lack of awareness around available support, some people experiencing homelessness will not approach Housing Options. There is a risk this might lead to lower representation of specific groups such as non-UK nationals.

5. Points for consideration for local authorities conducting linking projects

Linking data between the two systems generated useful insights for the council which have informed policy development and operational changes in Housing Options service delivery since they were generated. Councils should consider whether joining these data systems will allow them to better support their residents.

The council created the following points for consideration for local authorities seeking to conduct data linking:

  • Creating a unique identifier (such as a client ID) for each person
  • Establishing guidelines for formats of fields such as names and dates, and building validation checks into software systems (e.g. only allowing dates to be entered in a certain format; not allowing the user to proceed without completing necessary fields)
  • Establishing a quality assurance process, including running regular deduplication processes and establishing processes for searching people when recording new interactions, to prevent duplicates
  • Ensuring information is accessible and understood between teams, including such practices as keeping an up-to-date data dictionary; if there are separate homelessness and rough sleeping teams, communicating all proposed changes in data fields to one another; and keeping a singular universal file in the Cloud/SharePoint accessible to all users
  • Requesting the software provider:
    • Pre-populate fields when adding a new record for the same person – this will minimise the need for repetition on the part of the person entering data, therefore resulting in a lower risk of typing errors and a lower risk of duplication
    • Allow direct downloads of data into a common format such as CSV
    • Accommodate data transfers or downloads when switching provider
  • Ensuring there is a process to account for any gaps during transition periods when switching provider
  • Keeping data for as long as it has the potential to be useful, while adhering to data protection rules. For example, it may be beneficial to retain records for several years if it can help to identify when people are returning to sleeping rough after a long period of no contact
  • Choosing a suitable linking method depending on the completeness of the data and the project aims. Several probability thresholds for probabilistic matching may need to be tested
  • Recognising the limitations of the linking, including that the analysis will be limited to contacts which occurred within a local area (i.e., it might not provide insight into people who have previously slept rough in other parts of the country)
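The validation checks suggested in the list above (a fixed date format and mandatory fields) can be sketched as follows. This is illustrative only: the required fields, the YYYY-MM-DD format and the `validation_errors` function are assumptions, and a real case management system would apply such checks within its own data entry forms.

```python
from datetime import datetime

# Hypothetical set of fields that must be completed before a record is saved.
REQUIRED_FIELDS = ("first_name", "last_name", "dob")


def validation_errors(record: dict) -> list[str]:
    """Return a list of problems with a new record; an empty list means it can be saved."""
    errors = [
        f"{field} is required"
        for field in REQUIRED_FIELDS
        if not str(record.get(field) or "").strip()
    ]
    dob = str(record.get("dob") or "").strip()
    if dob:
        try:
            # Assumed house format: year-month-day.
            datetime.strptime(dob, "%Y-%m-%d")
        except ValueError:
            errors.append("dob must use the YYYY-MM-DD format")
    return errors
```

For example, a record with a date of birth entered as `02/01/1980` would be rejected with a format message, and a record with a blank last name would be rejected until the field is completed.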

[footnote 1] Initially collected for the ‘Everyone In’ scheme during the pandemic, the monthly Rough Sleeping Survey helps monitor local authority performance and accountability towards ending rough sleeping.