© Crown copyright 2018
This publication is licensed under the terms of the Open Government Licence v3.0 except where otherwise stated. To view this licence, visit nationalarchives.gov.uk/doc/open-government-licence/version/3 or write to the Information Policy Team, The National Archives, Kew, London TW9 4DU, or email: email@example.com.
Where we have identified any third party copyright information you will need to obtain permission from the copyright holders concerned.
This publication is available at https://www.gov.uk/government/publications/quality-assurance-of-administrative-data-in-the-uk-house-price-index/hm-land-registry-data
This is the quality assurance of administrative data (QAAD) of HM Land Registry used in the UK House Price Index (UK HPI).
HM Land Registry data used in the UK House Price Index (UK HPI) is based on the Price Paid Data (PPD) and register information collected during the registration process.
The process outlined below describes the collection, production and quality assurance methods for this data. Ownership of the PPD and the UK HPI extracted data is the responsibility of HM Land Registry’s Data Group.
As the Ordnance Survey (OS) MasterMap data is a source of information for property types on some registrations it is also considered. The official providers for the UK HPI include:
- HM Land Registry
- the Office for National Statistics
- Registers of Scotland (RoS)
- Land & Property Services (Northern Ireland Statistics and Research Agency)
2. Data sources
|Data type||Administrative source||Data quality concern||Public interest||Matrix classification||High level rationale|
|Price Paid Data||HM Land Registry||Medium||High||A2||There are sufficient safeguards in place to ensure the quality of the Price Paid Data. This is an important contributor to the overall index and a published dataset. The medium concern relates to registration delays. This has been exacerbated for new builds recently because of the operational backlog in HM Land Registry. This area will be revisited whenever appropriate.|
|Register data (cash/mortgage indicator and repossession volumes)||HM Land Registry||Low||High||A2||This data is captured as part of our statutory function and is clearly defined and understood.|
|Map data||Ordnance Survey||Medium||Low||A2||The quality concern level attributed to this data relates to the HM Land Registry process; it is not a reflection of the quality of the originating OS MasterMap offering.|
3. Overview of publication processes
HM Land Registry share monthly Price Paid Data on residential property transactions and an indicator identifying cash/mortgage sales for England and Wales with the Office for National Statistics (ONS). This information is collected as part of the registration process.
The ONS then calculate historic house price trends and model house price estimates monthly using the approved UK HPI methodology. The price data and mortgage indicator provided by HM Land Registry are combined with property attribute data from other providers, such as the Valuation Office Agency (VOA) and the Council of Mortgage Lenders (providing property size, condition and type of buyer information), to calculate the latest house price inflation estimates. These estimates are combined with the estimates from the other official providers to create the estimates for Great Britain and the United Kingdom. The resulting data is published monthly on GOV.UK by HM Land Registry.
The economic commentary on the monthly calculation in the UK summary report, and the commentaries accompanying the charts and heat maps in all reports, are produced by ONS. Before publication the commentaries are reviewed by the official providers.
HM Land Registry extract the repossession volume data before an automated system creates the report and support service information that is published.
HM Land Registry is responsible for the publication of the monthly index. This includes the online reports, downloadable data (example of downloadable data September 2017) and the updating of the interactive search tool. The official providers review the final reports before publication, the content is mutually agreed, and HM Land Registry issue the final versions at an agreed publication time.
HM Land Registry have automated the process where appropriate to reduce inefficiencies and errors. For example, the extraction of the data is automated using extract, transform and load (ETL) processes, and the naming and movement of files that aid or form part of the publication are controlled in a similar way. This prevents issues that could arise from a manual process of moving and copying data across numerous workbooks.
Read the Publication process summary (PDF, 185KB).
4. Assessment of HM Land Registry Data using the Administrative Data Quality Assurance toolkit
HM Land Registry’s Data Group have reviewed their Price Paid Data collection processes and data sources against the Administrative Data toolkit. This document outlines the collection processes and quality assurance measures against the levels within the toolkit.
The evidence meets the A2 classification: Enhanced assurance category with regards to our land and property database dataset and quality assurance processes.
4.1 Practice area 1: operational context and administrative data collection
HM Land Registry collects administrative data while fulfilling its statutory duties of creating, managing, controlling and maintaining the register for land and property in England and Wales, within the Land Registration Act 2002 and Land Registration Rules 2003. The main purpose is to populate the land register, documenting and protecting the legal rights of the owner, tenant and/or third parties. The register shows details of each title registered, accompanied by its corresponding title plan. This information is covered by HM Land Registry’s indemnity, under which HM Land Registry may be liable to pay compensation for any inaccuracies.
The collection of the PPD is undertaken as part of the registration process. This process begins when an application is lodged with HM Land Registry following the sale of a property.
The collection is completed by core registration staff across four casework systems. Each casework system processes a different type of application (first registrations of a title; transfer of part of a lease or part of a title, which could include developer sites; and transactions involving the whole of a registered title). Each system uses the same price paid capture software. In addition, standalone access to the PPD capture software is available on a restricted basis. Price Paid Data is captured upfront in the registration process.
Price Paid Data is supplied by applicants as a legal requirement when a property transfer for value (money) has occurred. This is required in order to complete the registration process. Most applicants are solicitors or conveyancers; few are companies or citizens.
The collection of the PPD is completed by core registration staff. Over time, the dataset has developed, as has our ability to automate much of the process. The most recent enhancements use information on the type of transaction taking place to determine whether it should be captured in the price paid dataset. Further categorisation within the dataset determines whether it should be extracted for use within the UK HPI, for example to report specifically on repossessions in England and Wales.
The system uses the type of application (for example a transfer not for value or a transfer under a power of sale) to determine whether Price Paid Data should be captured. If the application type has the potential to require PPD then the system will display the necessary screens that allow the caseworker to collect the attributes which make up the dataset. The attributes are linked to the property through a unique title number and combined with address data that is processed against OS AddressBase Premium® product, incorporating Royal Mail’s PAF® database.
The following fields comprise the address data included in Price Paid Data:
- Primary Addressable Object Name (PAON). Typically, the house number or name.
- Secondary Addressable Object Name (SAON). If there is a sub-building, for example the building is divided into flats, there will be a SAON.
Where the property type attribute is unavailable, the caseworker can use the OS MasterMap® information to identify the property type. By accessing a visual representation of the data it is possible to determine whether a property is a terraced or detached house for example. Detailed guidance is provided to caseworkers to help them identify any unclear property types.
See examples from the property type categorisation guide (PDF, 179KB) for detached and terraced properties.
Each entry in the price paid dataset is given a unique identifier to track or identify individual sales on properties. It can also be used to make changes or deletions if an error is detected by HM Land Registry or a user.
Price Paid Data is captured in two categories - Standard PPD and Additional PPD.
Standard refers to the sale of residential properties for full market value to private individuals.
Additional PPD relates to property sales for value to companies, buy-to-let purchases and/or non-residential properties.
The website provides more information on the Price Paid Data Attributes included in the PPD. When the information is saved, another set of business rules automatically determines whether the data is recorded as standard, additional or is to be excluded.
If a transaction is identified as falling outside both the standard and additional PPD categories, it is excluded from the dataset. See the list of exclusions from UK HPI.
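The three-way categorisation described above can be sketched as a small rule function. This is an illustration only: the field names and the order of the rules are assumptions made for the example and do not reproduce HM Land Registry's actual business rules.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    # All field names are hypothetical, chosen only for this illustration.
    for_value: bool                    # money changed hands
    full_market_value: bool            # sold at full market value
    buyer_is_private_individual: bool
    residential: bool
    buy_to_let: bool = False


def categorise(tx: Transaction) -> str:
    """Return 'standard', 'additional' or 'excluded' for a transaction."""
    if not tx.for_value:
        # Transfers not for value never enter the dataset.
        return "excluded"
    if (tx.residential and tx.full_market_value
            and tx.buyer_is_private_individual and not tx.buy_to_let):
        return "standard"
    if not tx.buyer_is_private_individual or tx.buy_to_let or not tx.residential:
        # Sales to companies, buy-to-lets and non-residential sales.
        return "additional"
    # Anything else falls outside both categories (assumed rule).
    return "excluded"
```

For example, `categorise(Transaction(True, True, True, True))` gives `"standard"`, while the same sale to a company gives `"additional"`.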
Data input at the start of the registration process creates indicators that capture whether a transaction was a cash or mortgage sale. The indicator is determined from the content of the application, such as the presence of a mortgage deed at the time of lodgement. As a result, sales where an application to register a mortgage is lodged separately will not be identified as mortgage sales. Only a small percentage of transactions that are subject to a mortgage are not lodged as one application.
While proportionally the cash/mortgage indicator is a relatively minor element of the overall data supplied for the UK HPI, the transaction codes used to identify the presence of a mortgage are subject to quality assurance checks. In July 2017, the accuracy of transaction code selection was over 95%.
There will always be a delay before the sale of a property is reflected in the register that HM Land Registry maintains, which impacts the Price Paid Information (PPI) recorded as part of the overall process. This is because the registration of a property transaction is the very last stage in the conveyancing process, which always happens after the legal completion of the transaction. How quickly this occurs depends on several factors such as:
- how quickly the application for registration is made by the conveyancer. Depending on the circumstances, this can be several weeks after completion. HM Land Registry is unable to comment or speculate as to why some applications take longer to be submitted than others. Standard practice is that conveyancers lodge a ‘priority search’ after a contract has been exchanged and just before completion. This reserves priority for the prospective application to register the transaction over any other application(s) that may be submitted for a period of thirty working days. In usual circumstances, the application will be lodged during that priority period; if there is a delay, the conveyancer may have to lodge another priority search.
- how long HM Land Registry takes to process an application once it is received, which can vary. Most applications are processed quickly; for example, updates to the register resulting from a sale or new mortgage are processed on average within seven working days. We currently handle more than 17,000 of these applications every working day. Occasionally the interval between a sale and registration is longer than two months; this is particularly true of transactions that require the creation of a new register, such as new build properties. An increase in the volume of registration applications HM Land Registry receives is causing additional delays. Applications that require a new register are processed within 58 working days on average, and more than 2,000 of them are handled every working day.
HM Land Registry is working hard to improve the time it takes to process ‘new title’ applications, by recruiting more staff and making targeted use of overtime. Average processing times are reducing; the aim is to complete all applications within 25 working days by March 2018. You can find more information about our current performance and the amount of work we are receiving and processing on GOV.UK.
The provisional (first) estimate for the UK HPI in a specific month is calculated on approximately 40% of the final registered transactions. The second and third estimates are based on approximately 80% to 90% of the final registered transactions. Revisions to the UK HPI estimates arise when the index is re-calculated to incorporate these additional transactions.
The following table summarises the revision policy introduced for the UK HPI in June 2017.
|Frequency of publication||Frequency of revisions||Period revision covered||Reason|
|Monthly||Monthly (quarterly for Northern Ireland data)||Previous 12 months (four quarters for Northern Ireland)||Inclusion of additional registered transactions|
|Monthly||Ad-hoc||The entire dataset||Improvements in methodology; large scale revisions to historic data series|
As an example, the June 2017 release contained data up to March 2017. Average prices, indices, growth rates and sales volumes previously published for all months from April 2016 to March 2017 were revised.
Similarly, Northern Ireland’s quarterly data will be revised back to Quarter 2 (April to June) 2016. This may differ from the revision policy of the Northern Ireland House Price Index published by Land and Property Services Northern Ireland.
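The 12-month revision window in the example above can be worked out mechanically. The sketch below is illustrative only; the function name `revision_window` is an assumption, and it simply counts back 12 calendar months from the latest month of data in a release.

```python
from datetime import date


def revision_window(latest_data_month: date, months: int = 12) -> list[date]:
    """Return the first day of each month in the revision window, oldest first."""
    # Work with a running month count so year boundaries are handled simply.
    end = latest_data_month.year * 12 + (latest_data_month.month - 1)
    window = []
    for index in range(end - months + 1, end + 1):
        year, month0 = divmod(index, 12)
        window.append(date(year, month0 + 1, 1))
    return window


# The June 2017 release contained data up to March 2017.
window = revision_window(date(2017, 3, 1))
print(window[0], "to", window[-1])  # 2016-04-01 to 2017-03-01
```

This reproduces the April 2016 to March 2017 range quoted for the June 2017 release.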
Ad hoc revisions
On occasions, revisions will need to be made outside the usual timeframe. Examples of such revisions include improvements to methodology, revisions to data and the discovery of incorrect data through extensive quality assurance procedures.
Each of these revisions will be examined to see whether the effects are significant in terms of the degree of change, or whether the changes affect the story the data portray. If revisions arising through improvements to methodology or changes to source data are found to be insignificant, they will be introduced in the next planned set of revisions, in line with the timetable above.
If these revisions are thought to affect analysis or are sufficiently large, they will be introduced more quickly.
When incorrect data is discovered after publication, it will be examined for its impact. When changes are significant, a correction notice will be issued as soon as it is practical. Minor corrections will be included in the next planned release. In all cases, a full explanation will be included as part of the release.
Due to the upfront capture of the data, there may be occasions where a caseworker identifies that an application does not meet the original PPD criteria for capture. If this is identified after the data has been extracted for the month, a correction will be made in the following month’s data. Corrections to the data account for less than 7% of the overall file each month, with 20% of these relating to historic data that is more than five years old.
- the system excludes properties as ‘unknown property type’ rather than report incorrect data. (Limited impact).
- current backlogs (as of November 2017) mean that transactions of new build properties are not processed immediately. This results in a bias towards resold properties in the preliminary (first) estimates in the dataset. (Medium impact).
Assessment Rating A2: Enhanced assurance – This data is captured and managed internally so there is a good understanding about the reasons for capturing the data, the process for capturing and extracting the data and any potential bias or limitations. HM Land Registry is responsible for the collection of the data and has direct input into the collection methods. Ordnance Survey details are referenced during the collection of the Price Paid Data as part of the registration process, which is a core statutory function.
4.2 Practice area 2: communication with data supply partners
The main data supply is generated from within HM Land Registry; as such, the main data supply partner is HM Land Registry’s Operations group. The Data Group are responsible for the quality and publication of HM Land Registry’s data and work closely with the Operations group to maintain the quality of the statistics, raise queries and oversee changes to the collection process.
Whilst there is no legal basis for the data supply, internal operating agreements are in place to ensure the data continues to be collected. No changes can be made to the collection process without the approval of the Data Group, which considers user needs alongside operational issues.
The Data Group also input into and review the quality assurance measures on a quarterly basis through planned meetings and reviews, to ensure the data collected remains fit for purpose. All Operational Quality Assurance checks are detailed in Practice area 3.
HM Land Registry receive regular updates from OS for OS MasterMap® and AddressBase®.
Assessment Rating A2: Enhanced assurance – with the exclusion of the Ordnance Survey data, HM Land Registry is the data provider. The Data Group have a direct relationship with the HM Land Registry Operational directorate and meet regularly to discuss processes, make changes and flag issues. The Data Group also have a close relationship with their supplier Ordnance Survey, whose data is integral to HM Land Registry’s statutory function. The team has an in-depth understanding of the data provided that feeds into the UK HPI. This data has been available as open data since 2012, resulting in regular reviews and feedback from users, which are considered as part of ongoing UK HPI work.
4.3 Practice area 3: quality assurance principles, standards and checks applied by data suppliers
There are several Quality Assurance checks in place within the Operations group of HM Land Registry to ensure the quality of the Price Paid Dataset and UK HPI is maintained. Details of these checks and where they occur in the process are outlined below:
- In-system processing checks:
  - double keying of price paid entry (initial entry is obscured)
  - postcode confirmation checks
  - where a previous application on a title has been excluded from or included in the dataset, the caseworker is prompted to confirm their choice
  - where the type of property is changed from a previous application, the caseworker is prompted to check and give a reason for the change
  - if a buy-to-let mortgage code is entered but the PPD selected is full market value, the caseworker is prompted to check the PPD entry
  - the system checks whether the proprietor entered matches the category of PPD entered; the caseworker must review this if it is not a match
- Local Office Audit checks
Each local office has a Compliance and Audit Team (CAT), which is responsible for checking the quality of the data captured on a daily and monthly basis. The system generates two types of report:
- daily exception reports
An exception report is generated daily for each office. The report details Property Price Information that was input on the previous day outside of agreed parameters. Exceptions occur when an invalid postcode (or none) was entered, when the full market value (price paid) falls outside the price band set for the area, or when the type of property has been entered as unknown. Price bands are reviewed on an annual basis by the Data Group.
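The three exception conditions for the daily report can be sketched as a single check. This is a minimal illustration: the postcode pattern is a simplified assumption rather than a full validator, and the price bands, area key and field names are hypothetical.

```python
import re

# Simplified UK postcode pattern; an assumption for illustration only.
POSTCODE_RE = re.compile(r"^[A-Z]{1,2}[0-9][0-9A-Z]? ?[0-9][A-Z]{2}$")

# Hypothetical price bands (lower, upper) in pounds, keyed by area.
PRICE_BANDS = {"AREA_A": (20_000, 1_000_000)}


def exception_reasons(entry: dict) -> list[str]:
    """Return the reasons an entry would appear on the exception report."""
    reasons = []
    postcode = (entry.get("postcode") or "").strip().upper()
    if not POSTCODE_RE.match(postcode):
        reasons.append("invalid or missing postcode")
    low, high = PRICE_BANDS[entry["area"]]
    if not low <= entry["price"] <= high:
        reasons.append("price outside area band")
    if entry.get("property_type", "unknown") == "unknown":
        reasons.append("unknown property type")
    return reasons
```

An entry with an empty list of reasons would not appear on the report; any other entry would be listed with the reasons returned.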
The Quality Key Performance Indicator (QKPI) elements and some of the wider assurance measures are monitored by the Assurance Audit Team, which is part of the Register Integrity and Assurance Group (part of Corporate Legal Services). The Assurance Audit Team are independent of HM Land Registry Operations. The Audit team check 1% of auditable registrations marked off over five consecutive working days (sample period).
The randomised sample is representative of output, ensuring the results reflect the nature and quality of entries that have been added to the register across all application types and property price data recorded. The total titles checked and total errors found are used to calculate the overall percentage accuracy achieved. If an application fails one or more check-points, the application will not pass the audit.
An application that appears on the sample will be deemed auditable if one or more check-points are capable of being checked. The information to check the accuracy of the data must be available to the auditor.
A sample period is five consecutive working days, including weekends if overtime has been worked. This gives every working day in the financial year, other than the last four working days of each month (see limitation in owner’s comments below), the potential to be selected.
All sampling periods are determined in advance, before the financial year begins, and are selected at random following strict method guidelines. Each report of completed registrations is randomised, because report content is otherwise generated in order of application age. The first auditable 1% of New Titles (NTs) and 1% of Dealings (DLGs) for each office, for each sample period, are audited. Any NT that is found to be non-auditable is replaced with the next available NT of the same category (first registration, dispositionary first lease, transfer of part).
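The sampling and accuracy calculation can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, the 1% is taken of all completed registrations and rounded up, and the same-category replacement rule for New Titles is simplified to "take the next auditable case".

```python
import math
import random


def select_audit_sample(completed, is_auditable, rate=0.01, seed=None):
    """Shuffle completed registrations and return the first auditable 1%."""
    shuffled = list(completed)
    random.Random(seed).shuffle(shuffled)
    # Rounding up to at least one case is an assumption of this sketch.
    target = max(1, math.ceil(len(shuffled) * rate))
    # Non-auditable cases are skipped, so the next available case replaces them.
    auditable = [case for case in shuffled if is_auditable(case)]
    return auditable[:target]


def accuracy_percent(total_checked, total_failed):
    """Overall accuracy: share of checked titles that passed every check-point."""
    return 100.0 * (total_checked - total_failed) / total_checked
```

For example, if 200 titles are checked and 3 fail at least one check-point, `accuracy_percent(200, 3)` gives 98.5, which would meet a 98% target.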
QKPI check-points for price paid entries are:
- when Standard PPD or Additional PPD is recorded: is the consideration entered correct?
- where a price paid/value stated entry has been made: is the consideration/value correct?
Wider assurance measures around Price Paid Information and Price Paid Data are as follows:
- has the case been correctly categorised?
- if it has been amended, is the property type correct?
- has the correct transaction type been selected?
- has a price paid/value stated entry been appropriately made/omitted/retained?
- when a new entry has been made, has the correct entry been made?
- when a new entry has been made, is the date correct?
The QKPI target is 98%, which has been achieved for the past 5 years.
For OS AddressBase®, the updates received from OS are quarterly and include:
- Unique Property Reference Number (UPRN)
- Current or live addresses
- Alternative addresses
- Provisional addresses
- Historic addresses
- Royal Mail Postcode Address File
- National Grid coordinates
- Latitude and longitude coordinates
- Four levels of classification
- Valuation Office Agency classification scheme
- Feature life cycle dates
- Local authority addresses
- Object Without Postal Address (OWPA) records
- multiple occupancy addresses
- Local authority street information
- USRN (Unique Street Reference Number)
The source data is collated, verified and quality assured by GeoPlace. There are 348 local authorities in England and Wales that input updates to their Local Land and Property Gazetteers (LLPG). These changes are submitted to the GeoPlace Hub daily, weekly or monthly as part of the update schedule.
When received these updates are checked to ensure they have been produced in accordance with the NLPG Data Entry Conventions (DEC-NLPG) and comply with the national standard for the representation of address information – BS 7666 Parts 1 and 2.
For more information refer to Ordnance Survey Product guide, AddressBase (PDF, 904KB).
For OS MasterMap, the update provides daily changes to the map database. HM Land Registry carries out monthly validation of the data to check its completeness. Caseworkers may use the TopographicLine and TopographicArea attributes to classify the property type (detached, semi-detached, terraced or flat).
Ordnance Survey MasterMap is subject to the Ordnance Survey capture specification and quality assurance principles (PDF, 3.6MB). Ordnance Survey has gained accreditation to ISO 19158: quality assurance of data supply (2012), which provides formal recognition of operating according to international standards. Since the data is not a main source of information, no specific analysis is undertaken to determine its accuracy for the purpose of the UK HPI.
Assessment Rating A2: Enhanced assurance – We have a close working relationship with both the Operational and Audit areas within HM Land Registry, which enables a deep understanding of the principles followed and an input into the quality assurance levels employed. The Audit team are independent of both the Data Group and Operations and so can provide an impartial view on the data. The Ordnance Survey data follows its own quality assurance methods. The teams are satisfied with the levels of quality assurance, considering the overall influence the data will have on the caseworker’s final decisions during the capture process for this dataset.
4.4 Practice area 4: producer’s quality assurance investigations and documentation
The Data Group monitor data quality levels of audits and raise issues with the Assurance Audit Team.
Daily quality checks are carried out by the HM Land Registry’s Operations team, while independent monthly checks are conducted by the Assurance Audit Team (as detailed in Practice Area 3). Additional checks are carried out when the data is in the price paid database. Any erroneous data is removed when cleansing business rules are applied.
In addition to the existing system checks there is also a customer reporting tool, which citizens can use to report errors. Reported errors are investigated and, where confirmed, amended within the dataset. Approximately 10 to 20 errors are reported per month (out of an average monthly file of approximately 70,000 transactions). These relate to both historic and current transactions, but on average only 60% of cases are upheld.
When HM Land Registry transaction data is received it is cleansed before it is matched with other datasets, such as the Valuation Office Agency (VOA) data. Transactions flagged by HM Land Registry for removal are deleted. Transactions with postcodes not found on the National Statistics Postcode Lookup (NSPL) are also removed. Each month, further quality assurance is conducted to ensure VOA data can successfully match HM Land Registry’s data; we currently achieve a 95% match. This means the resulting modelled house price data is of good quality.
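The cleansing and match-rate steps can be sketched as follows. This is a minimal illustration: the record layout (`id`, `postcode`, `flagged_for_removal`) and the function names are assumptions, and the postcode lookup and attribute data are represented as plain sets.

```python
def cleanse(transactions, valid_postcodes):
    """Drop transactions flagged for removal or with a postcode not on the lookup."""
    return [
        tx for tx in transactions
        if not tx.get("flagged_for_removal", False)
        and tx.get("postcode") in valid_postcodes
    ]


def match_rate(transactions, attribute_ids):
    """Percentage of cleansed transactions that have matching attribute data."""
    if not transactions:
        return 0.0
    matched = sum(1 for tx in transactions if tx["id"] in attribute_ids)
    return 100.0 * matched / len(transactions)
```

A monthly check could then compare `match_rate(cleanse(...), ...)` with the expected level of around 95%.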
The modelling process used in the production of house price data includes an automated assurance process that assesses modelled house prices for property with certain attributes (derived using various sources of input data) against the price for a similar property. If the modelled price is substantially different, for example it exceeds a predefined tolerance, then the price is excluded from the final house price estimate. This equates to around 50 sales per month out of an average file of 70,000 transactions. While these sales are excluded from the average price calculations, they are included in the sales volume figures.
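The tolerance check can be sketched as follows. This is an illustration only: the 50% relative tolerance, the field names and the use of a simple arithmetic mean are assumptions made for the example, not the UK HPI methodology.

```python
from statistics import mean


def split_by_tolerance(sales, tolerance=0.5):
    """Separate sales by whether the modelled price is within tolerance."""
    kept, excluded = [], []
    for sale in sales:
        gap = abs(sale["modelled"] - sale["comparison"]) / sale["comparison"]
        (excluded if gap > tolerance else kept).append(sale)
    return kept, excluded


def summarise(sales, tolerance=0.5):
    """Average price uses only kept sales; volume counts every sale."""
    kept, excluded = split_by_tolerance(sales, tolerance)
    average_price = mean(s["modelled"] for s in kept) if kept else None
    volume = len(kept) + len(excluded)
    return average_price, volume
```

The point of the design is that an out-of-tolerance sale still counts towards sales volume even though it does not influence the average price.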
The UK HPI modelling process used can also account for records where a match cannot be made between the VOA data and price data provided by HM Land Registry. Each attribute used in the hedonic regression model is given a weight that represents the relative importance of the attribute when explaining house prices. If a record used in the model has a missing attribute, then the weight of that record is adjusted downwards to represent the importance of the missing attribute. This process allows the use of more property transaction data in the monthly calculation of average house prices, even if some attribute data is missing.
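One way to picture the downward adjustment is shown below. The source does not give the formula, so this sketch assumes a simple subtractive scheme in which a complete record has weight 1 and each missing attribute removes its relative importance; the attribute names and importance values are hypothetical.

```python
# Hypothetical relative importance of each attribute; values sum to 1.
ATTRIBUTE_IMPORTANCE = {"floor_area": 0.5, "property_type": 0.3, "rooms": 0.2}


def record_weight(record: dict) -> float:
    """Reduce a record's weight by the importance of each missing attribute."""
    missing = sum(
        importance
        for name, importance in ATTRIBUTE_IMPORTANCE.items()
        if record.get(name) is None
    )
    return max(0.0, 1.0 - missing)
```

Under these assumptions a record with no `rooms` value keeps a weight of about 0.8, so it still contributes to the monthly calculation but with less influence than a complete record.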
Prior to the creation of the final reports and the distribution of the calculated data, an automated process compares the file returned from ONS against the previous month’s file to check its validity. All processes thereafter are automated to reduce manual handling errors.
Prior to publication, all reports are uploaded to a central facility which allows all data providers to review and comment on their content. Final versions are uploaded after everyone has commented. These are dated and timed to allow the team to review the final version before publication.
Assessment Rating A2: Enhanced assurance – While much of this is covered in practice area 3, because HM Land Registry produces the UK HPI and is a data provider, an additional layer of assessment takes place. The data is shared with the working group while the UK HPI is being developed through the process and calculation methodology.
5. User engagement
The UK HPI publication and data input is managed by a Product Manager within the Data Group. It is the responsibility of the Product Manager to oversee the process, respond to queries and monitor the feedback received on any of the available offerings.
HM Land Registry’s press office issue a press notice on GOV.UK and by email to press contacts who opt to receive it. They also alert the Data Group when the UK HPI is mentioned in the media through a daily News Roundup email. This allows the team to consider the context and fact check the information.
Dedicated contact points for the UK HPI Working Group members and their area of responsibility are available within the UK HPI reports and on GOV.UK. HM Land Registry is responsible for dealing with daily queries relating to both the England and Wales data and any publication queries.
HM Land Registry’s Press Office also manages media calls, while the Customer Contact Centre and Data Services Team respond to queries on the publication, data and services each month. These teams are kept up-to-date on any enhancements or changes to the UK HPI routinely.
All feedback is logged in a central log against UK HPI; this information is used when planning future enhancements. Some urgent queries are managed by the Product Manager in conjunction with the subject matter expert, who may be one of the other providers. An anonymous feedback process on GOV.UK captures queries and feedback, which is recorded and acted on where appropriate.
As part of the UK HPI development, a series of user events were held in England, Wales, Scotland and Northern Ireland. These events promoted the online user consultation and allowed the UK HPI Working Group to actively engage with users and gather feedback. Key stakeholders and users of the house price statistics attended the events.
Following the first publication of the UK HPI in June 2016, a survey was conducted. The September 2016 survey was used to evaluate the publication and gather feedback on future developments. A summary of responses has been published in relation to this survey.
Key users of the UK HPI engage with both HM Land Registry and the other providers, so their queries are managed appropriately. The Data Group have a good working relationship and ongoing engagement with key users. Following the UK HPI user events, it was clear that users also accessed the raw PPD which feeds into the UK HPI. This data is released under the Open Government Licence and is available on GOV.UK. When the Data Group responds to queries it is important that the services offered by the other UK HPI providers are considered, as users also engage with them.
To ensure users and stakeholders are kept informed of changes to the data, blogs and articles are published. The team will continue to use surveys to gather users’ feedback. In June 2016, a LinkedIn Group dedicated to the UK HPI was launched. More than 500 key stakeholders and users were invited to join. Thirty-two influential members joined in the first month; the number of members continues to grow. The purpose of the LinkedIn forum is:
- to seek and gather ongoing feedback
- to link to important announcements, for example a change to publication dates
- to advertise surveys, consultations, articles and blogs
- to potentially recruit users for testing enhancements before deployment during projects
- to create a distribution channel for publication each month
LinkedIn group rules statement
To request membership click ‘Join’ and your request will be reviewed by the Group Manager. We encourage debate and discussion. By submitting a comment, you agree to abide by the following rules:
- be respectful of others who use this site
- stay on topic
- keep comments concise
- not use language that is offensive
We commit to respond to comments within two working days but reserve the right not to publish any comments that contravene any of these rules. We do not post personal information in comments such as addresses, phone numbers, email addresses or other online contact details, which may relate to individuals.