Guidance

Annex F: analytical principles

Updated 18 December 2023

Applies to England

Following an announcement in the government’s Autumn Statement 2022, the planned adult social care charging reforms are now delayed until October 2025.

Part of the analysis for cost of care exercises will already be done by the data collection tool itself. The tool gathers intermediate data (such as staffing ratios) and applies calculations to give a consistent breakdown of costs per resident per week or per contact hour for each responding care location. The data collection tool should enable the extraction of this breakdown as a single column of data for each responding care location (with each row containing the various cost and data items gathered) to avoid needing to extract individual numbers.

The sections below first discuss approaches to data collation, data cleaning, operating costs, and lastly returns on operations and return on capital, which have a subjective element.

Data collation

The data is suited to collation in the following structure. Each location’s response is contained within one column and each row relates to one of the cost categories in Annex A, Section 3.

Data will likely need to be collated separately for care homes and domiciliary care due to their different cost structures. Using care homes as an example:

£ Resident per week Responding location 1 Responding location 2 Count of responses Lower quartile Median Upper quartile
Nursing staff              
Care staff              
Therapy staff (occupational and physio)              
Activity co-ordinators              
             

Data cleaning

Looking at the distributions of each cost line (the amount per resident per week or per contact hour) and applying quantitative methods such as Tukey’s rule may highlight outliers. While it is important to challenge unusual values with providers and give them an opportunity to explain and correct the data, it may be necessary to remove obviously implausible data from the analysis before calculating the results. It may be possible just to exclude the implausible cost values rather than a location’s entire response. The use of medians as proposed below also helps manage outlying values.

Analysis of the costs per resident per week/per contact hour (excluding Return on Operations and Return on Capital)

The data structure shown above can then enable, for each cost category in Annex A, Section 3 (excluding return on operations and return on capital), the calculation of:

  • the count of responses
  • median costs – medians should be used for this purpose (as opposed to means) because means are more sensitive to unusually low or unusually high costs reported
  • the Lower Quartile cost
  • the Upper Quartile cost, to indicate the extent of variation

It is important to be aware of the difference between zero and missing values when calculating these items, particularly as Microsoft Excel does not have a clear separation[footnote 1]. For example, the ‘nursing costs’ row will not be reported by providers that only provide care home services without nursing; if zeros are included in the calculation of the median, it will bias the calculated median downwards. It is important to ensure that these care homes are excluded from the median calculation, for example by an Excel formula which computes medians whilst ignoring zeros[footnote 2].

Analysis of return on operations and return on capital

These are a subjective judgement for the local authority, though relevant quantitative approaches that could inform this judgement are set out in Annex D. The cost of care exercise tool collects freehold valuations to inform these approaches.

Good practice principles for analysis

Spreadsheets carry a significant risk of calculation errors. Spreadsheets should follow the 20 principles of good spreadsheet practice set out by the Institute of Chartered Accountants in England and Wales. Consultancy work should also abide by these standards.

Spreadsheets should always be verified by someone different from their author. This is a form of ‘analytical proof reading’.

As noted above, it is important to be aware of the difference between cells containing zeros and cells containing missing/unreported data.

Completed data collection templates and subsequent analysis should be filed in such a way that an audit trail is created of how the exercise was carried out and how its final results were arrived at. Analysis should always aim to be reproducible.

  1. Microsoft Excel can inadvertently replace missing values with zeros when using formulas to move data between tabs, so might show a zero rather than missing nurse staffing cost for example. 

  2. See for example this Excel tutorial on median values, which illustrates the formula =MEDIAN(IF(A2:A17<>0,A2:A17)) for data stored in the range A2 to A17. See this Excel tutorial for counts