Independent report

Addressing trust in public sector data use

Published 20 July 2020

Maximising the beneficial use of personal information held in the public sector

1. Key findings

  1. Data is needed not just to develop new technology but also to enable the evaluation of it. The UK will be unable to embrace the opportunities presented by AI unless data held by the government and wider public sector is shared.
  2. The sharing of personal data must be conducted in a way that is trustworthy and aligned with society’s values and people’s expectations. Public consent is crucial to the long-term sustainability of data sharing activity.
  3. Addressing legal and technical barriers to data sharing has been the focus of much recent work. Data protection law provides a framework for data sharing. If consistently interpreted and applied, this may help to build and sustain trust. However, there has been relatively limited effort by the government and wider public sector to address public trust explicitly.
  4. A lot of personal data is shared across and outside the public sector. While this may be for beneficial purposes, public awareness of it is generally low. This gives rise to an environment of ‘tenuous trust’.
  5. Trust can be undermined by the inconsistent interpretation and application of legal mechanisms for data sharing, as well as the adoption of different security and technical standards. This creates a complex and confusing environment which also hinders transparency.
  6. There is also a communication challenge: framing the broader data sharing narrative to articulate how the public sector uses personal data is important. But this must also reflect what is publicly acceptable.
  7. CDEI will explore the subject of trust in further work, with a particular focus on public awareness and acceptability. This could be addressed in part by giving citizens more control over certain data about them. However, trust may also be strengthened by identifying clear conditions under which it is appropriate to share data in the public interest, without explicit user control or consent.
  8. Where data is shared in the public interest, there needs to be greater clarity about how the public interest is defined and judged. An individual’s right to privacy must be weighed against the rights of other citizens and of communities and society more widely.
  9. CDEI will work with partners to articulate the conditions for public interest data sharing. This will include a consideration of the appropriate system level governance structures and the potential role of an independent body to create an anonymised environment or integrated data infrastructure for access to public sector data. The data held could potentially be used to support areas of innovation that may bring significant public benefits.
  10. In its next phase of work, CDEI will collaborate with partners on use cases where there could be particular value to sharing more data in a way that is trustworthy.

2. Areas to explore for further work

CDEI is looking into how to begin to address the issues raised in this report. We plan to explore how we can:

  1. Identify opportunities to give citizens greater access to the data the public sector holds about them, and to support the creation of digital products that operate on shared data. Examples could include:
     a. Creating personal digital records of qualifications for citizens that they can share with prospective employers or educational providers
     b. Sharing medical test result data with patients who would like immediate access and the ability to share with other providers
     c. Supporting the use of mechanisms that allow digital sharing of death records by next of kin to speed up the probate process

  2. Convene relevant stakeholders, including the Office for National Statistics (ONS) and the Information Commissioner’s Office (ICO), as well as civil society organisations and the wider public, to establish clear principles for determining what constitutes public interest in the sharing and use of data, by looking at specific areas including:
     a. Improving healthcare (e.g. allowing research access to prescribing data to assess the risks of opioid overprescribing)
     b. Improving public service delivery
     c. Preventing harm or discrimination in public service delivery (e.g. enabling access to court records to assess the consistency of court judgements)

This work should help to clarify the different public interest considerations when using data to develop and test AI applications, versus the deployment of AI systems in real world situations.

  3. Work with the ONS, the ICO and others to establish clear guidelines on the privacy protection standards to be applied when sharing data for public interest uses. This will include consideration of technical approaches such as differential privacy and homomorphic encryption, along with legal approaches such as audited standards applied to organisations with access to data.

  4. Consider approaches to strengthen democratic accountability, including Parliament’s role in scrutinising the sharing and use of data in the public sector.
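
To illustrate one of the technical approaches named above: differential privacy typically works by adding calibrated random noise to aggregate query results, so that the presence or absence of any one individual cannot be reliably inferred. The sketch below is a minimal illustration using invented records, not a production mechanism:

```python
import math
import random

def dp_count(records, predicate, epsilon=1.0):
    """Differentially private count: true count plus Laplace noise.

    A counting query has sensitivity 1 (adding or removing one person
    changes the count by at most 1), so Laplace noise with scale
    1/epsilon gives epsilon-differential privacy.
    """
    true_count = sum(1 for r in records if predicate(r))
    # Sample Laplace(0, 1/epsilon) via the inverse-CDF method.
    u = random.random() - 0.5
    noise = -(1.0 / epsilon) * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
    return true_count + noise

# Hypothetical prescribing records: (patient_id, drug_class)
records = [("p1", "opioid"), ("p2", "statin"), ("p3", "opioid")]
noisy_total = dp_count(records, lambda r: r[1] == "opioid", epsilon=0.5)
# The published figure is close to the true count but rarely exact,
# which is what protects any single patient.
```

A smaller epsilon means more noise and stronger privacy; choosing it is a policy decision as much as a technical one.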

3. Executive summary

3.1 Rationale

Data sharing is fundamental to effective government and the running of public services. But it is not an end in itself. Data needs to be shared to drive improvements in service delivery and benefit citizens. For this to happen sustainably and effectively, public trust in the way data is shared and used is vital. Without such trust, the government and wider public sector risk losing society’s consent, setting back innovation as well as the smooth running of public services. Maximising the benefits of data driven technology therefore requires a solid foundation of societal approval.

AI and data driven technology offer extraordinary potential to improve decision making and service delivery in the public sector - from improved diagnostics to more efficient infrastructure and personalised public services. This makes effective use of data more important than it has ever been, and requires a step-change in the way data is shared and used. Yet sharing more data also poses risks and challenges to current governance arrangements.

The only way to build trust sustainably is to operate in a trustworthy way. Without adequate safeguards the collection and use of personal data risks changing power relationships between the citizen and the state. Insights derived from big data and the matching of different data sets can also undermine individual privacy or personal autonomy. Trade-offs are required which reflect democratic values, wider public acceptability and a shared vision of a data driven society. CDEI has a key role to play in exploring this challenge and setting out how it can be addressed. This report identifies barriers to data sharing, but focuses on building and sustaining the public trust which is vital if society is to maximise the benefits of data driven technology.

There are many areas where the sharing of anonymised and identifiable personal data by the public sector already improves services, prevents harm, and benefits the public. Over the last 20 years, different governments have adopted various measures to increase data sharing, including creating new legal sharing gateways. However, despite efforts to increase the amount of data sharing across the government, and significant successes in areas like open data, data sharing continues to be challenging and resource-intensive. This report identifies a range of technical, legal and cultural barriers that can inhibit data sharing.

3.2 Barriers to data sharing in the public sector

Technical barriers include limited adoption of common data standards and inconsistent security requirements across the public sector. Such inconsistency can prevent data sharing, or increase the cost and time for organisations to finalise data sharing agreements.

While there are often pre-existing legal gateways for data sharing, underpinned by data protection legislation, there is still considerable legal confusion among public sector bodies wishing to share data. This can cause them to start from scratch when determining legality and to commit significant resources to legal advice. It is not unusual for the development of data sharing agreements to delay the projects for which the data is intended. While the legal scrutiny of data sharing arrangements is an important part of governance, improving the efficiency of these processes - without sacrificing their rigour - would allow data to be shared more quickly and at less expense.

Even when sharing is legal, the permissive nature of many legal gateways means significant cultural and organisational barriers remain. Individual departments and agencies decide whether or not to share the data they hold, and may be overly risk averse. Data sharing may not be prioritised by a department if it would require it to bear costs to deliver benefits that accrue elsewhere (i.e. to those gaining access to the data). Departments sharing data may need to invest significant resources to do so, as well as weighing potential reputational or legal risks. This may hold up progress towards common agreement on data sharing. In the absence of incentives, even relatively small obstacles may mean data sharing is not deemed worthwhile by those who hold the data - despite the fact that other parts of the public sector might benefit significantly.

3.3 Public trust

It is also the case that the benefits of data sharing are not always felt equally among citizens. Indeed, some people believe that the government prioritises using data to increase efficiency and fulfil its own objectives, rather than for the explicit and direct benefit of individual citizens. Barriers to data sharing are reinforced by a lack of public trust and the absence of a developed understanding of public acceptability. While the technical, legal, and cultural aspects need consideration, there is arguably a wider issue at stake. Whereas we have a well-developed understanding of the social contract between the citizen and the state in relation to taxation and public spending, the same is not true of data. Given that data is perhaps as important to the functioning of the state as money, and that its significance is increasing, we argue it is time to consider a clear social contract between citizen and state over how data is shared and used.

Establishing legality and adhering to security standards are fundamental to trustworthy data sharing - and are components of the existing data protection framework. But many of those involved in data sharing projects are conscious that the ethics are not straightforward and that public consent is far from certain. Survey evidence suggests a significant proportion of the population (between 40% and 60% of people) believes that the government’s use of data is not serving their interests.[footnote 1], [footnote 2]

Such uncertainty may mean that potentially valuable projects do not proceed, as the ‘rules of the game’ are unclear. In other cases, data is shared, but with little public awareness. This risks damaging trust further if such uses of data become widely publicised, particularly given evidence that there is deep-seated public distrust of governmental data-use.[footnote 3]

This report identifies an environment of tenuous trust, in which data may be shared for valuable purposes but communication with the public aims primarily at limiting potential negative reactions rather than at active positive engagement. A lack of understanding and debate around what is publicly acceptable when it comes to data sharing and use may also create perverse incentives for departments to be less transparent about their work.

Such a challenge is not unique to the UK Government, and private organisations are having to consider how to maximise the value of data in a way that is trustworthy and practical. The UK has already demonstrated leadership in the area of data sharing, particularly with regard to open data and the publication of public data sets to support innovation.[footnote 4]

Nevertheless, the government and wider public sector will be unable to deliver the standard of services that citizens are entitled to expect if they lack societal consent to share data as effectively as is now required. The development and use of data driven technology to serve the public good is dependent on access to high quality data. CDEI supports a number of government initiatives already underway, including the Office for National Statistics (ONS) Secure Research Service and its accreditation regime, and the ICO’s Data Sharing Code. However, we believe that substantial additional investment is required. This could include a cost recovery model to address the financial disincentives of data sharing. Such a model would compensate those departments committing resources to sharing data for projects led elsewhere in the government.

While progress has been made on technical, legal, and cultural barriers, few of these measures (particularly outside of the health sector) are focused explicitly on addressing public trust. Inconsistent approaches to addressing barriers may also undermine trust by creating a complex environment with limited transparency, reflecting a lack of consensus around what constitutes safe public interest data sharing. There is a risk that this will stand in the way of future innovation.

3.4 Next steps

Efforts to address the issue of public trust directly will have only limited success if they rely on the well-trodden path of developing high-level governance principles and extolling the benefits of successful initiatives. While principles and promotion of the societal benefits are necessary, a trusted and trustworthy approach needs to be built on stronger foundations. Indeed, even in terms of communication there is a wider challenge around reflecting public acceptability and highlighting the potential value of data sharing in specific contexts.

CDEI intends to focus on two areas outlined below.

Promoting citizen-driven data uses

Scepticism about whether data is used for citizens’ own benefit undermines public trust in data use and contributes to a feeling of powerlessness.[footnote 5]

Government data sharing is often used to inform decisions about citizens without their involvement.

To address this, we will explore areas where citizens can feasibly have more control over how their data is shared and used. Initiatives like Open Banking have attempted to enable citizens to take more control over their data, while also encouraging innovation within the sector. CDEI will endeavour to learn from this and consider contexts where data portability and data mobility in the public sector should be given higher priority.

An important consideration is how likely citizens are to make use of greater control over data about them. As such, compelling use cases would need to be developed. There may also be scope for individuals to enrol with third-party intermediaries who would be given permission to take particular decisions on their behalf about how data about them is shared and used.

Setting out the conditions for public interest data sharing

While explicit individual consent is often assumed to be the ideal basis for data sharing, there are a number of contexts in the public sector where this is not practical or desirable. These include, for instance, collecting administrative data from public services, or actively identifying individuals eligible for particular support. Even when it is possible, individual consent in this context often cannot be considered freely given, because citizens may not have a meaningful choice if, for example, they need to access essential public services.

By working with public sector partners seeking to share data, and with civil society organisations and the public, CDEI will endeavour to articulate conditions under which data can, or should, be shared in the public interest while maintaining trust. In many cases it is not clear who assesses the balance of benefits and harms in respect of sharing particular data sets. This includes weighing an individual’s right to privacy against the rights of other people, communities, or society at large. There are also cases where the boundary between using data to research policy issues and using it to intervene actively is unclear. For instance, a research project might identify characteristics likely to be shared by vulnerable people in a specific context, which may be uncontroversial. However, using this research to identify individuals as likely to be vulnerable, and then intervening on their behalf, raises different ethical issues.

CDEI will consider whether existing safeguards adequately ensure such data sharing can be considered trustworthy, and aim to develop a more consistent framework to address the current complex and uncertain environment. Such conditions would need to consider the protection of individual citizens from privacy invasion, protection of vulnerable groups, and measures of transparency and public engagement. A clear framework consistently applied would help to protect the dignity and privacy of individuals and build public trust, while also supporting the wider public interest.

We recognise that existing data protection legislation and the ICO’s forthcoming Data Sharing Code include processes such as Data Protection Impact Assessments, designed to support the identification of potential risks and harms. Indeed, it is unlikely that CDEI will recommend replacing existing mechanisms for data sharing. However, we are interested in exploring whether it is possible to define a public interest approach to data sharing with clear criteria regarding public benefit and protection of individual rights.

4. Glossary

Aggregation: Combining data about individuals so that analysis is carried out on groups of individuals or whole populations rather than on any one individual, allowing trends to be identified while protecting individual privacy.
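
As a brief, hypothetical illustration of the idea (all records below are invented), aggregation reports group-level figures in place of individual rows:

```python
from collections import Counter

# Invented individual-level records
records = [
    {"area": "North", "condition": "flu"},
    {"area": "North", "condition": "flu"},
    {"area": "South", "condition": "flu"},
]

# Aggregate: counts per area replace the individual rows, so an analyst
# sees a trend ("more flu cases in the North") rather than any one person.
cases_by_area = Counter(r["area"] for r in records)
# Counter({'North': 2, 'South': 1})
```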

Anonymisation: Aggregating or transforming personal data so that it can no longer be related back to a given individual. The term is often incorrectly applied to data which has only been de-identified.

Data-driven technology: Technologies that depend on the availability of data. The term includes techniques such as machine learning, as well as other data analytics methods.

Data linking: Joining two datasets together through shared or inferred characteristics. Data linking can sometimes be used to re-identify a de-identified dataset.
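
A brief, hypothetical sketch of the re-identification risk mentioned above (all records invented): joining a de-identified dataset to an identified one on shared "quasi-identifiers" can restore the link to a named individual.

```python
# A "de-identified" health dataset: names removed, but quasi-identifiers
# (postcode and birth year) remain.
health = [
    {"postcode": "SW1A 1AA", "birth_year": 1970, "diagnosis": "asthma"},
    {"postcode": "CF10 3NQ", "birth_year": 1985, "diagnosis": "diabetes"},
]

# A separate, identified dataset (e.g. a public register).
register = [
    {"name": "A. Example", "postcode": "SW1A 1AA", "birth_year": 1970},
]

def link(a, b, keys=("postcode", "birth_year")):
    """Join two datasets on shared characteristics (simple nested-loop join)."""
    return [
        {**ra, **rb}
        for ra in a for rb in b
        if all(ra[k] == rb[k] for k in keys)
    ]

linked = link(health, register)
# If the quasi-identifiers are unique enough, the diagnosis is now
# attached to a name: the dataset has been re-identified.
```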

Data minimisation: The principle that one should collect and retain only the personal data that is necessary.

Data mobility: How easily data can move from one service to another.

De-identification: Removing the identifying characteristics from a dataset (Name, Date of Birth, etc.). It may still be possible for the data to be re-identified and related back to an individual by linking it with other datasets.

Interoperability: The technical ability of services to work together as a single system, with data moving seamlessly between them. This goes beyond portability to look at access to key shared infrastructure, standardised data formats, and secure transfer mechanisms. Interoperability maximises data mobility; however, it is often technically complex to achieve.

Personal data: Information that relates to an identified or identifiable living individual.[footnote 6]

Portability: The data mobility right, laid out in the GDPR, to access data about yourself in a way that could be used by another organisation. The transfer is usually done manually by downloading the data, but where technically feasible you have the right to ask the organisation to transfer it for you.

Pseudonymisation: Data which is pseudonymised has been de-identified while maintaining a unique ID to enable linking across data-sets and re-identification when necessary. (Note that the GDPR defines pseudonymisation to refer to all de-identified data, regardless of whether a unique ID is added).[footnote 7]
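
As a hypothetical sketch of the distinction (the key and records below are invented): pseudonymisation replaces a direct identifier with a consistent token, so records can still be linked across datasets, whereas plain de-identification simply removes the identifier.

```python
import hashlib
import hmac

# Illustrative only: in practice the key would be held securely by the
# accredited processor, never published.
SECRET_KEY = b"hypothetical-key-held-by-the-processor"

def pseudonymise(record, id_field="nhs_number"):
    """Replace a direct identifier with a keyed hash.

    The same input always maps to the same pseudonym, so records can
    still be linked across datasets; without the key, the mapping
    cannot easily be reversed or reproduced.
    """
    out = dict(record)
    digest = hmac.new(SECRET_KEY, out.pop(id_field).encode(), hashlib.sha256)
    out["pseudo_id"] = digest.hexdigest()[:16]
    return out

a = pseudonymise({"nhs_number": "943 476 5919", "prescription": "drug A"})
b = pseudonymise({"nhs_number": "943 476 5919", "test_result": "negative"})
# a["pseudo_id"] == b["pseudo_id"]: the two records can be linked
# without exposing the NHS number itself.
```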

5. Introduction & Context

5.1 CDEI and Data Sharing

The purpose of CDEI, as set out in its Terms of Reference, is to identify how society can maximise the benefits from the safe and ethical use of data-driven technologies. Part of this task includes identifying and assessing effective and ethical frameworks for data sharing. This first paper on data sharing focuses on the flow of personal data held in the public sector.

Personal data held by the public sector presents significant potential value to individuals and society. The sharing of it is essential for the delivery of services and policy development across the public sector. Citizens rightly expect to be able to benefit from the availability of their information within public services to improve efficiency and coordination - for example when moving between NHS providers. Yet survey evidence also suggests that the public has low levels of trust in the government’s ability to use data in ways that will help them.

Data sharing is central both to innovation and to the ethics of data driven technology. Access to data is necessary for the development of new data driven technology as well as assessing the impact of these systems and holding those who are responsible for them to account. Part of CDEI’s role includes identifying how data can be shared in ways that allow innovation while increasing accountability and ensuring there are adequate safeguards.

Public sector organisations hold many of the most sensitive types of data, which have usually been collected for a particular purpose. While sharing it and combining it with other data sets may be of value, it is important to understand the potential trade-offs with individual privacy and autonomy as well as what the public might consider acceptable. The sharing of such data therefore presents particular challenges. By exploring these challenges, CDEI will develop advice designed to help those working with data to address them. Our aim is to promote more trustworthy sharing that supports innovation while also ensuring that data held in the public sector can benefit individuals.

This report explores recent efforts to increase data sharing across the government and wider public sector, maps the current technical and legal environment for data sharing, and focuses on a series of case studies to understand how common barriers to data sharing can be overcome. It then focuses on the reasons for notably low public trust in public sector use of data and sets out how a more trustworthy data sharing environment may be established. The paper concludes by setting out two areas for further work to address the issue of trust.

5.2 Personal Data in the Public Sector

The public sector is a federated group of government departments and organisations that struggles to work together as efficiently as citizens expect. Personal data is split between a large number of databases, controlled by different authorities with different access and sharing policies. Departments tend to hold only the limited data they need to deliver particular services and rarely build systems that can be accessed by other departments.

This data is valuable, not just in an economic sense, but also because it can unlock novel capabilities for public services. There are good reasons to avoid a single vault holding the data on all citizens. The friction this produces legitimately limits the power of the state over its citizens, helps account for the very different contexts in which data is collected, and minimises the impact of data breaches. However there are areas across the public sector where sharing useful data could enable bodies to deliver better services or design new policies.

A complex and inconsistent system for sharing can adversely affect the quality of service received by citizens. Individuals must often navigate various bureaucracies to access their information from one public sector body and provide it to another. This process frustrates citizens, creates additional opportunities for error, and means they are expected to share sensitive information in relatively insecure formats (e.g. by post).[footnote 8]

Case Study - Blue Badge Parking Permit

Blue Badges are distributed by local councils and allow people with mobility issues access to disabled parking spaces closer to their destination. The simplest way to get a badge is to provide the local council with evidence of disability benefits based on impaired mobility from the DWP. If this is not possible, the applicant must provide written evidence explaining their requirements, including any health treatments they are undergoing, medications they take, or doctors who have assessed their condition. The application is a lengthy process, and councils can take 12 weeks or longer to reach a decision. It also makes the disabled individual responsible for coordinating between their local council, the DWP and their doctor to get the accommodations they need, and may be too burdensome for many people who could benefit from the scheme. The development of an attribute exchange, which would enable the verification of eligibility without sharing personal data, could make the application process more efficient. This has been piloted by some local authorities.[footnote 9]

An inconsistent and complex data infrastructure also makes it difficult to hold the government to account, both for policy programmes, and for the use of data. The keeping of data in silos may, for example, mean that opportunities to understand how particular policies affect people and groups are missed.[footnote 10]

Data is shared for a number of reasons, and each project has its own benefits and risks. It is also likely that the public view of what is acceptable and trustworthy may differ depending on the use case. The reasons for data sharing are fluid and linked. Academic research using data may, for example, point to new innovations, but then also be used to support service delivery.

The most common reasons for data sharing in the public sector include:

Provision of public services to individuals

To improve the effectiveness and efficiency of delivering services to the public, it is often necessary for different parts of the public sector to share information about an individual’s circumstances. This includes information that is essential to provide a service (e.g. sharing medical records between different NHS providers), and determining the eligibility for particular benefits (such as the Warm Home Discount).[footnote 11]

Law enforcement and community protection

In some cases, personal data about an individual is shared between public sector organisations to police the behaviour of that individual. For example, it may be possible to find information that indicates tax evasion or benefit fraud, or identify people who are using government services but do not have the right to remain in the country. Data is also shared when it is relevant to a criminal prosecution. Social services are expected to share information that relates to risks to children, and health services are required to share information in relation to communicable diseases.

Planning, managing and regulating public services and national infrastructure

Public services are delivered by a mixed economy of public and private sector organisations. For those, such as Clinical Commissioning Groups in the NHS or local authorities, which have responsibility for budgeting, commissioning or overseeing the delivery of services, access to information about the population served is an important resource.

The same is true of regulated national infrastructure such as the energy grid and the transport network, where effective regulation, oversight and planning require flows of data between different entities, including companies providing services, and public sector organisations with responsibility for the safe and effective provision of such services.

Use case (managing public services): The sharing of prescription data across the NHS

Data contained in patients’ prescriptions is shared across the NHS and also used in aggregated form to produce statistical reports. Organisations receiving prescription data include GP practices, pharmacies and NHS prescription payment and fraud agencies. Pharmacists review prescriptions issued by GPs to dispense the correct medication. Pharmacies share the prescriptions with NHS Prescription Services[footnote 12], part of the NHS Business Services Authority (BSA), via the prescribing and dispensing information systems. NHS Prescription Services uses the data in individual prescriptions to calculate the remuneration and reimbursement due to dispensing contractors across England. BSA also aggregates the data to help different parts of the NHS track trends and take informed decisions. This includes providing data for:

  • performance management
  • financial planning
  • following clinical best practice
  • complying with regulations
  • identifying outlier behaviour
  • publishing national statistics

Developing new policies

Sharing personal data can enable innovation that helps drive simpler and more efficient public services by finding new ways to address different policy goals - for instance, exploring the relationship between school records and later criminal convictions in order to work out where early intervention could have the most impact. While such analysis would typically be conducted on anonymised data sets, identifiable data often needs to be shared so that records can be accurately matched.

Monitoring

Traditional research tends to depend on reviewed data sets which are shared with researchers through a defined request and approvals process. Access to real-time data, or regular data feeds, can be more challenging, as the routes to accessing such data are less well defined. In part this may be because of technological barriers, but governance is also harder because the data cannot be reviewed and approved before being released. This matters because it affects the ability of third parties to evaluate the effectiveness of particular initiatives without relying on historic data.

Evaluating existing policies

Analysis of data about populations is essential to understand whether or not government policy is working. In many cases the relevant information will not be held by the organisation responsible for a policy. For example, to analyse whether training or rehabilitation policies are resulting in people gaining employment, data will need to be collected from different agencies as well as potentially from private sector providers contracted to deliver a service.

Sharing data enables departments to understand the long-term effects of their policies across a number of factors in an individual’s life and build a full picture of the benefits and costs. Even when the final analysis can be done using de-identified data, linking data across departments often requires the sharing of original identifiable data. It also allows other bodies to assess the effectiveness and fairness of existing initiatives, and hold departments to account for their work.

Research

Independent researchers (e.g. academics and policy institutes) may submit research proposals to access data to inform their work, either on public policy or other social science research. Research projects may rely upon single datasets but in many instances different datasets are linked and de-identified.

Research and the Digital Economy Act

Part 5 of the Digital Economy Act included new legal powers to provide the UK Statistics Authority (and ONS as its executive office) with better access to data to support the production of official and national statistics, and statistical research; and to provide accredited researchers with better access to de-identified public sector data to support research projects for the public good.

The Act facilitates the linking and sharing of de-identified data by public authorities for accredited research purposes (except for health and social care bodies). The UK Statistics Authority is the statutory accrediting body responsible for the accreditation of processors (those who de-identify the data and provide secure access to the de-identified data), researchers and their projects.

Before data can be shared for research purposes, it must be processed by an accredited processor so that it is ‘de-identified’. Once de-identified, the data can be made available to an accredited researcher in a secure environment for research in the public good. The processor will ensure that any data (or any analysis based on the data) retained by the researcher, or published, is ‘disclosure controlled’ to minimise the risk of data subjects being re-identified or other misuse of the data.

As the statutory accrediting body, the UK Statistics Authority has established a Research Accreditation Panel to oversee the independent accreditation of processors, researchers and research projects.

To date, there are approximately 3,000 accredited researchers, with accredited processing environments across the country, including the ONS Secure Research Service (SRS) in London, Newport and Titchfield, and the NISRA Research Support Unit in Belfast.

5.3 Common barriers to data sharing

Over the past 20 years, governments have set out a series of policy initiatives to increase data sharing. In the context of an increasingly data driven world, the impact of these policies has been insufficient.

The focus of recent reports on this issue has often been on the barriers government faces when it tries to share data, leading to missed opportunities. The think tank Reform, the National Audit Office, and the Public Accounts Committee have all identified similar categories of barriers which continue to stifle effective use of data.[footnote 13]

The clearest obstacle to data sharing is the set of complex technical constraints limiting the ability of departments to coordinate. Many elements of the public sector rely on their own legacy IT systems and poor quality data which also make it difficult to conform to common standards. In addition, public sector organisations set their own security requirements for data sharing and use different approaches to enabling access to third parties.

Perceived legal challenges are often highlighted as barriers to data sharing. There have been many initiatives to create specific legal gateways for sharing data which sit on top of data protection legislation (see below). The result is a complex and confusing environment. Uncertainty about the legal bases for data sharing and the absence of common terms and conditions mean that bodies attempting to share data tend to start from scratch when determining legality. Different data sets are also governed by different legislation and such an absence of consistency can create further confusion.

The need to identify the appropriate legal gateway and follow particular processes and approaches may introduce legitimate friction in the system which acts as a safeguard against misuse of data. However, in many cases delays may also occur as a result of a lack of clear business processes or the need to secure the resources needed to procure legal expertise.

Existing data protection law may provide a framework to enable trustworthy data sharing. However, there are concerns that misunderstandings relating to GDPR and what it permits may have created further aversion to data sharing. The Information Commissioner’s Office’s updated Data Sharing Code of Practice (the draft Code was published in 2019) will seek to address this.[footnote 14]

Related to this are issues of terminology. GDPR and the Data Protection Act 2018 use clearly defined terms, particularly with regard to anonymous information and pseudonymisation. However, such terms are not well understood by citizens, and other public bodies continue to use different terms, leading to further inconsistency.[footnote 15]

While the ICO’s Data Sharing Code will help, there likely remains a considerable communications challenge.

The identified barriers to data sharing can be exacerbated by the prevalence of a siloed and risk averse culture in the public sector. In some cases data sharing may raise potential reputational risks or other threats causing a conflict of interest. In other instances, a lack of certainty about the legality, security and ethics of data sharing may result in the outright rejection of data sharing requests or significant delay in establishing the terms of data sharing. Departments are generally focused on the services they provide directly and there is little incentive to share data in support of a wider government agenda or benefit citizens in other areas. Such a culture may have emerged to protect citizens’ privacy and reduce the risks of data misuse. However, this may cause potential benefits and opportunities to be missed, or not adequately evaluated.

Some departments expressed concern about the safety of their data if they share it with others, especially if they cannot confirm the security arrangements in other organisations. While understandable and right, this can discourage opportunities to use data to its full potential.

Source: Challenges in using data across government, National Audit Office (2019)[footnote 16]

In addition to the barriers detailed above, there is a lack of incentives to encourage public sector organisations to share data for ethical and valuable purposes. While sharing data may bring benefits to citizens and different parts of government, it does not follow that those holding the data view it as worthwhile. Even relatively small obstacles may stop data sharing from happening, with the costs viewed as outweighing any potential benefits. Relevant legislation, e.g. the Digital Economy Act, is largely permissive, meaning organisations can refuse to share data without risk of penalty.

These barriers require time and money to overcome, and a lack of central leadership exacerbates the problem: in 2019 the National Audit Office concluded that data is generally not treated as a strategic asset by government departments, and data projects are rarely prioritised in funding decisions.[footnote 17]

This is in part because the existing costs of poor data are hidden and the benefits only appear in the long-term. Furthermore, departments will often collect data with a particular purpose in mind with little consideration given to the value of potential secondary uses. This can increase costs as additional work needs to be undertaken to make the data suitable for sharing.

Although these barriers are important, our analysis of data sharing projects highlights the more fundamental issue of the public’s lack of trust in government use of data. It is likely that public trust varies depending on the type and context of the data use - and may be higher, for example, in relation to aggregated data used for statistical purposes, than identifiable data being shared. However understanding what is deemed trustworthy and what is publicly acceptable is not straightforward.

Surveys suggest that the public assume much more data is shared than takes place in reality. Yet much of the data sharing that does happen takes place without the knowledge of the citizens whose information is being used. There is a long history of data sharing projects becoming public and triggering a backlash. Organisations agreeing to share data risk coming under pressure from the media and civil society organisations. This may cause government departments to limit their external communication to avoid further weakening trust. We explore public trust in chapter 5.

6. A brief history of government data sharing initiatives

To understand how these barriers have developed and the steps taken to try to overcome them, it is useful to look at the history of this policy debate. The drive for more data sharing, alongside concerns about privacy, has a long history. The timeline below sets out some of the key milestones and highlights an absence of an explicit focus on public trust.

Timeline Diagram

Modernising Government White Paper (1999)

  • This paper on public sector reform saw effective data sharing as part of creating a digital government
  • It pointed out some technical, legal and cultural barriers to overcome, while highlighting the need for privacy and transparency

Privacy and Data Sharing (2002)

  • This was a full report on data sharing as part of the Modernising Government agenda
  • It proposed new data sharing legislation to simplify the legal environment
  • It also proposed a new Public Services Trust Charter for trustworthy public bodies, with accompanying kitemark

Department for Constitutional Affairs (2003-4)

  • The DCA explored the need for the proposed legislation, but found it unnecessary
  • Instead, they developed a code of practice and provided legal guidance

Transformational Government (2005)

  • An agenda owned by HM Treasury to pursue shared IT services across government
  • It set up a Data Sharing Ministerial Committee which controversially claimed “information will normally be shared…provided it is in the public interest”

Data Sharing Review (2008)

  • An independent review called in the light of controversial centralised IT projects such as the National Identity Register
  • Similar to the DCA analysis, it found no legal barriers but much legal confusion, cultural barriers, and a lack of trust
  • It called for a statutory Data Sharing Code of Practice from the ICO, and more Privacy Impact assessments
  • It also called for a fast-track data sharing power, allowing Secretaries of State to amend any legislation or add a power using secondary legislation to enable data sharing
  • It pointed to the need for ‘safe havens’ for academic research, although the government highlighted that systems such as the ONS Secure Data Service were already in place

Coroners and Justice Act (2009)

  • The recommendations from the Data Sharing Review were originally included in this Act; however, the broad fast-track power attracted strong opposition due to a lack of safeguards
  • The only measure which was enacted was the requirement for a Data Sharing Code of Practice from the ICO, introduced in 2011

Government Digital Service (2010)

  • The new centralised IT service sought to overcome technical barriers to data sharing by setting data standards for procurement
  • Rather than centralised databases on citizens, there is a move towards alternative distributed solutions for linking data such as the Verify initiative

Data Sharing Code of Practice (2011)

  • Published by the ICO

Open Data White Paper (2012)

  • This paper moved the focus of data policy from creating joined-up government to enabling innovation from publicly held datasets
  • The goal was to increase citizen choice, drive innovation, and enable external research

Improving Access for Research and Policy (2012)

  • A report by the Administrative Data Taskforce looking at opening government administrative data for external researchers
  • Proposed more use of safe havens and a new gateway allowing data to be linked for research

Law Commission report (2014)

  • An investigation into whether the legal framework was inhibiting data sharing by public bodies
  • Found no major legislative obstacles, but a lot of confusion based on the complex statutory gateways that impeded transparency and led to differing interpretations.
  • Called for reform with fewer gateways based on principles, rather than projects.

Data Sharing discussion (2014)

  • An “open policy making” process where potential new data sharing legislation was published and discussed with civil society groups
  • The new legislation focused on research, tailored public services, and managing fraud, error and debt
  • The engagement led to a number of new safeguards to ensure the sharing would be beneficial and not punitive to the individual

Better Use of Data (2016)

  • A public consultation on the final proposal from this policy-making process
  • The consultation pointed to the need for more transparency, accountability, and accessibility to individuals

Digital Economy Act (2017)

  • The powers within Part 5 of the Digital Economy Act are designed to help overcome legislative barriers to data sharing.

Data Protection Act (2018)

  • The Data Protection Act 2018 (DPA 2018) sits alongside the EU General Data Protection Regulation (GDPR) and sets out the framework for data protection in the UK.
  • The ICO is obliged by the Data Protection Act 2018 to produce the data sharing code.
  • The ICO is expected to publish a new Data Sharing Code of Practice in 2020.

National Data Strategy (2020)

  • Development of the Strategy is underway and led by DCMS to improve access, efficiency and trust in private and public data use.

7. The data sharing environment

This chapter sets out the current context for the sharing of data held in the public sector and considers the existing legal, technical and cultural factors. These are also linked to developing a trustworthy environment for data sharing.

7.1 Promoting safe and ethical data sharing

There have been efforts to promote consistent approaches to data sharing that also aim to support data controllers to work in ways that are safe and ethical.

Five Safes

A number of past reports into data sharing have called for ‘data safe havens’ to be set up to provide security when analysing data.[footnote 18]

Administrative Data Taskforce, Improving Access for Research and Policy (2012)

The gold standard in this regard is the Office for National Statistics (ONS) and their Secure Research Service. The ONS has promoted and led the way by implementing the Five Safes principles:

  • Safe people: researchers must be experienced, accredited and they must sign a confidentiality contract
  • Safe projects: every project must be in the public benefit, must be approved, and the results publicly available
  • Safe settings: data is only accessible in secure environments
  • Safe data: the data is de-identified as much as possible
  • Safe outputs: the outputs of analysis must not identify any individuals

The principle of ‘safe settings’ is one of the most important due to the ease of copying and transferring sensitive data for another purpose, but it is also the hardest to enforce. Many data controllers therefore establish their own facilities for researchers to use, where they are required to access the data through managed equipment and their activity is closely monitored. There are also secure environments at HMRC, and in Scotland, where the environment includes data from NHS Scotland.

Over time, the Five Safes is set to provide increased consistency for researchers accessing sensitive government data.

Protecting Individuals’ Identities

Anonymisation is an important principle often invoked in justifying data sharing, particularly when looking to share data for research or evaluation purposes with academic or private sector organisations. However anonymisation cannot be achieved simply by removing identifiers (de-identified data) since any data that includes information about individuals raises a possibility of ‘jigsaw’ re-identification, in which different sets of information are cross-checked allowing information about an individual to be discovered (often termed the ‘mosaic effect’).

Relying solely on de-identifying data may provide individuals with a false sense of security when reality falls short of their expectations of privacy. Techniques to protect people’s identity rely on a combination of information reduction (de-identification and obfuscation), secure controls over access to data, and legal and contractual obligations on organisations accessing data.
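
The ‘jigsaw’ risk described above can be made concrete with a toy sketch. Assuming a hypothetical de-identified health dataset and a separate public register that share the same quasi-identifiers (all names, fields and values here are invented for illustration), cross-checking the two can uniquely link an ‘anonymous’ record back to a named individual:

```python
# Illustrative sketch of 'jigsaw' re-identification. All names, fields and
# values below are invented for demonstration only.

# A "de-identified" dataset: direct identifiers removed, quasi-identifiers remain.
deidentified_records = [
    {"postcode_district": "SW1A", "birth_year": 1962, "sex": "F", "condition": "diabetes"},
    {"postcode_district": "SW1A", "birth_year": 1985, "sex": "M", "condition": "asthma"},
    {"postcode_district": "CF10", "birth_year": 1962, "sex": "F", "condition": "hypertension"},
]

# A separate public dataset containing names alongside the same quasi-identifiers.
public_register = [
    {"name": "A. Example", "postcode_district": "SW1A", "birth_year": 1962, "sex": "F"},
    {"name": "B. Example", "postcode_district": "CF10", "birth_year": 1990, "sex": "M"},
]

QUASI_IDENTIFIERS = ("postcode_district", "birth_year", "sex")

def reidentify(records, register):
    """Cross-check the two datasets; a unique match re-identifies an individual."""
    matches = []
    for person in register:
        key = tuple(person[q] for q in QUASI_IDENTIFIERS)
        hits = [r for r in records if tuple(r[q] for q in QUASI_IDENTIFIERS) == key]
        if len(hits) == 1:  # a unique combination links the record to a name
            matches.append((person["name"], hits[0]["condition"]))
    return matches

# The 'anonymous' record for (SW1A, 1962, F) is unique, so it links to a name.
print(reidentify(deidentified_records, public_register))
```

In real datasets the same effect arises at scale: a small number of quasi-identifiers (postcode, birth date, sex) is often enough to make most records unique, which is why information reduction alone is insufficient.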

Attribute exchange

There are a number of data sharing use-cases where it would be simpler and more secure to share a response to a query rather than the full individual data. GOV.UK Verify is an example of this mechanism, as identity verification is done by a third party with access to multiple sources of identity data, and only the result is passed on to the organisation which requires the verification - minimising the exchange of sensitive data.
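
A minimal sketch of this pattern follows, with invented data and function names (this is not GOV.UK Verify’s actual API): a trusted verifier holds the identity data and returns only the boolean answer to a query such as “is this person over 18?”, so the relying service never sees the underlying date of birth:

```python
# Sketch of 'attribute exchange': the verifier answers a yes/no query rather
# than releasing the underlying record. All data and names are hypothetical.
from datetime import date

# Identity data held only by the trusted verifier, never by the relying service.
_identity_store = {
    "user-123": {"name": "A. Example", "date_of_birth": date(1990, 6, 1)},
}

def is_over(user_id: str, years: int, today: date) -> bool:
    """Return only the answer to the query, not the date of birth itself."""
    dob = _identity_store[user_id]["date_of_birth"]
    # Compute completed years of age, accounting for whether the birthday
    # has occurred yet this year.
    age = today.year - dob.year - ((today.month, today.day) < (dob.month, dob.day))
    return age >= years

# The relying service learns a single boolean, minimising the data exchanged.
print(is_over("user-123", 18, date(2020, 7, 20)))  # True
```

The design choice here is data minimisation: the query interface defines exactly what the receiving organisation can learn, which is easier to audit than a bulk transfer of records.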

Privacy Enhancing Technologies (PETs)

There are also a number of more sophisticated emerging technologies which can better protect the privacy and security of different data sharing approaches. These limit access to individual data, either by transforming it, encrypting it, or storing it on a different system, while still enabling analysis.

Differentially private algorithms are one example, designed so that calculated statistics do not give more information about a particular individual than if that individual had not been included in the dataset. Federated learning is another mechanism, which allows a central authority to create machine learning models without collecting individuals’ data in a central location. A third technology is homomorphic encryption, which allows computations to be performed on encrypted data. Finally, trusted execution environments allow code and data to be isolated from the rest of the software running on a system, in order to ensure their confidentiality.
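
As a concrete illustration of the first of these, differential privacy can be sketched with the classic Laplace mechanism: a count is published with calibrated random noise so the output reveals little about any one individual. This is a minimal sketch with illustrative parameter choices (epsilon, sensitivity), not a production implementation:

```python
# Minimal sketch of a differentially private count via the Laplace mechanism.
# Parameter values (epsilon, sensitivity) are illustrative assumptions.
import math
import random

def dp_count(records, predicate, epsilon=1.0, sensitivity=1.0, rng=random):
    """Return the true count plus Laplace(sensitivity/epsilon) noise.

    Adding or removing one individual changes the true count by at most
    `sensitivity`, so noise at scale sensitivity/epsilon masks any single
    person's contribution to the published statistic.
    """
    true_count = sum(1 for r in records if predicate(r))
    # Inverse-transform sample from Laplace(0, scale), u uniform on (-0.5, 0.5).
    u = rng.random() - 0.5
    scale = sensitivity / epsilon
    noise = -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
    return true_count + noise

# Hypothetical records: the released statistic carries noise, not the exact count.
records = [{"receives_benefit": True}] * 40 + [{"receives_benefit": False}] * 60
noisy = dp_count(records, lambda r: r["receives_benefit"], epsilon=1.0)
```

A smaller epsilon means more noise and stronger privacy; repeated queries consume the privacy budget, which is why real deployments track cumulative epsilon across releases.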

Such technologies to protect individual data have many potential benefits, although the large number of alternatives could easily cause confusion about the right approach and make collaboration more difficult. The Royal Society has called for the Government to both fund the development of PETs, and become an early adopter in order to provide support and expertise to wider society.[footnote 19]

However, all these techniques for maximising individual privacy and data security only go so far. There are further questions about privacy in general that need to be asked, especially about what uses of data are acceptable and how decisions to share data are made.

Development of Ethical Frameworks

In June 2018 the UK Government published a Data Ethics framework setting out the principles for ethical use of data by public services. Principle number 6 is to “Make your work transparent and accountable”. This principle stresses the value of sharing algorithms and data in order to allow work to be understood, reviewed and challenged. There is little evidence that the framework is yet having a significant practical impact although it is set to be refreshed and promoted by DCMS.

UK Statistics Authority: Data Ethics Committee

The National Statistician’s Data Ethics Advisory Committee (NSDEC) was established to advise the National Statistician whether the access, use and sharing of public data, for research and statistical purposes, is ethical and for the public good. It provides a good example of the way in which ethical principles can be practically applied to decisions regarding the sharing and use of data.

NSDEC considers projects and policy proposals which make use of innovative and novel data from the Office for National Statistics (ONS) and the Government Statistical Service (GSS), and advises the National Statistician on their ethical appropriateness. The Committee must include at least five independent external members. To make decisions, the Committee assesses proposals against the NSDEC’s ethical principles:[footnote 20]

  1. The use of data has clear benefits for users and serves the public good.
  2. The data subject’s identity (whether person or organisation) is protected, information is kept confidential and secure, and the issue of consent is considered appropriately.
  3. The risks and limits of new technologies are considered and there is sufficient human oversight so that methods employed are consistent with recognised standards of integrity and quality.
  4. Data used and methods employed are consistent with legal requirements such as Data Protection Legislation, the Human Rights Act 1998, the Statistics and Registration Service Act 2007 and the common law duty of confidence.
  5. The views of the public are considered in light of the data used and the perceived benefits of the research.
  6. The access, use and sharing of data is transparent, and is communicated clearly and accessibly to the public.

National Data Guardian for Health and Social Care

The NHS is home to one of the richest data sources in the UK. The insights derived from it have the potential to drive innovation and improve health care, for instance through precision medicine, genomics, or medical imaging.[footnote 21]

However health data is highly identifiable and generally viewed as being intensely personal, creating unique ethical concerns. For this reason, the NHS has been at the centre of the public sector data use debate for decades; the first review of Patient-Identifiable Information was led by Dame Fiona Caldicott in 1997.

Trust in the NHS using data appears to be high relative to other parts of the public sector.[footnote 22]

Nevertheless, there have been a number of high profile cases which have highlighted the different tensions at play. This includes projects such as our case study of the Royal Free and DeepMind, as well as wider sector initiatives such as care.data, which became a focus of media attention and triggered a public backlash. These issues are perhaps exacerbated by the organisational complexity of the NHS.

There are ongoing efforts by the NHS and the Department of Health and Social Care to address trust while also seeking to support socially valuable innovation. Most notably, in 2014 Dame Fiona was appointed in an official advisory role as the National Data Guardian for Health and Social Care (NDG) and the role was placed on a statutory footing on 1 April 2019. A core component of the role is trust, “with a focus on what can be done to help people be aware of, and more actively engaged in, decisions about how patient data is used and protected.”[footnote 23]

Based on a recent public consultation on her work, the following priorities were chosen:

  1. Supporting public understanding and knowledge, by championing meaningful transparency and public engagement
  2. Encouraging information sharing for individual care
  3. Safeguarding a confidential health and care system, by clarifying reasonable expectations and what use-cases should not go ahead

The NDG led an independent review of data security, consent and opt-outs from 2015 to 2016 which led to the implementation of greater transparency in NHS data sharing and a greater role for patient consent and opt-outs for secondary uses of data.[footnote 24]

Inconsistent Approaches

Different approaches to governance may undermine public trust. A fragmented landscape is not only frustrating for researchers who must navigate different governance regimes, but may also lead to inconsistent governance decisions being taken. Greater use of the Research strand of the Digital Economy Act by departments, which has been approved by Parliament and was developed through an open policy making process, may help to address this.

As outlined above, ONS has introduced strong governance processes, including accreditation frameworks which address ethical considerations, as well as security. However, when it comes to the sharing of real time data, regular data feeds (for monitoring purposes) or even data that might have a greater direct impact on citizens, there is an absence of common approaches.

Confusion about what is legally permissible has hampered data sharing proposals, and there have been a number of attempts, most recently the Digital Economy Act, to address this. An added complication is that legal permissions governing particular data sets (in addition to data protection legislation) tend to relate to specific departments rather than the public sector as a whole. The holding of different data by different departments and agencies can seem rather arbitrary and adds further complexity. Establishing the legality of data sharing is a key element of trustworthy data sharing, but an inconsistent and confusing environment may also undermine trust.

Data Protection Act 2018 (DPA)

Data protection law has governed data sharing since the Data Protection Act 1998. The introduction of the EU’s GDPR together with the DPA 2018 strengthened many of these protections.

One of the GDPR requirements is that public authorities must have a valid lawful basis for processing or sharing personal data. Public authorities are generally able to use the ‘public task’ basis instead of requiring explicit consent, which allows for either carrying out a specific task in the public interest laid down by law or exercising official authority which is also laid down by law.[footnote 25]

Departments headed by a Minister of the Crown have common law powers to share information which covers this requirement. Other public bodies can also share information to fulfil a power given to them by legislation. Data sharing may also be necessary in order to comply with a legal obligation.

Although individual consent is not legally required, data sharing is expected to be transparent to data subjects so they know what is happening to their data, and that the data shared is proportionate to a limited purpose. While re-use for research and statistical purposes may be compatible with this purpose, other public interest uses may not be. If there is a high risk to the individual, a Data Protection Impact Assessment is required. Despite this, there is still a misconception that consent is legally necessary for sharing data which leads to risk aversion.[footnote 26]

Consent and public sector data use

GDPR: Consent

Consent means offering individuals real choice and control. Genuine consent should put individuals in charge, build trust and engagement, and enhance your reputation.

Source: ICO’s Guide to Data Protection[footnote 27]

Public sector organisations rarely use consent as the basis for processing data. In most cases this would be inappropriate since the processing of data is a precondition of the provision of a service and there is therefore an imbalance in power between government and the citizen. In these circumstances consent would not be regarded as freely given. Instead, the performance of a public task is the lawful basis for processing data. This may also allow for processing by a third party data processor, often a private sector organisation, acting on behalf of the public sector organisation.

GDPR: Public Task[footnote 28]

  • You can rely on this lawful basis if you need to process personal data: ‘in the exercise of official authority’. This covers public functions and powers that are set out in law; or to perform a specific task in the public interest that is set out in law.
  • It is most relevant to public authorities, but it can apply to any organisation that exercises official authority or carries out tasks in the public interest.
  • You do not need a specific statutory power to process personal data, but your underlying task, function or power must have a clear basis in law.
  • The processing must be necessary. If you could reasonably perform your tasks or exercise your powers in a less intrusive way, this lawful basis does not apply.
  • Section 8 of the Data Protection Act 2018 (DPA 2018) says that the public task basis will cover processing necessary for: the administration of justice; parliamentary functions; statutory functions; governmental functions; or activities that support or promote democratic engagement

Commissioners for Revenue and Customs Act 2005

HMRC has specific legislative restrictions on data sharing to protect confidentiality which often prevents them from sharing data. The founding legislation of HMRC, the Commissioners for Revenue and Customs Act 2005, restricts data sharing to its specific functions, when a new legal gateway exists, or when it is in the public interest (as defined in section 20 of the act). Because of the itemisation of these specific cases, HMRC has more limited disclosure powers than most other public bodies, especially compared to minister-led government departments.

Statutory gateways

While common law powers are often sufficient, many other data sharing legal gateways have been created in order to provide legal certainty where bodies have been concerned about ambiguity. This has led to a very complex data-sharing legal environment. The Law Commission report in 2014 found “the law has developed without consistent oversight and scrutiny, resulting in a complex web of statutory provisions.”[footnote 29]

While most of these gateways are permissive, some require data sharing in specific circumstances. For example, the Children Act 1989 requires Local Authorities to provide the Education Secretary with any requested information on students or children in care, and any other information necessary to monitor the performance of schools or care homes.

Digital Economy Act 2017

The Digital Economy Act 2017 (DEA) sought to provide clarity and limit the need for any new legal gateways to share data, as well as taking into account the HMRC confidentiality rules.

A data sharing project relying on the DEA must be pursuing one of four possible purposes:

  1. The improvement, targeting, or monitoring of a public service by specific public bodies, as long as it has an approved objective aimed at improving the wellbeing of an individual or household
  2. The sharing of civil registration data (births, marriages and deaths) to enable other public services
  3. For trial projects to identify or act on fraud, error or debt in public finances
  4. Projects and researchers approved by the Statistics Board for research that is in the public interest

Each of the DEA powers requires a code of practice to be issued that is consistent with the Information Commissioner’s data sharing code of practice. The specific objectives must be agreed through secondary legislation, and any project relying on these must have a publicly accessible data sharing agreement which is included in a public register.

For the public service task, the initially agreed objectives were to identify households struggling with multiple disadvantages, and to address fuel and water poverty.

Use Case: The Troubled Families Programme and the Digital Economy Act

A local authority could look to access data held by a range of partners (including the local police force, schools, and others) to identify whether there are individuals or households who meet the criteria for support under the Troubled Families Programme. The proposed information share would need to be consistent with the multiple disadvantages objective of the public service delivery power created by the Digital Economy Act 2017, and the bodies that the local authority wishes to share data with would need to be present on Schedule 4 in the Act as specified persons able to disclose data under this gateway. This means that the local authority has a lawful power to share data. As the power is permissive, the local authority will still need to agree with the other bodies to share information for this purpose and draw up an appropriate data sharing agreement and ensure that it is compliant with the Data Protection Legislation. The consent of citizens is not required.

8. Our Case Studies

Reports into data sharing commonly focus on identifying barriers standing in the way of data sharing. However, a significant amount of data is successfully shared across the public sector, as well as with outside organisations. The following case studies explore such examples. These enable an understanding of how barriers to sharing are overcome and whether there are lessons for future data sharing projects.

8.1 Overview

The case studies explore different ways personal data has been shared. They deliberately concentrate on the sharing of some of the most sensitive and largest datasets held by the public sector. In each case, data has been successfully shared and the objectives of the project have been achieved. The case studies were selected to enable an examination of a wide range of data sharing scenarios, including sharing with private organisations, individual citizens, and central government.

Summary of CDEI Case Studies

Case study: Education data shared with Approved Suppliers
  Controller: Department for Education
  Content: Education attainment and pupil characteristics
  Receiver: Accredited suppliers developing analytical tools to be purchased by schools and local authorities

Case study: VAT Register shared with Credit Reference Agencies
  Controller: HM Revenue & Customs
  Content: Business VAT data
  Receiver: Credit reference agencies

Case study: Troubled Families Evaluation
  Controller: Local authorities, Ministry of Housing, Communities and Local Government, Department for Work and Pensions, Department for Education
  Content: Various
  Receiver: Final dataset analysed by MHCLG; ONS contracted as a trusted third party to collate the data

Case study: Sharing Patients’ GP Records
  Controller: GP practices
  Content: Medical record
  Receiver: Individual patients, who can access their medical record and share it with other bodies e.g. local authority social care providers

Case study: DeepMind / Google Health Medical Diagnosis
  Controller: Royal Free NHS Hospital Trust
  Content: Patient test results
  Receiver: Google/DeepMind

Education Data Shared with Approved Suppliers (Analyse School Performance)

The Department for Education holds a wide range of information about students who have attended schools and colleges in England since 2002 (about 21m individuals).

The National Pupil Database has grown to be the Department for Education’s primary data resource about pupils. It is shared with external education researchers, other departments, as well as private organisations. Our case study explores the provision of data (concerning pupils currently in the system, under the Analyse School Performance initiative)[footnote 30] to companies developing and offering data analytics products to schools. Decisions to share data are considered by the Department for Education’s Data Sharing Approval Panel, which can require appropriate safeguards to be in place prior to sharing information.[footnote 31]

VAT Register Shared with Credit Reference Agencies

Credit reference agencies are an essential part of economic infrastructure, helping suppliers and lenders assess whether businesses seeking trade credit to purchase supplies are reliable borrowers. While large companies can be assessed on publicly available data, small businesses have few ways to reliably signal their financial status and can struggle to access the credit needed to grow. To address this, HMRC shares non-financial VAT registration data with credit reference agencies so they can assess the creditworthiness of smaller businesses. Because the data includes information about small businesses (including the registration number and name of the business), some of it could be viewed as particularly sensitive.

Troubled Families Evaluation

The Troubled Families Programme is an initiative led by the Ministry of Housing, Communities and Local Government (MHCLG) to intervene and provide better joined-up support for families experiencing multiple intersecting problems, for instance mental health problems, domestic abuse, and unemployment.

To evaluate the impact of the programme, MHCLG gathered data from upper tier local authorities as well as across a range of central government departments and passed it into the care of the Office for National Statistics for analysis. The information used includes offending data from the MoJ, employment and benefit data from DWP/HMRC, and attendance data from DfE. Data was collected for all families that are eligible for the programme, not just those enrolled, to provide a control group for the evaluation.

Sharing Patients’ GP Records

Since 2015, all NHS patients have been able to sign up to GP online services and use a website or app to view parts of their GP record, including information about medications, allergies, vaccinations, previous illnesses and test results. GPs can share this data with patients through a number of accredited third party services. In some cases, patients are able to share their medical record securely with other providers, e.g. non-NHS physiotherapists.

DeepMind Medical Diagnosis

The Royal Free began a project with DeepMind in 2016, in which the personal information of 1.6m patients was used to test the clinical safety of a new app (Streams) prior to its deployment for direct care. Streams uses a range of patient data to determine whether a patient is at risk of developing acute kidney injury and sends an instant alert to clinicians, who can take appropriate action promptly. In 2017 the ICO ruled that the Royal Free had failed to comply with the Data Protection Act when it provided patient details to Google DeepMind for the purposes of testing the app.[footnote 32] By 2019, the Royal Free had completed all actions required by the ICO, which had no outstanding concerns about the app being used for the direct care of patients.

Summary of the value of data sharing realised in the case studies

  • Enable evaluation of government programmes to inform future decisions and policy development.
  • Support the development of new data driven tools to aid service delivery and diagnostics.
  • Ensure new analytical tools are built using the best data available.
  • Share data with third parties able to deliver important services not provided by the public sector.

8.2 Barriers Encountered and Solutions Applied

The case studies highlight that data sharing takes place for a range of purposes with varying value to citizens and the public sector as a whole. It is also clear that many of the identified barriers are surmountable and even help to ensure decisions to share are scrutinised. However, in many cases, public awareness is low and the approaches taken to sharing are inconsistent.

A common barrier faced by those wishing to share data relates to the inconsistent application of rules and standards for sharing. Individual departments set different requirements in relation to identifying legal gateways, security standards and making decisions around whether or not data can be shared. In some cases, this is because certain data is governed by specific legislation e.g. HMRC data or health data, making it challenging to adopt consistent approaches.

Navigating these hurdles takes time and resources, often including legal and technical expertise. If parliamentary approval is required, it may take years before data can be shared. Agreements to share may require new legal contracts and memoranda of understanding which also take time and expertise to draft. Even in cases where sharing is legally permissible, cultural barriers and risk aversion need to be overcome - which is why senior level buy-in is so important.

In addition, once an agreement to share data is in place, the limited adoption of common standards may mean significant work is needed before data can be shared.

Barrier: Technical
Barriers identified in case studies:
  • Different data standards in use (often because of legacy systems) across the public sector, making the adoption of common standards difficult
  • Data quality is widely variable and limits the ability to connect records
  • Hard to transfer large data sets securely
  • Insecure legacy systems
Solutions applied by the projects:
  • Commit resources and expertise to tidy up data
  • Use third-party intermediaries (e.g. ONS)
  • Work with digital identity providers (which bring additional costs to sharing)

Barrier: Legal
Barriers identified in case studies:
  • Establishing the lawful basis for sharing takes time and isn’t always straightforward (DEA)
  • May need new agreements for each partner
  • Data controller’s requirements may not align with the expectations of individual clients
Solutions applied by the projects:
  • Create new legal gateways. Note, however, these can take time (possibly parliamentary approval) and resource to be approved
  • Share privacy notices which detail the purpose of the data sharing

Barrier: Cultural
Barriers identified in case studies:
  • Risk aversion is sometimes the result of the impact of historic data breaches
  • Different departments have different needs, requiring separate data sharing agreements
Solutions applied by the projects:
  • Requires patience and senior level buy-in (and political support)
  • Establish clear legal routes as well as separate memoranda of understanding

Barrier: Public Trust
Barriers identified in case studies:
  • Limited individual control of what data is collected and what it is used for
  • Data is often retained for a long time for research purposes, beyond the initial purpose
  • Gaining individual consent for such large datasets is very burdensome
  • Lack of transparency triggers public backlash when sharing becomes public, especially if shared outside government
Solutions applied by the projects:
  • Establish approvals committees to consider not just whether the data sharing is legal, but also whether it is ethical to share
  • Roll out client engagement strategies and proactively inform the public of new data sharing arrangements

Barrier: Time & Money
While the barriers identified above may be surmountable, they often require time and money to overcome. Thus the need to invest additional resources into data sharing can become a further barrier. This includes procuring legal advice as well as other expertise, and investing in the technology required to share the data securely.

8.3 The Barriers in Detail

Before sharing data, legal gateways need to be identified. These provide an element of democratic legitimacy, and therefore trust, to the exchange, but they sit alongside the Data Protection Act, and the combination can create a confusing and inconsistent environment. Different datasets may be subject to different laws, and identifying the appropriate legal gateways is time consuming and leads to uncertainty. There are also occasions when organisations in the public sector have got this wrong. This was the case in the Royal Free case study, where the Trust was found by the ICO to have incorrectly interpreted the legal framework around data sharing for direct care.

In the case of HMRC sharing data with credit reference agencies, primary legislation was required which took several years to draft and be approved by Parliament. For the Evaluation of the Troubled Families Programme, separate data sharing agreements (Memoranda of Understanding) were needed between MHCLG and all other government departments, as well as 149 local authorities. Legal expertise was also required to identify legal powers to share data and to ensure the requirements of Data Protection legislation and the GDPR were met. Common law powers were eventually relied upon for MHCLG to share the data with other government departments.

In the case of the National Pupil Database (NPD) and other datasets held by the DfE, the Education Act 1996 and other pieces of legislation, including the Children Act 1989[footnote 33], provide the Department for Education (Secretary of State) with more discretion over decisions to share data. However, even in the case of education data being shared with accredited suppliers, new legal agreements were needed which ran to 50 pages. The Department needed to procure external legal support to draft the contracts for the different suppliers. Similarly, in the case of Ofsted sharing data on fostering agencies with the Alan Turing Institute, it took several months to finalise legal arrangements before the data was shared - despite it only being a 6 month project.

Identifying legal gateways is an important element of trustworthy data sharing. However, the current environment is confusing and can seem inconsistent. The need to engage legal specialists also introduces additional costs to data sharing, creating further barriers to overcome.

Role of Data Controllers

Despite legislation providing legal mechanisms for data sharing, the decision to share often rests with the data controller. This means that even where an individual wants their data shared, the data controller may prevent it from happening. In the example of GP record sharing, there is variation across practices: GPs decide which fields of their records patients can easily access and may also place controls on what may be shared with other bodies.[footnote 34]

In other instances, the decision to share rests with individual departments. For the Troubled Families Evaluation, MHCLG needed to work to secure senior-level buy-in across different departments to ensure the data sharing was authorised. This took considerable time and resource. Even under the Digital Economy Act, departments may refuse to share data (or delay the sharing of it) despite requests falling within the remit of the legislation.

A related challenge occurs when the benefits of data sharing accrue elsewhere. In the case of GP records, GP practices may perceive that there is little value to them in enabling patients to share data, while significant costs may be required to put new systems in place. Similarly with the Troubled Families Programme, effort still had to be put into securing senior level buy-in, as individual departments needed to commit resources to linking the data. The fact that the Troubled Families Programme is a high-profile initiative tackling a range of issues of interest across government is likely to have made it easier to persuade departments to share their data. Projects which are less of a government priority may struggle to obtain the data they need, particularly if the value accrues away from those providing the data.

Overcoming Technical Constraints

Data standards and quality

Sharing sensitive data requires high levels of security, which are hard to meet when data is often managed in legacy systems. It is particularly challenging when sharing across organisational boundaries, where each side may have different requirements for the security of their data and no shared infrastructure.

Even when systems are up to date, the lack of adoption of common data standards and formats across the public sector also inhibits data sharing. This is particularly challenging when agencies seek to link data from different sources. In the Troubled Families Evaluation, each department used a different methodology for matching the data, which may have resulted in differing match rates across departments. Across a number of the case studies, additional resource and expertise were needed to tidy up the data to ensure it was usable.

Addressing data security

There is no central infrastructure designed to support the sharing of personal data across the public sector. This has led to different approaches being taken to facilitate data sharing.

The creation of the ONS Secure Research Service may help to address this for certain use cases. MHCLG used the ONS service to match the data and create the anonymised dataset for the Troubled Families Evaluation. However, such infrastructure is only available for research purposes and not for sharing data to support service delivery. HMRC, for example, is reliant on its own infrastructure for the regular sharing of the VAT Register with Credit Reference Agencies. While the data now flows regularly there were initial technical difficulties in getting it up and running.

Other case studies also demonstrate that it is possible for sensitive data collected by the public sector to be shared outside of Government. Data from the National Pupil Database is routinely shared with accredited suppliers who are developing analytical tools for schools and local government. For such sharing, the Department is guided by the ONS “Five Safes” data protection framework (see Chapter 3). However, in this case, the results of the data share are not publicly available, although schools and local authorities can purchase the analytical tools created by the organisations who have received the data.

Similarly, HMRC shares non-financial VAT registration data with Credit Reference Agencies (CRAs). For CRAs to be recipients of data their security management systems must be accredited to ISO/IEC 27001 standards and certified by an independent accreditation body. Such accreditation is expected to be renewed annually.

Despite there being a number of secure research environments for data linking and analysis, different bodies continue to have varying requirements. This introduces friction in the system with organisations having to align their approaches. While this may act as an additional safeguard, the differing requirements can also be the result of different legacy systems rather than because one approach is better than another. Bodies seeking to share data must also agree how and where the exchange of information will take place. To undertake the Evaluation of the Troubled Families Initiative, MHCLG used the ONS as a trusted third party.

Use Case: Technical challenges to sharing - Ofsted research project

When Ofsted wanted to share data with the Alan Turing Institute (ATI) for research purposes, agreeing the mechanisms for physically sharing and accessing the data proved challenging. In the time available for the project, Ofsted was not able to host the data locally and so sought to use the Safe Haven environment at the ATI. This in turn needed to be technically assessed by Ofsted IT security staff. As the data left Ofsted’s infrastructure, a greater degree of redaction, rounding or removal of data was necessary to reduce the chance of disclosure. As a result, potentially valuable data was not included in the study. A more consistent approach to collaboration security, with agreed standards in place across the public sector, may have helped ease the sharing process and increased the power of the resulting model, whilst remaining safe and trustworthy.

Overcoming Cultural Barriers

Risk aversion

Cultural barriers to data sharing cropped up in many of the case studies. Across departments, risk aversion and a reluctance to share were described as key barriers. This may be because the legacy of high-profile data breaches continues to hang over parts of the public sector.[footnote 35]

Such risk aversion may also explain why data controllers can be reluctant to share data - particularly if the benefits of data sharing are not immediately obvious to the department being asked to share data.

Leadership and Funding

The cases highlight the need for senior level buy in to support successful data sharing. In the case of HMRC, ministerial commitment was needed to pass new legislation in parliament. The perceived risks of data sharing also mean that even in cases where there are clear legal gateways, the support of senior colleagues is crucial.

Given the costly expertise often needed to facilitate data sharing (including legal and technical advice), there may be a reluctance to embark on data sharing projects. This is a particular challenge when a department or public body may not feel they stand to share in any of the benefits of data sharing.

Addressing public trust

By addressing the barriers set out above, those sharing data are also seeking to be trustworthy. Ensuring the sharing is legal is important to secure trust. Equally, data needs to be shared in a way that is secure and meets agreed standards. However, inconsistent approaches can undermine trust, and there can be additional challenges to trust regardless of whether the data sharing is technically legal.

Public sector bodies need to address ethical challenges when sharing data for different purposes, and the way they consider the ethical issues at stake differs across the public sector. The Department for Education has established the Data Sharing Approval Panel, which includes external representatives, to consider requests for access to data from the National Pupil Database and other datasets it holds. However, while the Panel’s terms of reference require it to consider the ethics of a proposed data share, it has not published details of how ethical issues and public acceptability are considered.

In the case of sharing records with approved suppliers, the DfE has had to consider the ethics of freely providing sensitive data to commercial and non-profit companies which will use it to develop products to be sold to schools. By only sharing with approved suppliers, the Department has sought to balance the benefits of providing accurate data to firms and opening up the market against the risk of sharing sensitive data with third-party organisations. Similarly, in the context of Royal Free London NHS Foundation Trust sharing data with DeepMind in 2016, there are two distinct stages to consider. Initially, data was shared by the Royal Free to enable DeepMind to test the clinical safety of the Streams app. At this stage, those whose data was used may not have received a direct benefit, while DeepMind was able to certify a new commercial product (albeit one which would benefit future patients).[footnote 36]

Now, data is shared via clinicians who are using the deployed app which has been proven to improve patient care.

Ethical challenges observed in the case studies

Ethical challenge: Sharing data with commercial organisations
Those requesting or sharing data need to be clear about the value gained and who is likely to benefit, balancing the public against the commercial interest. This may include opening up new markets and supporting innovation. Sharing data with commercial organisations may also help to ensure products are developed with high quality data rather than unreliable or incomplete datasets. In such cases it is not always clear where the benefit will accrue: sharing data to enable a commercial company to develop a new technology puts the company at the forefront of innovation, but those whose data has been used may not benefit directly.

Ethical challenge: Challenging cultural norms
Sharing data with particular organisations may challenge traditional cultural norms. For example, HMRC sharing data with external companies (credit reference agencies) confronts traditional expectations that tax information generally stays between HMRC and individual citizens.

Ethical challenge: Linking data
The linking of sensitive data may increase the likelihood of re-identification. Insights gained from linking may have implications for individuals which they were not aware of when they originally shared data with a public sector body. The risk of re-identification and the potential breach of privacy needs to be assessed against the wider public interest.

Ethical challenge: Individual rights versus public interest
Explicit individual consent is often not sought when the public sector shares data; to do so would pose a major barrier to data sharing and affect the quality of the datasets shared. Despite transparency requirements under data protection law, it is unlikely that many citizens are particularly well informed of how information about them is used and shared (although people may also assume more data is shared than currently occurs). There is no independent third party able to assess the balance of benefits and harms in respect of the sharing of particular datasets.

Engaging with the Public

The case studies reveal the different approaches taken to engage with citizens and make people aware of how information about them is used and shared. In most cases such engagement is limited.

In 2017 the ICO found that the Royal Free NHS Foundation Trust had not done enough to inform patients that information about them was processed by DeepMind during the testing phase of the Streams app.[footnote 37]

As a result of the findings, the Royal Free rethought its approach and sought to provide patients and the public with more information. Patient leaflets answering common questions are now distributed across the hospitals, there is a detailed Q&A on the Royal Free website and there is guidance on opting-out displayed on notice boards. However, the public (patients) may have a different response to their data being shared with a third-party to develop a new diagnostic technology, than when their data is used by the technology to diagnose their condition. In the latter case, the benefit to the individual is more tangible as it directly affects treatment.

MHCLG also considered public engagement before undertaking the Evaluation of the Troubled Families Programme. Obtaining informed consent from families was explored, but it was not considered practical, particularly for those in the comparison group who were not necessarily in touch with services. Local authorities were expected to inform families about the project when they joined the programme, but families in the comparison group were in less direct contact with services and so were unlikely to be directly informed. Privacy notices were also displayed on local authority websites and noticeboards, though it is not clear how effective these are at informing the public about data use. However, MHCLG tested the wording of the privacy notices with a small group of families, took on board their feedback (on whether the project and the use of privacy notices were acceptable, what language was appropriate, and how detailed the notices should be), and worked closely with the Information Commissioner’s Office. While this strategy worked for gaining access to data from most departments, it hindered access to health data, which could only be gathered for families on the programme who had been directly informed.

Similarly, the DfE relies on privacy notices shared with parents, pupils and schools as the basis for sharing data. Yet there is little evidence of significant awareness among parents and former pupils of the existence of the database, and little resource has been put into actively explaining to citizens how the data collected is used and shared. While the Department has established the Data Sharing Approval Panel, which publishes a list of agreed data-sharing projects on gov.uk, only limited information is made available.[footnote 38]

There has been some public concern expressed about sharing education data, which has attracted significant media attention. An active campaign group called defenddigitalme continues to call for increased transparency to parents and a full audit of how student data is shared.[footnote 39]

Public Awareness

Our case studies identify the steps taken to share data in a way that is legal and secure. This often includes several months (or years) of discussion which must also address cultural nervousness around sharing data, as well as developing complex memoranda of understanding, or even creating new legal gateways. Different bodies also have their own security policies which may not always be aligned. This results in further delays and negotiation. Such inconsistent approaches also create a confusing environment.

Despite efforts to address legal and technical requirements, it is not clear that such sharing can be said to carry public trust. While adhering to legal requirements and ensuring strong security standards are important elements of trustworthy data sharing, the public are rarely engaged in the process. There may also be a concern that active public engagement could jeopardise a project if it provokes a backlash. In most instances citizens are unaware of, and have little control over, how information about them is used and shared by the public sector. Indeed, our case studies highlight how the results of a data share are difficult to find and not easily accessible.

Data sharing and public awareness

Data sharing example: Sharing of education data with approved suppliers
Public engagement/awareness: Limited. Parents and pupils are asked to sign privacy notices as part of school enrolment, but consent for specific data sharing projects is not requested. While such consent may not be practical, public awareness is likely to be low and there is a lack of understanding around public acceptability. Agreed data sharing arrangements are published on gov.uk, but it is not clear that such information is regularly reviewed by parents.

Data sharing example: HMRC sharing VAT register with Credit Reference Agencies
Public engagement/awareness: No direct engagement. Businesses are made aware of how their information may be used when joining the VAT register.

Data sharing example: MHCLG Evaluation of the Troubled Families Programme
Public engagement/awareness: Limited. Privacy notices displayed on local authority websites and noticeboards. Engagement with a small number of families to test wording. Local authorities were also expected to inform families about the project when they joined the programme.

Data sharing example: Royal Free NHS Trust and DeepMind
Public engagement/awareness: Initially criticised by the ICO for not explaining to patients how information about them was used and shared. The Royal Free now provides information leaflets and notices to patients responding to FAQs.

Data sharing example: GP records and sharing apps
Public engagement/awareness: Differs between GP practices. Practices which enable patients to use data sharing apps engage with patients; patients enrolled with practices which do not provide the service may be unaware that such apps are available. GPs may also decide how much of a patient’s record is available on the app (without consulting the patient).

8.4 What can be done?

As set out in Chapter 1, recent reports have identified similar barriers to data sharing. Our summary of recent government initiatives highlights the many attempts made to address such barriers, which have had only limited success. These initiatives have also tended to focus on overcoming legal and technical barriers rather than the overarching issue of public trust.

Barriers to data sharing and examples of current initiatives designed to address them.

Legal barriers:
  • The Digital Economy Act aims to provide legal gateways to sharing where they don’t already exist (while also being consistent with the ICO data sharing code).
  • The ICO Data Sharing Code aims to provide clarity and guidance to controllers in relation to GDPR, while also dispelling myths and clarifying misunderstandings.

Technical barriers:
  • The ONS Secure Research Service provides a safe environment for researchers to use sensitive public sector data.

Cultural barriers:
  • The National Data Strategy, led by DCMS, may address some of the cultural barriers to data sharing and attempt to address siloed working.

In addition, as the NAO and others have argued, cross-government leadership is crucial as is the need to treat data as a strategic asset. The National Data Strategy may present an opportunity for such leadership to be demonstrated. But it is also evident that this needs to be accompanied by investment. This could include a centrally managed cost recovery system that could address the disincentives for particular departments to share data if they are unlikely to benefit from the value of such sharing.

However, none of the current initiatives explicitly focuses on addressing public trust. Public acceptability of data sharing is not a prominent theme in policy work focused on driving more data sharing. It is evident that many departments can do more to address trust. This is a crucial issue, particularly given the growth of available data and the potential for significant benefits for citizens to be generated. A coherent approach and robust scrutiny is required.

CDEI and Public Trust

Data sharing projects must consider the ethical considerations and responsibilities government and the wider public sector have around personal data. While there are benefits to increased data sharing and many of the barriers are the result of inconsistent approaches, some of these barriers may function as useful safeguards.

CDEI wants to support those working in the public sector who wish to maximise the value of personal data but also ensure it is used in a way that is both effective and ethical. As well as supporting and championing the steps outlined above, CDEI’s second phase of work on data sharing will focus on the topic of trust.

9. Tenuous Trust & Data Sharing

9.1 The importance of trust

The legitimacy of data sharing stems from the legal framework on which it is based. Yet while this provides a foundational level of democratic accountability through parliament, the legislative process is likely to be relatively detached from people’s practical experience of the public sector collecting data about them.

For this reason, government use of data requires broader societal consent and a level of public trust. Anything that undermines this public trust can hamper necessary government functions and lead to societal harm as people disengage or even make efforts to withhold data or share incorrect information.

People’s trust can be split into two distinct factors: trust in competency, and trust in intentions. While people’s trust in different public sector institutions varies (for instance, trust in the NHS is much higher than in central and local government[footnote 40]), significant proportions of the population lack trust in either the competency or the intentions of public sector data use overall.

The Institute for Government (IfG)[footnote 41] found widespread perceptions of government incompetence with data, with perceived risks from hackers (52%), accidental leaks (49%), inaccurate records (42%) and lack of skills (31%). A survey by Reform and Deloitte also found that only 17% of people felt that the government was good at keeping information safe and secure[footnote 42], compared to 30% who felt it was not.

As well as these concerns about competency, there are also those who distrust the intentions of government data-sharing and see it as threatening privacy. 52% of people in the IfG survey felt that the government would not use the data for their benefit, while 40% thought it would be used to actively discriminate against them. Similarly, Reform found 28% of people did not believe the government had their best interests at heart.

The lack of trust is exacerbated by limited clarity about what the government is currently doing. The largest driver of distrust in the IfG survey was this uncertainty, with 61% of people saying the government would use the data for purposes they would not tell individuals about. Reform similarly found 24% of people did not know how the government used their personal data. This uncertainty is combined with a feeling of powerlessness. Reform found 37% felt a lack of control over data about them, while 42% of people in the IfG survey believed they would not be able to change the data if it was incorrect.

With this in mind, approaches to data sharing need to consider the current state of public trust in data sharing and how this can be addressed.

Use Case: ONS student suicide statistics and addressing public concerns

After extensive media reporting and commentary on the suicide rate among university students, the ONS set out to verify whether this was a statistically significant phenomenon. This analysis required acquiring student enrolment data from the Higher Education Statistics Agency and linking it with data from death certificates to identify higher education students who had died by suicide.

The ONS had fewer legal obligations, as data about deceased individuals is not considered personal data under data protection law. However, they took a number of voluntary steps to address issues of public trust. They established a working group to develop the analysis, including Public Health England, the Samaritans, and relevant academics. Due to the sensitivity of the topic and the fact that the data subjects were no longer alive, ONS also worked with the National Suicide Prevention Advisory Group, which contains representatives of people bereaved by suicide.

Both groups provided advice on the analysis to be undertaken and around how to present the work in a sensitive way. This included agreeing the methods to prevent disclosure of identifiable information in the analysis, as well as providing advice on language and timing of the report to limit any negative impact on the student population. Not only did this engagement help make the case to the National Statistician Data Ethics Advisory Committee, who needed to approve the project, it also helped to improve the sensitivity of media coverage of the report.

9.2 Elements of trustworthy data sharing

The philosopher Baroness O’Neill highlights two steps to being trusted: first, acting in a trustworthy manner in a way that people want and expect; second, providing evidence of that consistent trustworthiness. Trust is dependent on an individual’s relation to the specific institution and the particular function that institution is aiming to perform. It must therefore be built and maintained through each and every project.

An exploration of data sharing frameworks and principles laid out by different jurisdictions (in the UK and overseas), as well as other organisations, highlights some common elements which are integral to trusted data sharing projects.[footnote 43]

Trustworthy data sharing must be based on what a citizen would consider valid intentions - providing value back to the citizen or community as a whole. There needs to be specific consideration of the potential risks an individual or group might incur from the project, and there needs to be an inclusive way of weighing these benefits and risks.

Data sharing should also be done competently. This means data is kept secure, and is minimised and de-identified as much as possible, so individuals can be confident that their privacy is protected from misuse.
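The data minimisation and de-identification described here can be illustrated with a minimal sketch in Python: a record is stripped down to only the fields an analysis needs, and the direct identifier is replaced with a keyed pseudonym. The field names, and the arrangement of a secret key held by the data controller, are assumptions for illustration and are not practices described in this report.

```python
import hashlib
import hmac

# Assumed arrangement: the controller holds this key separately from the data.
SECRET_KEY = b"held-separately-by-the-data-controller"

def pseudonymise(identifier: str) -> str:
    # A keyed hash (HMAC) rather than a plain hash, so pseudonyms cannot be
    # reversed simply by hashing a list of known identifiers without the key.
    return hmac.new(SECRET_KEY, identifier.encode(), hashlib.sha256).hexdigest()

def minimise(record: dict, needed_fields: set) -> dict:
    # Keep only the fields the analysis actually requires.
    out = {k: v for k, v in record.items() if k in needed_fields}
    # Replace the direct identifier with the pseudonym.
    out["pseudonym"] = pseudonymise(record["national_id"])
    return out

# Hypothetical record: the raw identifier, name and postcode never leave.
record = {"national_id": "AB123456C", "name": "J. Smith",
          "postcode": "SW1A 1AA", "exam_score": 72}
shared = minimise(record, needed_fields={"exam_score"})
```

In practice, public bodies would combine techniques like this with governance controls, such as the ONS “Five Safes” discussed later in this report, rather than relying on pseudonymisation alone.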

These principles are the basis of acting in a trustworthy way, but there also needs to be principles in place that provide evidence of this to the public.

First, data sharing should be accountable, with a reliable and understandable decision-making process, with sufficient public engagement and input.

Second, projects should be transparent, with details on who, what, and why, available publicly to enable scrutiny and give people the ability to object to decisions.

Third, individuals need to have access and control, so they can see what data is held about them, how it is impacting decisions, and have as much say over how it is used as possible.

Trust matrix: Key questions to be addressed by those seeking to share and use data

Value (and impact)
Data use should provide a benefit to individuals or society that is measured and evidenced
* Who benefits from the data being shared?
* Who has to take on any risk?
* Is there a clear statement of the expected benefits?
* How are different groups (and individuals) in society affected?
* Does the benefits statement distinguish between benefits from ‘anonymous’ use of data (to produce statistics, test hypotheses, model impacts, develop potential products) and use of personal data (to deliver products to individuals or make decisions about individuals)?
* Does the benefits statement clearly state how benefits, potential harms and biases will be measured?
Security
* What is in place to ensure data is used securely and protects individual privacy?
* What measures are in place to prevent misuse and to control for extensions to original purposes?
* Is there appropriate use of data minimisation, de-identification and privacy enhancing technology?
* Is the extent of data used justified by the benefits statement?
* If data is being used anonymously is there a clear definition of what this means (e.g. ONS five safes) and how it is applied? Given the differing interpretations of this term, it can help to use alternative language stating explicitly how privacy is being preserved.
Accountability (over and above compliance with the Data Protection Act)
* Who is responsible for decisions about data use?
* How are decisions made about acceptable levels of efficacy and safety; the trade-offs between benefits and risks, including risks of privacy invasion or bias; levels of transparency and user control?
* Are the decisions and their rationale documented?
* What mechanisms are in place to ensure accountability for decisions?
* If individual subjects do not give explicit consent, what mechanisms are in place to ensure broader societal consent?
Transparency
* To what extent is the rationale and operation of the project open to public scrutiny?
* Are answers to the issues raised in this framework in the public domain including the rationale for any trade-offs between privacy and efficacy?
* Is an appropriate budget and resource in place to communicate the rationale for the project to those affected?
* To what extent is the evidence of efficacy and privacy open to independent scrutiny through open source code and scientific evaluation?
Control
* What role do individuals have in the decision to share data about them?
* To what extent does the project result in a product or service that delivers a benefit to individuals who can choose whether or not to use it?
* To what extent does the project enable individuals to see or use any data generated about them through, for example, data portability mechanisms or the use of personal data stores?

When considered alongside these principles of trustworthy data sharing, our case studies suggest that the existing public trust in data sharing by the public sector can be described as “tenuous”. The features of “tenuous trust” include:

Ambiguous Value: Data protection impact assessments are often undertaken and published; however, there is no common understanding of the conditions under which personal data should be shared in the public interest, or of what level of risk it is reasonable to ask individuals to accept.
High (but inconsistent) Security: Data is shared securely with rigorous protocols in place, although security requirements differ across the public sector and may sometimes become a barrier to data being shared. Many bodies apply the best practice of the ONS “Five Safes”.
GDPR requires that “all the means reasonably likely” are used to ensure that anonymous information cannot be re-identified, a process which requires continuous review. Yet our case studies suggest there are inconsistent views around the sharing of identifiable data.
Limited Accountability: Data is shared across the public sector when specific powers enable it. In most cases there are specific legal gateways that are ultimately subject to parliamentary approval.
Data sharing arrangements tend to be subject to regular reviews, although the exact governance frameworks vary. DfE has established the Data Sharing Approvals Panel, while other departments adopt different approaches. HMRC has clear legal gateways and had to gain parliamentary approval.
There is not a universal approach to addressing the ethics of data sharing, nor are there general governance mechanisms to evaluate whether projects are in the public interest.
Limited Transparency: While it is generally possible to find limited details of data sharing projects (often on gov.uk), such information tends not to be proactively shared with the people whose data is being used.
Privacy notices are also published, often on websites or displayed on notice boards. It is unlikely that they are read by many people, particularly as they tend not to be proactively shared with citizens.
The benefits of data sharing are also rarely communicated to the people whose data has been used.
Limited Control: Individual citizens generally do not have control over how data about them is used and shared, and getting access to data often relies on the lengthy legal process of Subject Access Requests, which itself requires an awareness that the data is being held.

These findings differ across sectors, for instance health and tax data have additional legal protections. While this creates more complexity, these approaches may help maintain trust when data is viewed as particularly sensitive.

This does not mean that the public would be certain to find such examples of data sharing unacceptable. But it may mean that revelations about such sharing come as a surprise. Furthermore, if something were to go wrong and there was a data breach, the public reaction may be particularly difficult to manage.

CDEI intends to focus on understanding and advising on public trust in its next phase of work on data sharing.

10. Addressing Trust

While CDEI intends to undertake more work in this area, further areas for consideration by government include:

  • Departments and other public sector bodies should consider what further steps they can take to increase transparency around how they use and share data. This may include publishing a list of data shares with objectives and outcomes.
  • Transparency could also be improved with publication of details of data sharing requests which have been rejected, including the reasons why.
  • Departments should review their internal governance arrangements for reviewing and approving data sharing proposals and ensure adequate consideration is given to ethical concerns and public acceptability.
  • Government should create a centralised legal resource, operating cross-government, which can quickly identify the appropriate legal responses to data sharing projects.

CDEI hypotheses and future work

In its next phase of work, CDEI aims to explore mechanisms which may drive more data sharing in the public interest, while also addressing tenuous trust. We want to consider opportunities for the public sector to promote citizen driven uses of data. But we also want to explore the conditions under which data should be shared in the public interest.

In the first case, CDEI will focus on individual control of data and how government can enable more user driven services. Having direct control may help to address trust by empowering people to make more decisions about how their personal data is used and shared. It may also mean that they can benefit from new products and services, or support innovation.

In the second case, we will explore the way in which data should be shared in the public interest. While individual consent may not be practical, or necessary, the conditions in which such data sharing can take place should be clear. Such clarity would support data controllers seeking to share data, but should also help to provide transparency and understanding to the wider public.

10.1 Promote new citizen driven uses of data

Promoting new citizen driven uses of data could address trust by enabling citizens to use data about themselves to secure immediate benefits. CDEI will explore whether citizen driven uses of public sector data, enabled by data mobility mechanisms, should be given higher priority.

By identifying potential use cases, we want to test whether such an approach has the potential to strengthen trust by empowering citizens. Within this context, particularly with regard to sensitive data potentially leaving secure environments, there are some key questions to be addressed.

Questions to consider

  • What can be learnt from open-banking and potentially be applied to personal data held by the public sector?
  • Are there particular use-cases we can identify where individuals could be incentivised to manage data about themselves?
  • How can the government best support a broader ecosystem around this data?
  • How should data mobility be governed, so access to sensitive data is limited to trustworthy institutions?

CDEI will explore whether individuals are likely to become more engaged if they can transfer their data.[footnote 44]

Often only a small amount of publicly held data is valuable to an individual and the rest is of more value when aggregated. For instance, it may be useful for an individual to be able to easily transfer elements of their academic record (eg to provide evidence of qualifications to a prospective employer), but a history of school absences is likely to be of more use for researchers at an aggregated level.

There needs to be a consideration of potential value, for the citizen as well as wider society, against the cost involved in implementing systems enabling individuals to share data. Some records, such as health records, are highly valuable to an economic sector, and increasing mobility and interoperability may trigger significant innovation. We will work to identify cases where access to services by individuals is hindered by challenges in moving data, and assess whether the application of mobility mechanisms could help to address them.

Alongside supporting future innovation, such sharing mechanisms may also create risks which would need to be addressed. There are often consequences of sharing data which are not immediately apparent, or which extend beyond the individual, and which involve important judgments that it may not be reasonable to ask of individuals. CDEI will explore the risks of data mobility and what governance is appropriate for different use cases. There may be a need for a form of portability governing body which works to address such risks and limit the possibility of data being shared with inappropriate organisations.

Finally, increased data mobility does not only mean more options for individuals. It also enables interventions on behalf of people where they may not be aware or able to move data for themselves. For instance, the Government’s Smart Data review proposes to allow charities or carers to move vulnerable people on to energy tariffs most appropriate for their usage. There may be particular cases in public sector data when individuals provide access to their data to trusted third parties and authorise them to act on their behalf. This could include promoting the role of data representatives or developing the concept of data trusts, within the context of public sector data.[footnote 45]

CDEI will consider the conditions where someone other than the individual could move data or change someone’s services, and what requirements there should be about notice and consent.

10.2 Setting conditions for public interest data sharing

The case studies in this report touch on areas where data is shared for public interest purposes. But there is a lack of agreement and shared understanding of what is publicly acceptable. CDEI will build on this and identify the conditions in which data sharing is in the public interest, while also setting out how to address public trust.

While data protection legislation and common law give the legal basis for data to be shared in the public interest, this must still be done in a way that is publicly acceptable and trustworthy. As part of this work, CDEI will consider the wider governance arrangements and review how decisions to share data could be made and whether there may be a role for an independent third party.

CDEI will consider the safeguards needed to ensure such sharing can be considered trustworthy. This may also include an exploration of the technology available to protect data and individuals’ identities - recognising that this is a fast changing area. By identifying the conditions for public interest data sharing, CDEI would aim to set out a more consistent framework designed to address the current complex and uncertain environment.

Such conditions would need to consider the protection of citizens from privacy invasion, compliance with the law and the protection of particular groups of citizens from harm. Our next stage of work will test these with the public and government with a view to establishing proposals for an obligation to share in these circumstances. CDEI will also seek to understand the impact that opt outs can have on the secondary uses of data which may be in the public interest.

Consent and Opt Outs

The NHS Opt Out was introduced following a review by the National Data Guardian into how giving patients more transparency and control could potentially build trust in data collected by the NHS. The Opt Out enables patients to choose to stop confidential information about them being used for research or planning purposes, except where it is de-identified first.

Such opt outs may create challenges as they could have implications for the representativeness of this data used for secondary purposes. If certain groups tend to opt-out of data sharing, they may not be visible when data is used to make decisions which will directly affect their care (or people like them with similar characteristics). One of the key motivations behind the introduction of the NHS Opt Out was an expectation that it may encourage more use-cases to rely on de-identified or anonymised data, rather than sharing sensitive identifiable information.

Questions to consider

  • How can the benefits and potential risks of data sharing be identified and assessed in relation to the public interest?
  • How do you effectively notify individuals/engage the public to understand what is acceptable?
  • Is it possible to draw a line between research and service delivery? Are there cases when this is not clear-cut?
  • What are the necessary governance mechanisms?
  • How can consistency be ensured across the public sector?
  • Are there different expectations when data is shared outside of the public sector?
  • What opportunities does new technology present to enable data sharing that protects people’s privacy?

Connected Health Cities and Public Engagement

Connected Health Cities (CHC) is a research collaboration between health and social care commissioners and providers, public health professionals, academics and local government, within four ‘connected city’ regions in the North of England. It was launched in 2016 as a way to encourage trustworthy sharing of health data for research at a regional level.

The programme uses ‘Data Arks’ which are regional innovation centres for health and social data analysis, where data is used to produce timely, actionable information for the care of the population the cities serve. In a delivery setting, it allowed the regional teams to study NHS care pathways and analyse integrated data, which provided system wide views of patient pathways, from which researchers could identify improvements and put them into practice.

Key to CHC’s strategy is public engagement across these initiatives, designed to build a new social licence for data sharing. The National Data Guardian worked with CHC to set up a citizens’ jury and work through case studies to build understanding and, subsequently, clarify what ‘reasonable expectations’ people have in terms of data use. The key message was that if individuals’ data was to be used for improvements, they wanted to be able to see and feel the benefit; trust and transparency were therefore essential. Alongside these juries, CHC ran a public education project around the social media campaign #DataSavesLives.

While challenges in the sharing of health data continue to exist, and further work is underway, the insights gleaned from the CHC’s 2020 report on the programme may be relevant when exploring trust in public sector data sharing more broadly.

In cases where individual consent is not required, trust is crucial. If citizens lose trust they may try to withhold data or provide inaccurate data which will undermine the quality of the datasets and could lead to poor research outcomes and poor policy decisions.

To address trust and drive data sharing, a set of criteria for the safe sharing of data for assessment and public interest research purposes is required. This would be distinct from sharing data for ‘operational use’, which has different requirements. Such criteria are likely to be defined in terms of purpose and independent oversight. This report highlights that for the purposes of sharing aggregated data to produce statistics and support much research, strong governance measures (particularly given the ONS’ responsibilities under the Digital Economy Act) are in place, which are likely to help create a trustworthy environment. However, for other uses of data sharing, including the monitoring of a particular initiative by an independent third party, governance mechanisms appear weaker. This may lead to data not being shared, despite there being a strong public interest, or even to data being shared without the risks being adequately addressed. Such an inconsistent environment undermines trust.

Identifying the particular conditions for such sharing is important as the boundaries between research, policy development and service delivery are not always clear-cut. This risks causing confusion and nervousness, and means that guarantees around how collected data will be used and shared may be undermined.

A particular focus would also be in the area of anonymised data sharing, where concerns about privacy and security need to be addressed. Privacy enhancing technologies, while not presenting a silver bullet, may offer new solutions. However it is also necessary to provide reassurances about how the findings of any research or analysis will be used - and to provide guarantees that ethical questions have been adequately addressed. CDEI will explore the concept of creating an independently governed anonymised environment or integrated data infrastructure containing a range of public sector data. The environment could be used to test, for example, the accuracy of an algorithm developed to identify children at risk. But it may also be used to support innovation that could present significant public benefit.

CDEI will consider data sharing within the context of a democratic society and identify the conditions under which citizens should expect data about them to be used and shared without their individual consent. Further work in this area will set out to support those seeking to share and use data in the public interest to do so in a way that is both legal and ethical. Our work in this space is likely to be framed within the public task element of GDPR, and set out how it can be used to maximise the value of data in a way that is trustworthy. A key element will be a consideration of secondary uses of data and individual rights to privacy versus the protection of people’s rights and their safety.

11. Appendix 1: Case Studies

11.1 Introduction

The case studies in this report identify areas where sensitive data held by the public sector has been successfully shared. CDEI has completed the case studies by undertaking desk research (using publicly available information) as well as talking to people directly involved in the data shares.

11.2 Case Study 1 - Education Data

The National Pupil Database (NPD) holds a wide range of information about students who have attended schools and colleges in England since 2002 (about 21 million individuals). The NPD combines the examination results of pupils with information on pupil characteristics and their education and social care pathways, and is an amalgamation of a number of different datasets, including Key Stage attainment data and Schools Census data, which are linked using a variety of personal identifiers, including unique identifiers where available.
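The linkage described above, preferring a unique identifier where one is available and falling back to a combination of quasi-identifiers otherwise, can be sketched as follows. This is a hypothetical illustration only: the field names (`upn`, `surname`, `dob`) and the matching logic are assumptions, not the DfE’s actual linkage method.

```python
# Hypothetical sketch of identifier-based record linkage between two datasets.

def link_key(record: dict):
    # Prefer the unique identifier (e.g. a UPN-style number) when present.
    if record.get("upn"):
        return ("upn", record["upn"])
    # Fallback: a composite of quasi-identifiers, which is less reliable.
    return ("composite", record["surname"].lower(), record["dob"])

def link(left: list, right: list) -> list:
    # Index the right-hand dataset by its link key, then join matching records.
    index = {link_key(r): r for r in right}
    linked = []
    for rec in left:
        match = index.get(link_key(rec))
        if match is not None:
            linked.append({**rec, **match})
    return linked

# Illustrative records: census characteristics joined with attainment data.
census = [{"upn": "A1", "surname": "Smith", "dob": "2005-03-01", "school": "X"}]
results = [{"upn": "A1", "surname": "Smith", "dob": "2005-03-01", "grade": "B"}]
linked = link(census, results)
```

Fallback matching on quasi-identifiers is one reason longitudinal linkage carries re-identification risk, which is why access to the linked data is governed rather than open.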

Governance & Transparency

The Department for Education’s data sharing relies upon different pieces of legislation depending on the source of data and the purpose of the processing.

Some of the most common legal gateways used for NPD and other data sharing include:

  • Legislation within section 537A of the Education Act 1996 covers the sharing of pupil data from schools;
  • Legislation within section 99 of the Childcare Act 2006 covers the sharing of individual child information from early years providers;
  • Legislation within section 537B of the Education Act 1996 covers the sharing of pupil data from the alternative provision census;
  • Legislation within section 83 of the Children Act 1989 covers the sharing of children’s services data;
  • Legislation within section 253A of the Apprenticeships, Skills, Children and Learning Act 2009 covers the sharing of KS4, KS5 or a restricted subset of ILR data for learners in FE colleges.

Other pieces of legislation also come into play at appropriate times, and other data assets may rely on other legislation.

The DfE established the Data Sharing Approvals Panel (DSAP) in May 2018, which built on the earlier Data Management Advisory Panel by broadening focus to cover all third party sharing of individual level data by the Department. The DSAP considers applications made for all external shares of data, including data from the National Pupil Database. It includes officials from DfE (who meet weekly) as well as four external members (who attend monthly to review a sample of cases, cross government data shares, and contribute to strategic issues).

Decisions made on requests for data to be shared are published biannually on the Government website.

Since autumn 2018 most applications by researchers for de-identified data are handled via the ONS Secure Research Service. Under the changes, applicants need to be accredited by the ONS as an “approved researcher”, requiring three years of vocational experience and a mandatory training course. Research proposals must also show that the research is for the “public good”. This generally means that the research either supports or challenges key decisions which could affect people’s quality of life, mostly within public policy, or helps to analyse key social trends. Researchers seeking to work on NPD data in the ONS Secure Research Service must still have their requests approved by the Data Sharing Approvals Panel.

Sharing data from NPD (examples)

The evidence and data obtained via the processing in NPD provide the Department, education providers, Parliament and the wider public with a clear picture of how the education system is working in order to better target (and evaluate) policy interventions to help meet the Department’s strategic objectives and ensure all children are kept safe from harm and receive the best possible education.

Source: National Pupil Database Data Protection Impact Assessment: Public Summary, Department for Education, May 2019

Purpose of data sharing

Category | Purpose of Data Sharing | Recipients of Data Sharing
Value creation (research) | Enable research by non-government bodies into the performance of the education system | Data shared with Education Policy Institute (a think tank) looking at achievement gaps
Value creation (service delivery) | Support children’s education, e.g. collating exam results | Data shared with awarding bodies
Value creation (product development) | Provide data to open up the market for organisations providing data analytics tools to schools | Data shared with accredited bodies producing tools which can be purchased by schools
Value creation (service delivery) | Support service delivery across Government by sharing with other government departments | Records shared with Home Office, e.g. the administration of immigration control under the Immigration Acts

Public concerns

Public concerns have been raised about NPD information being shared, suggesting that many parents (and pupils) are unaware that data on millions of English schoolchildren can be shared with academics and businesses. Furthermore, many of the people included in the dataset will have left the education system several years ago.

An active campaign group continues to call for more transparency over what is shared, including more information for parents and pupils through subject access requests. They also call for an end to the sharing of identifiable and sensitive information to third parties, instead saying researchers could be given access to any sensitive education data in a controlled environment.

General Ethical Considerations of Sharing Data from NPD

NPD is a longitudinal dataset and contains information on individuals who have left the education system. Such a dataset is important as it enables the evaluation of different learner journeys and their impact on outcomes. With such longitudinal datasets, it is questionable whether people are aware that such data about them exists and may still be used and shared. Keeping communications open with data subjects for such long-term data assets is hard.

The Department for Education provides schools with guidance and suggested wording for privacy notices, and recommends that the notice is made available on the school website for pupils and parents. The guidance explains the privacy notice must be made available or highlighted as part of any data collection process at the start of each school year. Such notices must cover all data processing and not just the processing of data that is sent to the Department for Education.

The Department classifies data into five sensitivity levels, each with its own governance measures, as set out below (from DfE).

Identification risk levels (DfE)

Risk Level 1 (instant identifiers): all cases go to the DSAP board. Typical data items: full name; full address including postcode; email address.
Risk Level 2 (meaningful identifiers): all cases go to the DSAP board. Typical data items: UPN; ULN; candidate number.
Risk Level 3 (meaningless identifiers): all cases go to the DSAP board. Typical data items: NPD Pupil Matching Reference Number.
Risk Level 4 (non-identifiers with a high re-identification risk): all cases go to the DSAP board. Typical data items: sibling indicator; full postcode; date of birth.
Risk Level 5 (other): all cases go to the DSAP board. Typical data items: larger geographical areas.

Note: Risk Level 6: aggregate or suppressed data (such as datasets published by the Department as official statistics). Such data shares do not come to DSAP. Where there are small numbers of individuals within the aggregated data, the appropriate levels of suppression are applied to make sure there is only an extremely remote risk of identification.
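The suppression applied to Risk Level 6 aggregate data is a standard statistical disclosure control: counts below a threshold are withheld so that individuals cannot be identified from small cells. A minimal sketch follows; the threshold of 5 is purely illustrative, as the Department's actual suppression rules are not specified here.

```python
# Illustrative primary suppression for aggregate data releases.
# The threshold of 5 is an assumption for illustration only.
def suppress_small_counts(counts: dict, threshold: int = 5) -> dict:
    """Replace any count below `threshold` with None (i.e. suppressed)."""
    return {key: (value if value >= threshold else None)
            for key, value in counts.items()}

pupil_counts = {"school_a": 312, "school_b": 3, "school_c": 45}
released = suppress_small_counts(pupil_counts)  # school_b is suppressed
```

In practice secondary suppression is often needed as well, since a suppressed cell can sometimes be reconstructed from published row or column totals.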

Analyse School Performance Use Case - Sharing with Accredited Suppliers

The DfE shares data with accredited suppliers who provide data services for schools and local authorities to purchase. Such services enable governing bodies and leadership teams to undertake detailed analysis of school performance and identify where to target school improvement work.

Technically, this is an early data share of performance data files that will then go on to be processed into the NPD.

Analyse school performance use case

  Approach
Objective To open up the market and enable accredited suppliers to provide innovative data services for schools/LAs etc. to procure, should they wish to do so. This is subject to a range of strict qualifying criteria including security standards.
The Analyse School Performance accreditation programme launched in May 2017 to allow vetted third parties access to underlying pupil level data. Six suppliers receive data under the current ASP accreditation.
Timeline This was a Ministerial decision - DSAP provided due diligence. The Ministerial decision to share data was made in April 2016.
The decision to share data was published in Spring 2018.
A legally bound agreement is in place between the Department for Education (DfE) and each of the suppliers who have met DfE’s security requirements.
Responding to demand Following previous use of early access to performance data by a single supplier, DfE was aware that a potentially wider market existed which could provide a variety of services and a more competitive offer to schools.
Data was historically shared with a single supplier (Fischer Family Trust); the aim of this data share was to open a new market. There are 6 accredited suppliers; however, there are several other companies providing similar products which support school improvement without using the early access to performance data.
Key players DfE and 6 Accredited Suppliers: Fischer Family Trust (FFT), Arbor, Assett for schools, For Schools Support (FSS), Maize Education, ALPs
Identifiable information Includes highly sensitive data about interactions with children’s services. The data includes instant identifiers.
Mechanism used for sharing the data Secure file transfer from DfE directly to the accredited suppliers, typically 15 times a year (depending on exact timescales of data being ready this could vary).
Ethical considerations Strike a balance between providing accurate data to firms developing tools for schools (and opening up the market) and the risk of sharing sensitive data with third-party organisations.
Sensitive data is shared with suppliers, which they then use to produce commercial products. There is therefore a strong focus on ensuring the products are ‘value adding’ and not simply making money by selling schools their own data back.

Barriers encountered and how they were addressed

Barrier Description
Technical None identified.
Financial The sharing project set out to be cost-neutral, but the need to procure external expertise put additional strains on resources.
Cultural Striking a balance between providing accurate data to firms developing tools for schools (and opening up the market) and the risk of sharing sensitive data with third-party organisations.
Legal New legal agreements (running to 50 pages) were needed and had to be signed by the Department and each of the accredited suppliers. The Department needed to procure external legal support.
Ethical/Public Acceptability A vocal campaign group speaks out against allowing education data to leave DfE / government systems, and commercial uses of education data.
More broadly, there is wide recognition of the value and good delivered by re-using pupil level data, provided appropriate controls and security measures are in place.

Addressing Drivers of Trust

Value To individuals:
Limited direct benefit. However, the services developed using the data enable schools and local authorities to put strategies in place to support pupils’ learning and attainment.

To society:
The accredited suppliers receiving the data provide innovative formats/presentations of data, supported by underlying complex analytics, above those offered by the department and other data driven services (e.g. target setting). These products can be purchased by schools and local authorities.
Purchasers of the products (e.g. schools) can be more confident that the tools have been built using up to date and relatively reliable data, and with methodologies consistent with DfE measures. Schools (e.g. governing bodies and leadership teams) are able to use the tools to review school performance and identify areas for future intervention.
There may be wider societal benefits through improvements to the education system.

To others:
The data sharing opens a new market, with accredited suppliers able to use the data to develop innovative products and compete to sell them to schools and local authorities.
Initial market research undertaken by DfE showed that c. 75% of schools procured the leading market supplier’s service and that it generated c. £4m per annum.
Security Technology:
Secure file transfer directly to the organisation. The recipient must ensure that any electronic transfer method across public space or cyberspace, including third-party provider networks, is protected via encryption certified to a minimum of the FIPS 140-2 standard, or a similar method approved by DfE, prior to being used for the transfer of any Departmental data.

Frameworks:
For research requests to the NPD, DfE uses the Office for National Statistics (ONS) “Five Safes” data protection framework to make sure that the people, projects, settings, data and outputs are safe.
In the context of the Analyse School Performance data share, data recipients must adhere to the DfE Security Standards as set out in Schedule 3 of the Data Sharing Agreement, which states: “the Recipient shall have achieved, and be able to maintain, independent certification to ISO/IEC 27001”.

Anonymisation:
Data containing sensitive and personal information is shared with the recipient (and assessed using the risk framework set out above).
Accountability Accredited providers must sign a data sharing agreement with DfE.
The decision to share the data with accredited suppliers was taken by a minister but reviewed by the Data Management Advisory Panel (which has now been replaced by the Data Sharing Approval Panel).
There was a commitment to review each agreement in summer 2019.
Transparency Details of data shared:
Details are included in the DfE External Data Shares list published as part of the transparency returns on gov.uk. It says: “Data is shared to enable suppliers to develop innovative data services to support school improvement for those schools who wish to procure their services” and describes the data shared: “Pupil level progress and attainment data.”
Minutes from the DSAP meetings (where decisions are taken to share data) are not published. But summaries of discussions are available and can be requested (often under FOI).

Data subjects:
Individuals included in the data shared are not directly informed each time the data is shared; rather, the combination of privacy notices and transparency publications on gov.uk are used. The suggested wording in privacy notices used by schools says: “The law allows the Department (for Education) to share pupils’ personal data with certain third parties, including: …organisations connected with promoting the education or wellbeing of children in England.”
The details of data shared (i.e. the transparency publication) referred to in the box above are referenced in the suggested wording for privacy notices.

Public understanding:
Reasonable. DfE argues citizens are likely to understand that DfE holds something like NPD, and can find reasonable information online if interested, with clear links from privacy notices. Detailed knowledge of usage is very likely to be limited. Media reports suggest people are generally unaware of the dataset and that it is shared externally.
Control Limited scope for consent.

11.3 Case Study 2 - Sharing HMRC Data

HMRC possesses a large range of information about different individuals, organisations and bodies. This information may be useful to other Government Departments, agencies and non-government bodies. It can help them to carry out their functions more effectively or undertake work on behalf of HMRC. Parliament may then legislate to allow sharing of confidential information to such bodies. This is known as a ‘legal’ or ‘information’ gateway.

Legal power to share

Section 18(3) of the Commissioners for Revenue and Customs Act 2005 states HMRC’s duty of confidentiality is ‘…subject to any other enactment permitting disclosure.’ There are around 250 such gateways. The procedure to take when disclosing information through a legal gateway is often outlined in jointly agreed documents, e.g. memorandum of understanding or code of practice.

Governance & Transparency

New gateways require parliamentary approval (and scrutiny). However, HMRC is able to commission third parties to work on its behalf and in this case, parliamentary approval is not required.

The legal gateways are listed on gov.uk.

Sharing HMRC Data (examples)

Value creation (service delivery): Support service delivery across Government by sharing with other Government Departments, e.g. DWP’s delivery of Universal Credit is heavily dependent on the information provided by HMRC from Real Time Information on claimants’ pay and tax details.
Value creation (research): Make anonymised data available for research purposes (via the HMRC Datalab). The Datalab is open to researchers from a UK academic institution or government department. HMRC also considers applications from not-for-profit organisations, such as independent research bodies, and now also accepts applications from commercial research groups, but only when commissioned by a Government Department, e.g. data used by the Institute for Fiscal Studies to support research into the characteristics and incomes of business owners.
Value creation (product development): Leverage the expertise of non-government bodies to support Government policies, e.g. sharing data on VAT-registered businesses with accredited credit reference agencies.

Public Concern

Public concerns about undermining the principles of tax data being confidential and not being shared outside of HMRC were expressed when proposals to share with third parties, including credit reference agencies, were unveiled.

In 2007 HMRC admitted to losing computer discs containing details of 25m child benefit recipients. The data breach generated significant media attention.

Use Case - Sharing HMRC VAT Data with Credit Reference Agencies

HMRC shares non-financial VAT registration information with credit reference agencies and financial institutions to promote economic growth through improved credit scoring.

  Approach
Objective HMRC shares non-financial VAT registration information with credit reference agencies and financial institutions to promote economic growth through improved credit scoring.
Data may be used for the purpose of credit scoring, anti-fraud checking, and compliance with other financial regulations.
Timeline It took 5 years from the proposals to share VAT data being published to the first data share taking place.
The Government consulted on proposals for VAT data sharing as part of the HMRC ‘Sharing and publishing data for public benefit’ consultation published on 17 July 2013. The Small Business, Enterprise and Employment Act, enacting the proposals, received royal assent on 25 March 2015. In December 2017, HMRC published guidance for organisations wishing to apply to receive the data.
Prior to the consultation HMRC engaged with the Business Information Providers Association (BIPA) representing the credit reference agencies (CRAs) that hold information on businesses to get their assessment of the benefits which might result from release of VAT registration data. HMRC officials concluded that modelling using actual data would provide a significantly more reliable assessment of the potential impact and would be helpful in informing the debate and advice to Ministers. It was decided to run a research project in collaboration with certain CRAs who were members of BIPA, as these organisations held the necessary models. Data was shared with these CRAs (who were commissioned by HMRC).
Key players HMRC and accredited credit reference agencies (e.g. Equifax and Experian).
Identifiable information Data includes VAT registration number and contact information (as well as trading name) of 2.1m traders (on the VAT register).
Mechanism used for sharing the data HMRC Secure Data Exchange Services (protected by encryption).
This is now part of HMRC’s business as usual data sharing. Credit reference agencies receive a weekly file containing the VAT register and they can compare this with the previous week’s to track changes.
Ethical considerations Strike a balance between the risks of sharing sensitive data with commercial organisations and the potential to open up access to credit and support a service not provided by government.
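The weekly-file arrangement described above lends itself to a simple snapshot comparison. A hedged sketch, assuming the register is keyed by VAT registration number; the record contents and file format here are invented for illustration, not HMRC's actual schema:

```python
# Compare two weekly register snapshots, keyed by VAT registration number,
# to surface new registrations, deregistrations and changed details.
# Register numbers and trading names below are invented.
def diff_registers(previous: dict, current: dict) -> dict:
    added = {k: v for k, v in current.items() if k not in previous}
    removed = {k: v for k, v in previous.items() if k not in current}
    changed = {k: (previous[k], current[k])
               for k in previous.keys() & current.keys()
               if previous[k] != current[k]}
    return {"added": added, "removed": removed, "changed": changed}

last_week = {"GB111": "Acme Ltd", "GB222": "Widgets Ltd"}
this_week = {"GB111": "Acme Trading Ltd", "GB333": "NewCo Ltd"}
weekly_changes = diff_registers(last_week, this_week)
```

This mirrors the described workflow: each recipient keeps the previous week's file and derives additions, removals and amendments rather than re-processing the full register.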

Barriers encountered and how they were addressed

Technical Initial teething problems with the data exchange appear to have been resolved.
Financial Credit Reference Agencies must pay an annual subscription fee (around £10k) to meet the costs of development and maintaining the data sharing arrangement.
Cultural HMRC agreed not to share financial data. However, there was some disagreement around the definition of this. For example, credit reference agencies questioned whether sort-codes could be included in the data share.
Legal HMRC has a contract in place for each CRA which receives data.
It took 5 years from the proposals to share VAT data being published to the first data share taking place. The need for new legal gateways to be approved by Parliament is a significant barrier to the sharing of HMRC data.
Ethical/Public Acceptability There is a long tradition of HMRC data being confidential and not shared with third-parties - particularly those outside of the public sector.
The proposal to share VAT information with third parties was debated in Parliament, which provided added external scrutiny, tested the use case and ensured democratic accountability.

Addressing Drivers of Trust

Value To individuals (people or businesses):
The data sharing may help small businesses and new startups gain access to credit and finance for the first time.
It may also give increased access to credit and finance to established VAT-registered businesses. HMRC’s two-year evaluation will explore this further.

To society:
Economic growth through improved credit scoring.
Security Technology:
The security management systems of the organisation receiving the data need to be accredited to ISO/IEC 27001 standards and certified by an independent accreditation body. This accreditation needs to be renewed annually.

Frameworks:
HMRC will only agree to the sharing of specified data items, not the onward sharing of the full VAT registration data, except where HMRC is satisfied that the third party is a contractor carrying out a process to support the use of the data by the original applicant.

Anonymisation:
The data set includes information about businesses (including registration number and name of business).

Sanctions:
There are sanctions for data misuse, including a criminal sanction for unlawful disclosure of information relating to identifiable businesses.
Accountability Organisations must complete an application process which takes up to 45 days to process. The application process may include a site visit made by HMRC.
The original 5 CRAs receiving the data must agree to participate in a review after the initial 2 year period to evaluate the economic benefits of sharing non-financial VAT registration data.
Transparency Details of data shared:
HMRC publishes a list of its legal data sharing gateways on its website.

Data subjects:
Subjects may review the privacy notice, which explains HMRC’s legal rights to share data (including with credit reference agencies).

Public understanding:
There was a small amount of media coverage generated when the data sharing agreement was implemented.
Control Having data shared with credit reference agencies is now a consequence of being VAT registered.
An opt-out was not considered necessary given the safeguards put in place to protect the data.

11.4 Case Study 3 - Understanding the impact of the Troubled Families Programme

In June 2013, the Government announced plans to expand the Troubled Families Programme for a further five years from 2015/16 and to reach up to an additional 400,000 families across England at a cost of £920 million.

Evaluation of the Programme

A national evaluation of the expanded Troubled Families programme was commissioned in 2015. The aims of the national evaluation have been:

  1. To assess the level and form of service transformation driven by the Programme
  2. To assess the impact of the Programme on the lives of participating families
  3. To assess how the family intervention approach achieves positive changes for families
  4. To assess the fiscal, social and economic benefits resulting from the Programme

As part of aim 2, the impact evaluation, data was collected from nationally held administrative datasets. This data share is the focus of this case study.

Data Collection

The Ministry of Housing, Communities and Local Government (MHCLG) gathered data from upper tier local authorities, as well as across a range of central government departments.

Personal identifiers, such as names and dates of birth, are provided by local authorities to a trusted third party, the Office for National Statistics (ONS). The personal identifiers are matched to: offending data held by the Ministry of Justice (Police National Computer (PNC)); Department for Work and Pensions/Her Majesty’s Revenue and Customs employment and benefit data (Work and Pensions Longitudinal Study (WPLS), Single Housing Benefit Extract (SHBE) and Universal Credit (UC)); and Department for Education data on school absence, academic achievement and Children in Need (National Pupil Database (NPD)).

The data includes all families eligible for the Programme, to provide both a Programme Group and a Comparison Group. This is to allow the analysts at MHCLG to estimate the impact of the Programme.
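The trusted-third-party linkage pattern described above — identifiers go to ONS, and departments return only pseudonymised attributes — can be sketched as follows. This is a simplified illustration: all identifiers, field names and matching keys are invented, and real matching uses each department's own matching algorithms rather than exact key equality.

```python
# Simplified sketch of linkage via a trusted third party (ONS).
# Step 1: ONS replaces local authority IDs with its own pseudonymous IDs.
# Step 2: a department matches on identifiers, builds a look-up table from
#         its internal ID to the ONS ID, and returns attributes keyed only
#         by the pseudonymous ONS ID (no direct identifiers flow back).
import itertools

_ons_id_sequence = itertools.count(1)

def assign_ons_ids(la_records):
    """Trusted third party step: swap LA IDs for ONS pseudonymous IDs."""
    return [{**{k: v for k, v in record.items() if k != "la_id"},
             "ons_id": f"ONS-{next(_ons_id_sequence):06d}"}
            for record in la_records]

def department_match(ons_records, internal_index, attributes):
    """Department step: match, build a look-up table, return pseudonymised data."""
    lookup = {}  # internal identifier -> ONS ID
    for record in ons_records:
        internal_id = internal_index.get((record["name"], record["dob"]))
        if internal_id is not None:
            lookup[internal_id] = record["ons_id"]
    return [{"ons_id": ons_id, "attribute": attributes[internal_id]}
            for internal_id, ons_id in lookup.items()]

# Example flow with invented records:
la_data = [{"la_id": "LA1-0001", "name": "A Person", "dob": "2001-01-01"}]
ons_data = assign_ons_ids(la_data)
dept_index = {("A Person", "2001-01-01"): "NI-12345"}  # department's own key
dept_attributes = {"NI-12345": "benefit record"}
matched = department_match(ons_data, dept_index, dept_attributes)
```

The design point is separation of knowledge: ONS sees identifiers but no attribute data, each department sees identifiers only transiently for matching, and the analysts at MHCLG receive attributes keyed solely by the ONS pseudonym.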

Gathering data for the National Impact Study

(source: MHCLG)

Use Case - Estimating the Impact of the Troubled Families Programme

  Approach
Objective An impact evaluation led by the Ministry of Housing, Communities and Local Government (MHCLG) of the Troubled Families programme by gathering data about those in the programme and a comparison group.
The evaluation exercise includes the sharing of data for research and analysis purposes only.
Timeline Data sharing agreements took two years to put in place. Data is now shared regularly as the evaluation will continue until Autumn 2020.
Key players MHCLG led the evaluation, but commissioned ONS as a third party intermediary. Data was collected from 149 local authorities as well as four central government departments.
It was estimated that the records of 1.5m individuals would be included in the evaluation dataset.
Identifiable information Identifiable information is provided by local authorities to ONS, including name, address, date of birth and gender (i.e. the minimum required for matching by each government department), along with local authority pseudonymised individual and family unique identifiers (LA IDs). ONS replaces the LA IDs with its own individual and family level pseudonymised unique identifiers (ONS IDs). This dataset (ONS IDs with identifiable information) is shared with other government departments, which then match each record to information held by their department and create a correspondence between their unique identifier (e.g. National Insurance number) and the ONS ID (i.e. a ‘look-up table’). This look-up table is used by a second team within the department to find relevant attribute data they hold (e.g. benefit records). The department returns the matched data, with only the pseudonymised ONS ID, to ONS.
Mechanism used for sharing the data Common law powers were relied upon for MHCLG to share the data with other government departments.
MHCLG had to put separate data sharing agreements in place with each Department and agreements between MHCLG and each of the 149 local authorities.
ONS was commissioned as a third party intermediary which gathered the data and shared the final (de-identified) dataset with MHCLG for analysis.
MHCLG plan to continue using this mechanism as it is not clear that the Digital Economy Act would make the sharing process any more straightforward. In particular, it seems that the researchers using the data (in MHCLG) would need to be accredited, and to gain approval from the review board.
Ethical considerations Linking personal data of people within the programme as well as those not included without individual consent (and the data covers whole families rather than individuals). The evaluation team discussed the use of privacy notices with the Information Commissioner’s Office to ensure the approach met the requirements of the Data Protection Legislation. Relying on individual informed consent could be problematic as there is an imbalance of power between the local authority (offering a service) to a vulnerable group (families with multiple complex issues). However, if families do object, local authorities are expected to remove their data at source.

Barriers encountered and how they were addressed

Technical Good matches with nationally held administrative data are dependent on the quality of the personal data supplied by local authorities.
Each government department uses a different methodology for matching the data (their own matching algorithm). This may result in differing match rates.
Robust measures required for data security (transfer, storage and data handling). Secure transfer is different for each party.
Local Authorities were provided with funding to employ data analysts as data collection was considered important for the programme as a whole (some LAs use predictive analytics on their data).
MHCLG Commissioned ONS as a third party intermediary to produce the de-identified dataset and provide stakeholders with reassurance that adequate security measures were in place.
Financial Government departments carry out this work on behalf of MHCLG and do not charge; they have an interest in the results as well. It is a cross-government initiative.
Cultural Different Departments had different needs.
Each Department needed a separate data sharing agreement. Buy-in of senior colleagues was essential across Departments.
There can also be requirements for separate data sharing agreements for different datasets. For example, at the Ministry of Justice a separate agreement on access to Police National Computer data has been agreed.
Legal Separate data sharing agreements (Memoranda of Understanding) were needed between MHCLG and all other parties (other government departments and LAs).
The project design (use of privacy notices to inform families of how their data was being used) was acceptable to the parties involved, but prevented sharing with Health.
Agreeing data security measures to meet requirements of data protection legislation.
Significant legal expertise was needed to identify legal powers to share data, to advise on how to meet the requirements of the Data Protection Legislation.
MHCLG rely on LAs to issue privacy notices to inform families of how their data is being used, rather than relying on informed consent. The project design and data security measures were acceptable to most government departments, but hindered access to health data. MHCLG agreed a Direction with the Department of Health to allow for the matching of data for those on the Programme only.
Ethical/Public acceptability Obtaining informed consent from families was considered. In some cases local authorities can talk directly to families, but it was considered too burdensome and costly to gain consent from every family, particularly those not currently in touch with services.
Privacy notices have therefore been used to inform families - displayed on LA websites and public noticeboards.
Where appropriate information about the evaluation has been incorporated into privacy notices/consent forms signed by families joining the Programme.
MHCLG tested out the wording of privacy notices with a small group of families and took on board feedback (as to whether the project and the use of privacy notices were acceptable, appropriate language and how detailed they should be).
The opt-out was removed post-GDPR because offering one is equivalent to offering consent to take part. People are not given the option to opt out, but in practice if somebody objects to their local authority their details will be removed from the next data submission.
MHCLG work with de-identified data to conduct the analysis.

Addressing Drivers of Trust

Value To individuals:
In the longer term, individuals may benefit by being part of programmes grounded in a sound evidence base.

To society:
To provide both central government departments and local public services with valuable information about the impact and cost benefit of their investment in the Troubled Families Programme and to inform future service reform and investment decisions.

To others:
MHCLG were hoping to provide a regular flow of information to local authorities, but the quality of some of the data, as well as time lags in some datasets, has hindered this work. Instead the findings are published regularly and have been fed back in discussions with local authorities with the aim of improving practice.
Security Technology:
Measures to mitigate the risk of data loss or data breach, based on ICO requirements, are in place and agreed with ONS, local authorities and each Government Department involved in the data share.

Frameworks:
LAs have been issued with guidance by the Troubled Families Team at MHCLG around standards for handling, storing and transferring the data. These are contained in the data sharing agreements.

Anonymisation:
Identifiable information was gathered from local authorities and shared with other government departments. The main dataset used by researchers contained only de-identified data, including pseudonymised unique identifiers.
Accountability A Data Protection Impact Assessment sets out the project design, the data being collected and linked. It also sets out the legal powers and how the project meets each principle of the Data Protection Act/GDPR.
Transparency Publicly available details of the data sharing exercise:
The results of the evaluation have been presented to conferences.
Privacy notices outlining the details of the projects are publicly accessible.

The right to be informed:
Privacy notices are issued by local authorities. MHCLG also issued a privacy notice. It is not clear how many people read these.

Public understanding:
Low public knowledge of the evaluation can be assumed. Little public engagement or media activity was undertaken.
Control Individuals are not explicitly given the option to opt-out from having data about them shared as the processing is deemed necessary for the performance of a public interest task.

11.5 Case Study 4 - Sharing Medical Records

We (EY) estimate that the 55 million patient records held by the NHS today may have an indicative market value of several billion pounds to a commercial organisation. We estimate also that the value of the curated NHS dataset could be as much as £5bn per annum and deliver around £4.6bn of benefit to patients per annum, in potential operational savings for the NHS, enhanced patient outcomes and generation of wider economic benefits to the UK.

Source: Pamela Spence, EY Global Health Sciences and Wellness Industry Leader and Life Sciences Industry Leader

Individuals have a right to access their own health records, and in limited circumstances, access to the records of other people. The Government has made a commitment that patients should gain access to their health records within 21 days following a request.

The Government has encouraged the NHS to make better use of technology, so that patients can manage their own healthcare needs, whilst ensuring that data remains safe at all times. It has also committed to making all patient and care records digital, real-time and interoperable by 2020.

Patients also have the right to request that their confidential information is not used beyond their own care and treatment, to have their objections considered, and, where their wishes cannot be followed, to be told the reasons including the legal basis.

On 25 May 2018, NHS Digital launched the national data opt-out programme, a tool that allows patients to choose to opt out of their data being shared outside of the NHS.

Governance & Transparency

NHS Digital (the Health and Social Care Information Centre) are the guardians of patient data, making sure it is protected and handled securely. It is their role to ensure data is only used for the good of health and care, and that patient data is always protected. NHS Digital also provides advice to the health and care sector on keeping data safe.

In December 2018 the Health and Social Care (National Data Guardian) Act 2018 was passed. The law placed the National Data Guardian for Health and Social Care (NDG) role on a statutory footing and granted it the power to issue official guidance about the processing of health and adult social care data in England. Public bodies such as hospitals, GPs, care homes, planners and commissioners of services are expected to take note of guidance that is relevant to them, as are organisations such as private companies or charities which are delivering services for the NHS or publicly funded adult social care. The NDG may also provide more informal advice about the processing of health and adult social care data in England.

Sharing medical records

Value creation (research) A study looking at GP records in the Clinical Practice Research Datalink has helped to conclusively show that there is no evidence of a link between the MMR (measles, mumps and rubella) vaccination and autism.
GP practices opt to be involved in the CPRD. Data is de-personalised before researchers are given access.
Value creation (service delivery) Information from individual GP records can be shared electronically with hospitals so that they can provide better care.
Summary Care Records (SCR) are an electronic record of important patient information, created from GP medical records. They can be seen and used by authorised staff in other areas of the health and care system involved in the patient’s direct care.
Value creation (product development) Medical information can be used to develop new AI technology to improve diagnosis and treatment.
Working in collaboration with clinicians at Moorfields Hospital, DeepMind developed AI technology which can automatically detect eye conditions in seconds and prioritise those patients in urgent need of care, matching the accuracy of expert doctors with over 20 years’ experience.

Public concerns

Patient-doctor confidentiality is one of the cornerstones of medical practice.

From 25 May 2018 patients have been able to choose to stop their confidential information being used for purposes other than their own care and treatment. This choice is known as the national data opt-out.

A Memorandum of Understanding agreed with the Home Office in 2017 provoked a backlash. It allowed NHS Digital to pass information about patients to the Home Office, where the individual was suspected of an immigration offence (only demographic and administrative details were shared). The Government withdrew the MoU in May 2018.

GP patient data “is an incredibly rich source of intelligence that can inform high-quality medical research, and help with planning NHS services, both of which can ultimately benefit patient care… But patients must be comfortable with their data being used in this way, and confident that it will be used appropriately.”

Source: Professor Helen Stokes-Lampard, Chair, Royal College of GPs

Use Case #1 - NHS (Royal Free) Sharing Data with DeepMind (Google)

  Approach
Objective The Royal Free started a project with DeepMind to help detect acute kidney injury.
Timeline In November 2016, Royal Free and DeepMind announced a 5 year partnership.
The Streams app was rolled out at the Royal Free in early 2017.
In July 2019, a service evaluation of the app showed that patient care can be improved, and health care costs reduced, through the use of the app.
In September 2019, the DeepMind Health team transitioned over to Google Health.
Key players Royal Free London NHS Foundation Trust (Data Controller)
DeepMind Technologies Limited (Data Processor) - the DeepMind Health team transitioned over to Google Health in September 2019
NHS patients of the Royal Free London NHS Foundation Trust.
Identifiable information To provide Streams, DeepMind Health (now Google Health) processes approximately 1.7m patient records for the Royal Free in line with the Data Controller’s instructions. Personal information, including name and DOB, is held separately from all other data.
Mechanism used for sharing the data The data-sharing process was first governed by a partnership agreement signed between DeepMind and the Royal Free London NHS Foundation Trust in September 2015, which was since superseded by a Services Agreement and an Information Processing Agreement signed in November 2016.
Ethical considerations Tension between sharing personal medical data with a commercial company and the development of new data driven technology which improves patient care.

Barriers Encountered and how they were addressed

Technical Streams is dependent on the real time feed from the Royal Free London integration engine to provide timely alerts and patient results.
The Streams app is dependent on WiFi or mobile data signal.
Financial None identified.
Cultural The partnership proved controversial to some privacy campaigners and attracted significant media attention, not least because of DeepMind’s connection with Alphabet.
Legal Royal Free originally took the view that this use of patient data was part of direct care and so explicit consent was not needed.
The ICO was not satisfied that there was a legal basis for this use of patient data to develop the app and ruled in July 2017 that the NHS Trust had failed to comply with the Data Protection Act, raising concerns about how much patients knew about what was happening. The ICO issued the Royal Free with an undertaking, requiring the Royal Free to take certain remedial steps that included commissioning an independent report into Streams to address its concerns.
In 2016 the two organisations agreed, and published, a more comprehensive contract (now used as a model for data sharing with other organisations). The Royal Free also took steps to update its website and communications materials to provide its patients with more information concerning Streams and the relationship with DeepMind.
In 2018 an independent report confirmed that the Royal Free had stayed in control of all patient data, with DeepMind confined to the role of “data processor” and acting on the Trust’s instructions throughout, and that DeepMind had only ever used patient data for the purposes of delivering Streams. No issues have been raised about the safety or security of the data.
In July 2019, the Information Commissioner’s Office recognised that the Royal Free had completed all actions required in the undertaking and that there were “no further outstanding concerns regarding the current processing of personal data within Streams”.
Ethical/Public Acceptability DeepMind admitted it was a mistake not to publicise its work when it first began working with the Royal Free in 2015. It went on to proactively announce and make available to the public the contracts for subsequent NHS partnerships.
DeepMind also ran a patient engagement strategy as part of its NHS partnerships.

Addressing Drivers of Trust

Value Individual:
The developed app delivers improved care for patients by getting the right data to the right clinician at the right time.
The Streams app was developed by DeepMind in close collaboration with clinicians at the Royal Free London NHS Foundation Trust, using mobile technology to automatically review test results in order to determine whether a patient is at risk of developing, or has developed, Acute Kidney Injury.
If a risk is detected, Streams sends a secure smartphone alert to a clinician, along with information about that patient’s previous conditions so they can make an immediate diagnosis.
Streams puts the data securely into the clinician’s hands rather than keeping it on desktops in silos, thereby allowing it to be viewed on the move so that clinical intervention can take place earlier with the hope of minimising patient deterioration.
The information on the Streams App is presented graphically which allows better engagement with the patient who can see their own results vary over time.
In July 2019, the results from a peer-reviewed service evaluation of the app at the Royal Free London NHS Foundation Trust were published in Nature Medicine and JMIR. The findings showed that the app improved outcomes, reduced costs and enhanced both clinician and patient experience.

Community:
Patient information is contained in one place – on a mobile application. This reduces the administrative burden on staff, saving them the time of logging onto multiple desktop systems, and means they can dedicate more time to delivering direct patient care.

Others:
The app improves NHS efficiency. Consultants can use the Streams app when on duty and on-call, including when on-call from home. This allows them to deal with issues that arise when they are not in the hospital as they have the relevant information available to them. It also reduces the risk of clinicians sharing information by other communication channels, such as text message or WhatsApp.
Security The independent review into the partnership, by Linklaters, concluded: “The combination of the access controls to the iPhone itself (particularly the use of a PIN code, encryption and the use of AirWatch) together with the access controls to the Streams App (such as the need to log into Streams using their normal username and password and the timeout on the App) provides a significant degree of protection to the information accessible via the Streams App.”
Accountability Security and access controls are in place to ensure that data is only accessed by a limited number of verified staff who have a legitimate reason for accessing the data. All access to this data is logged and is auditable.
Transparency In 2017, the ICO concluded that Royal Free London NHS Foundation Trust had not done enough to inform patients that their information was being processed by DeepMind during the testing phase of the app.
Following the investigation, the Royal Free undertook to provide patients and the public with more information. This includes a detailed section and Q&A on the Royal Free website and an animated film which explains what happens to their information. The Trust also displayed guidance about how patients can opt out if they do not want their information to be shared with third parties. Patient information leaflets which answer common questions about how information is used and shared have also been distributed across hospitals.
Control Patients have the right to opt out of having their data processed via Streams, but are informed that this may affect the quality and safety of the care they receive.

Use Case #2: Patient Access

Background

Under GDPR, patients have a right to make a subject access request (SAR) to access their GP record. Such a request must be free of charge and processed within a month. However, practices may be able to comply with a SAR by offering to provide a patient with online access to their health records, where available.

Since 2015, NHS patients have been able to sign-up to GP online services and use a website or app to:

  • book or cancel appointments online with a GP or nurse
  • order repeat prescriptions online
  • view parts of their GP record, including information about medication, allergies, vaccinations, previous illnesses and test results
  • view clinical correspondence such as hospital discharge summaries, outpatient appointment letters and referral letters

The service must be available for free to anyone who is registered with a GP practice. However, the services available to patients depend on the system used by their GP practices. As data controllers, GPs also decide how much of the record may be accessed. One of the largest systems is Patient Access (provided by EMIS).

  Approach
Objective Make it easier for patients to view and share their GP medical records with other players in the integrated health system e.g. local authorities, charities providing care services, physiotherapists etc.
Timeline Since 2015, all NHS patients have been able to sign-up to GP online services.
Key players GP practices are the data controllers
Patients (8 million patients use Patient Access)
Non-NHS care providers
EMIS Group plc (a provider of clinical software systems and services to the primary, community and secondary care sectors).
Identifiable information Yes, GP records containing identifiers and sensitive medical information.
Mechanism used for sharing the data In many cases an app is available enabling patients to view their record (sometimes only limited elements) and share this with others.

Barriers encountered and how they were addressed

Technical Different practices use different systems which can create challenges around interoperability.
Financial The Patient Access software requires patients to prove their identity by visiting their GP Practice to obtain an access code. Other software providers use online digital checking services, but this comes at a cost per patient (paid by the software provider).
Cultural Medical records are viewed as being particularly sensitive, but have also been treated as records that are owned by clinicians, rather than information which individual patients can control.
Legal GPs are the data controllers and can therefore restrict how much of the patient record can be viewed and shared.
As data controllers, GPs make decisions about which organisations access the patient records they hold. GP practices are expected to consider which organisations in the local health community are providing care to their patients and whether information sharing would improve the delivery of care, and is necessary for direct care.
Ethical/public acceptability Patients may not have the knowledge to interpret the records correctly which may cause unnecessary anxiety or misunderstanding.

Addressing Drivers of Trust

Value Individual:
Patients can make appointments via the app and share parts of their GP record with other providers to support their care and treatment.

Community:
Providing patients with their GP records through mechanisms like Subject Access Requests can be bureaucratic and costly. The digital system makes this easier and also provides a clear audit trail.
Security None identified.
Accountability GP practices are the data controllers and are responsible for the data shared.
Transparency Details of data shared:
GP practices can control how much of a patient record individual patients are able to view and share.

Data subjects:
Patients can track and control who is able to see their medical record.

Public understanding:
Patients need to sign-up to use the digital service.
Control By providing patients with more direct control over the sharing of their medical information, individuals may be empowered and better placed to give explicit consent, as an alternative to implied consent.
The GP is the data controller and can decide on the level of information to be made available at an individual level.
Some GPs have made the decision to turn on some elements of records (summary care, GP consultation summaries but no correspondence, for instance) for all patients and not currently to provide any more detail.
Some GPs provide this partial access to all patients but will provide further information at patient request and after review of the record.
A few GPs have made the decision to provide complete access for all.
There is also the option of Proxy Access (currently being rebranded as Family and Carer access). This allows a third party to act on behalf of a patient (with their express permission, confirmed by the practice, if they have capacity to give consent).
For GP practices using such software, patient records are held in a data centre. Patients cannot opt out of having their data held in the data centre; if a practice agrees to stream data this applies to all records. However, having patient records stored in the data centre does not automatically mean that other organisations can view them; by default organisations should only be able to see their own records even though they are held in the same data centre.

12. Appendix 2: History of Government Initiatives

Efforts to drive more data sharing are not new. Neither are concerns about privacy harms. By exploring previous attempts to increase sharing, we can put the current situation in context and understand how past initiatives have attempted to move this debate forward.

This is a history of general public sector approaches to data, and does not aim to cover sector-specific data policy, such as within health, tax, or education.

12.1 Modernising Government (1999-2005)

In 1997, the new government entered office with a plan for reform of public services to take advantage of the “information age”. The focus was on the creation of integrated user-focused public services, all accessible digitally within the following decade.

Modernising Government 1999

When it came to the role of government data, the Modernising Government white paper in 1999 called out risk aversion and organisational siloing, and pointed to the creation of data standards and the need for a clear legal basis for sharing data between departments. At the same time, the government was aware of potential privacy issues with this new initiative, highlighting that all data sharing should be transparent, that privacy enhancing technologies would be used, and the Data Protection Registrar (the forerunner of the ICO) would have powers to assess all these government systems.

Privacy and Data Sharing 2002

The Privacy and Data Sharing report by the Cabinet Office’s Performance and Innovation Unit expanded on this vision of data-sharing as a way to realise more joined-up, targeted and efficient government services, and highlighted specific missed opportunities. The examples discussed included identifying children at risk, giving individuals access to health records, and sharing legal aid eligibility information to the Ministry of Justice from the Department of Work and Pensions. The report surveyed civil servants to understand their experiences of barriers to sharing and found both legal uncertainty and technical issues as the primary barriers, combined with a lack of funding.

The most substantive recommendation was for primary legislation that would allow public bodies to share any data with individual consent, and to share without consent using secondary legislation with safeguards and parliamentary scrutiny. The report also proposed a “Chief Knowledge Officer” in departments and Local Authorities to look beyond legal compliance and plan the sharing and use of data strategically.

The report also highlighted the necessity of public trust and adequate safeguards to achieve these goals. In response the Public Services Trust Charter was drafted, which set guidelines for transparency and consent wherever possible in public sector data sharing. It was proposed that services following the charter could have a kite-mark.

Department for Constitutional Affairs (DCA) 2003-2005

The responsibility for this work fell to the Department for Constitutional Affairs due to its responsibility for Data Protection and Freedom of Information legislation. DCA ran two exercises: first, a public opinion survey to investigate whether individuals understood what was happening with their data; second, a legal analysis to understand the legislation that was necessary to enable this data sharing.

From its public survey, DCA found that most people were unaware of which data was collected; where they were aware, they were concerned that they did not know what it was used for and that they had no control over it. However, when presented with specific data sharing scenarios, the majority of people felt the sharing was acceptable.

Through its legal analysis, DCA discovered that the legal barrier to data sharing was often one of perception, stemming from a misunderstanding of what was covered by the Data Protection Act. DCA took the view that specific legislation was not needed to enable sharing with consent. However, it still planned to draft a data sharing bill to give Secretaries of State a general power to set up legal gateways for data sharing via secondary legislation when they were sharing without consent.

Such a bill never progressed. Instead, DCA’s work in this area shifted to focusing on the Public Service Guarantee, and an associated toolkit, giving the public sector and individuals more clarity on the existing data sharing rules.

12.2 Transformational Government (2005-2010)

Data sharing became a focus again as part of the “Transformational Government: Enabled by technology” agenda after the 2005 election. The strategy, owned by HM Treasury, aimed to consolidate the mass of IT projects and websites procured in the Modernising Government era. The focus was on people-centred design, shared services, and increasing recruitment of IT professionals into government.

As part of this strategy, a Data Sharing Ministerial Committee (MISC 31) was set up. It released a position statement that “information will normally be shared in the public sector, provided it is in the public interest”. This statement prompted some backlash from interest groups and legal experts, as it was seen as a move away from a policy where data sharing required specific justification.

In 2006 the Committee released an Information Sharing Vision Statement, focusing heavily on the potential of data sharing, from protecting vulnerable people to fighting crime. The statement also discussed trust, but solely in terms of confirming that any project would need to be in accordance with the Data Protection Act and work closely with the ICO.

The timing of these changes coincided with other centralised data projects, such as the planned development of the National Identity Register designed to underpin Identity Cards, creating a single source of truth for citizens’ personal details within government. These projects provoked significant public and political backlash and did not progress beyond 2010.

Data Sharing Review 2008

In 2007, Prime Minister Gordon Brown made a speech promoting a number of policy proposals on various individual rights issues. One of the measures announced in this speech was an independent review of how data is governed, specifically the sharing of data across government. Soon after the speech, HMRC lost two discs of personal data covering 25 million citizens, leading to increased scrutiny of government data handling and framing this policy work.

This Data Sharing review was led by the Information Commissioner, Richard Thomas, and the Director of the Wellcome Trust, Dr Mark Walport, and became known as the Thomas-Walport review.

While the report found some evidence of legal barriers, it, like the DCA before it, saw other barriers as more important. In the words of the report:

There were few specific examples of situations where essential data sharing was being prevented by the legal framework… The barriers, therefore, are most often cultural or institutional – an aversion to risk, a lack of funds or proper IT, poor legal advice, an unwillingness to put the required safeguards in place or to seek people’s consent.

Source: Data Sharing Review Report, p. 46

To address ambiguity and risk-aversion, the report called for a statutory Data Sharing code of practice from the ICO. It also pressed for greater accountability within organisations through the use of Privacy Impact Assessments, as well as increased transparency through better privacy notices, increased individual access to data, and explicit consent wherever possible.

Despite highlighting cultural barriers to sharing, the paper proposed major legislative change for a fast-track data sharing power. This would allow Secretaries of State to use secondary legislation to amend or repeal primary legislation or create a new power when it was necessary to unblock data sharing.

The report also called for the creation of ‘safe havens’ for academic research, where data is de-identified as much as possible, access is available only to approved researchers, and there would be heavy fines for misuse. In its response, the government pointed to a number of systems, including the ONS Secure Data Service, which were already in place or actively in development.

Coroners and Justice Act 2009

The recommendations for fast-track data sharing were originally included as part of the Coroners and Justice Act 2009 through an amendment of the Data Protection Act. However, the provisions faced opposition from civil society groups, including several medical professional bodies worried about the sharing of NHS data.

The legislation was also sharply criticised by the Joint Committee on Human Rights, which highlighted how the powers were very broad, and no safeguards were included except an implicit understanding that usage would be compatible with the Human Rights Act.

These items were subsequently dropped from the legislation. The only measure from the Thomas-Walport report that was included in the final Act was the requirement for the ICO to create a Data Sharing code of practice, which was subsequently published in 2011.

12.3 Government ICT Strategy (2010-present)

Under the new coalition government, there were calls for a “revolutionary” shift in how government IT functioned.

Government Digital Service 2010 (GDS)

The Government Digital Service was set up as a way to organise IT projects across government and move away from siloed legacy systems. Overall, the goal of GDS was to create “government as a platform”, and data was an important part of the central standards needed to drive this. It was felt that custom-made IT systems made it impossible to share data, and in some cases even left rights over that data in the hands of private companies.

This policy meant that data standards were included in GDS guidance on procurement of new IT systems. As well as this guidance, GDS created a data register in an attempt to standardise how departments referred to the same information by creating central core datasets of non-personal information, for instance a list of countries, that are then accessible to be referenced across departments.

When it came to personal data, GDS did not look to centralise management as the government was no longer proceeding with the creation of a National Identity database. Instead they started development on a distributed system for verifying an individual’s identity across government departments using trusted third parties, known as GOV.UK Verify.

Case Study: Document Checking Service

GOV.UK Verify is a standardised service which verifies an individual’s identity to access government services, for instance when applying for Universal Credit or a state pension, accessing a personal tax account, or viewing driving licence information. The service relies on a handful of certified third-parties—either dedicated companies, or services run by banks or the Post Office—in order to overcome data sharing restrictions within government and avoid a centralised database. Instead, these third parties each use different sources to verify an identity, choosing between mobile phone providers, credit agencies, or a UK driving licence or passport, and the government department only receives the verification notice.

Verify aimed to create a single identity assurance system across government, but so far it has not achieved its ambitions. Few government departments have committed to the programme, and it has achieved less than one-sixth of the usage originally predicted. The Public Accounts Select Committee characterised it as a failure of the Cabinet Office and GDS to get sufficient buy-in from departments to build a product that fit their needs.

Open Data White Paper 2012

GOV.UK Verify was part of a broader push in data strategy towards decentralised management of personal data. In the vein of the ‘Big Society’ agenda, the focus of data policy moved from creating joined-up government work to instead enabling the private sector to innovate using valuable datasets controlled by the government. In the Government ICT Strategy 2011, data was referenced as a way to “encourage businesses and social providers to develop new market opportunities.”

A prominent intervention was the “Open Data: Releasing the potential” White Paper in 2012. The priority in the paper was the use of data to hold government accountable, but it also touched on the need for individuals to have access to their own data to increase citizen choice and drive innovation.

While there was a renewed focus on making data accessible to citizens and private organisations, some proposals on government use of data were continued from the Thomas-Walport review. The paper reiterated the need for personal data to be used for research through ‘safe havens’, and recognised administrative, legal and cultural barriers to inter-departmental sharing. In these areas the paper awaited a report from the new Administrative Data Taskforce to look into how data sharing could increase efficiency in public services.

Improving access 2012

The Administrative Data Taskforce released its final report on how personal data can be used for research and evidence-based policy in the UK.

It called for secure data centres to be set up for nationally accredited researchers, and the creation of a new generic legal gateway for sharing public sector administrative data for research purposes, based on the proposals in the Thomas-Walport review.

Shakespeare review 2013

At the same time as the Open Data White Paper was being written and the Administrative Data Taskforce was being set up, the government established a Data Strategy Board chaired by Stephan Shakespeare, the CEO of YouGov. This board was tasked with exploring the economic opportunities of data and advising the Cabinet Office minister on how to create economic growth from public sector data.

Shakespeare ran an independent review of Public Sector Information, including recommendations for how to maximise its value for the economy and take advantage of the next phase of the digital revolution centred on data-driven technology. The goal was to move the government’s attitude to data ‘from a transparency policy, to a growth strategy’.

Key to the review’s recommendations was a focus on citizen ownership of public sector data, and the need to create value for citizens.

midata

midata was an initiative set-up by the government in 2011 to coordinate portable personal data across industry. Its goal was to increase competition and consumer choice, while giving individuals more control and trust in the use of their personal data.

The initial program focused on personal current accounts, credit cards and credit reports, energy, and mobile phone companies. Users could download their details in a standardised format, and upload it to other providers or comparison services.

This voluntary system never became standard practice, owing to limited uptake and a lack of large-scale impact. Eventually the Competition and Markets Authority enforced much deeper interoperability through Open Banking, a more successful model which may be replicated in other regulated industries.

The Shakespeare review also called for broader use of ‘safe havens’ to enable a sandbox of public sector data for external users. The attitude was that the perfect should not be the enemy of the good, and that data should be put into the open, even if imperfect. Instead of relying on strict guidelines for data openness, ethical use should be encouraged by increasing penalties for those misusing data: ‘impose consequences on the burglar not the builder’ (p.14-15).

All of these recommendations fell under a wider call for a National Data Strategy to coordinate the use of public sector data.

Law Commission Report 2014

Based on suggestions from Chief Police Officers, the Law Commission looked at the issue of data sharing in their eleventh programme of law reform, approved by the Lord Chancellor in 2011. The Data Sharing Between Public Bodies report looked at the legal framework for data sharing, and the disparate interpretations and practices based on that framework.

The report agreed with prior reports that strict legal barriers are often not the problem, but said the “unnecessary degree of legal complexity” created confusion and impeded transparency about what was happening. It also led to the creation of unnecessary new gateways as a way to provide certainty for a specific project. The Commission called for a reduction in the number of legal gateways, guided by high-level principles about what it is appropriate to share, rather than specific gateways made for a particular project.

Data Sharing Discussion 2014

To build on this report, the government looked to engage in “open policy making” by publishing a Data Sharing discussion document in 2014 on potential legislation, and by engaging with a number of civil society groups with privacy and civil liberties concerns to hear how they would amend the proposals.

This discussion document focused on three use-cases that the government wanted to unlock: research and statistics through the ONS, tailored public services, and better management of fraud, error and debt.

Through engagement with civil society groups, some safeguards were proposed. The new legislation would only allow data sharing for specified public bodies; there were requirements that public service goals should be to the benefit of the individual and not punitive; and statutory codes of practice were required for every use-case.

Better use of data 2016

This process culminated in proposed legislation included in the Queen’s Speech 2016, alongside a “Better use of data” public consultation on these proposals. The response to the consultation was generally positive, but it highlighted the need for clear accountability and transparency mechanisms, especially in use-cases involving individuals. There were concerns about data being shared with non-public bodies. A number of respondents also called for citizens to be able to control their own data, amend it, and revoke consent at any point.

Digital Economy Act 2017 (DEA)

This legislation was eventually included in the Digital Economy Act 2017, which established new legal gateways for data sharing.

In the DEA, any data sharing project must be pursuing one of four possible purposes:

  1. The improvement, targeting, or monitoring of a public service by specific public bodies, as long as it has an approved objective aimed at improving the wellbeing of an individual or household
  2. The sharing of civil registration data (births, marriages and deaths) for the use of other public services
  3. To identify or act on fraud, error or debt in public finances
  4. To carry out research that is in the public interest, with projects and researchers approved by the Statistics Board

Each of these purposes is similarly regulated with statutory codes of practice, requirements for data sharing agreements, and a public register of projects.

The initial public service objectives were to identify households struggling with multiple disadvantages and to address fuel and water poverty.

Data Sharing and the Troubled Families Programme

A local authority could seek access to data held by the local police force and schools to identify individuals or households eligible for support under the Troubled Families Programme. The local authority has a lawful power to share data under the Act, as the proposed information sharing is consistent with the multiple disadvantages objective and the bodies it wishes to share data with are listed in Schedule 4. Because the power is permissive, the local authority must still agree with the other bodies to share information for this purpose, draw up an appropriate data sharing agreement, and ensure compliance with the Data Protection Legislation. The consent of citizens is not required.

Data Protection Act 2018 (DPA)

With the introduction of GDPR in 2018, new regulations were enacted for public sector data use, alongside increased public awareness of data protection.

One of the requirements under GDPR is that public authorities/controllers must have a valid legal basis for processing or sharing personal data. Public authorities are often able to rely on the ‘public task’ basis rather than explicit consent, as it covers both carrying out a specific task in the public interest laid down by law and exercising official authority laid down by law. The DEA is one example of a legal power to share data; however, the GDPR still requires that any data sharing is transparent to the data subject and proportionate to a limited purpose.

ICO Code of Conduct for Data Sharing

Misconception: We can only share data with people’s consent.

Reality: Not always. You can usually share without consent if you have a good reason to do so. However, there are some cases where the impact on individuals might override your interests in sharing, in which case you might need to ask for their consent.

You should bear in mind ethical factors in addition to legal and technical considerations when deciding whether to share personal data.

Source: Data Sharing Code of Practice - Draft Code for Consultation (2019)

12.4 National Data Strategy (planned for 2020)

This strategy is being developed by DCMS. It will look across the private and public sectors to see how policy can improve access, efficiency, and trust. The development of the Strategy is focused around the themes of People, the Economy, and Government.

  1. Ipsos MORI, Public attitudes to the use and sharing of their data (2014) 

  2. Deloitte and Reform, The State of the State 2017-2018 

  3. 30% of people trusted the central government to use personal data ethically in a 2019 survey by the Open Data Institute and YouGov. 

  4. OECD, Rebooting public service delivery, how can open government data help to drive innovation? (2015) 

  5. The Deloitte and Reform report found that the top cited reason for lack of trust in the government’s use of data was “I don’t feel I have control over my personal data”. The third most cited reason was “They do not necessarily have my best interests at heart”. 

  6. ICO, What is personal data? 

  7. Ibid. 

  8. Note that an individual would still have the right to access and transfer their data even if data sharing arrangements were in place. The ICO’s Data Sharing Code will explain this interplay between data sharing and individual information rights. 

  9. iStandUK, Identity and Attribute Exchange 

  10. Referenced in this letter to the Secretary of State for DCMS from independent research organisations 

  11. The Digital Economy Act includes public service delivery powers to provide a legal basis for some of these use-cases. 

  12. Note that from 1 April 2020 NHS Digital is expected to hold prescription data, so the processes outlined may change. 

  13. Reform, Sharing the Benefits: How to use data effectively in the public sector (2018); National Audit Office, Challenges in using data across government (2019); Public Accounts Committee, Challenges in using data across Government inquiry (2019) 

  14. ICO, [Data Sharing code of practice - Draft code for consultation](https://ico.org.uk/media/about-the-ico/consultations/2615361/data-sharing-code-for-public-consultation.pdf) (2019) 

  15. See for instance, the Data Glossary for Connected Health Cities 

  16. National Audit Office, Challenges in using data across government (2019) 

  17. Ibid. 

  18. Richard Thomas and Mark Walport, Data Sharing Review (2008) 

  19. Royal Society, Protecting Privacy in Practice (2019) 

  20. UK Statistics Authority, Data Ethics 

  21. NHSX, Artificial Intelligence: How to get it right (2019) 

  22. Open Data Institute and YouGov, Attitudes towards data ethics (2019). 

  23. National Data Guardian, Consultation response (2019) 

  24. Department of Health, Your Data: Better Security, Better Choice, Better Care (2017) 

  25. Note that the data sharing must be necessary for those purposes. If there’s another way to achieve the same end that interferes less with privacy it’s not “necessary”. 

  26. See common misconceptions of GDPR e.g. ICO Data Sharing code of practice (pg 13) 

  27. ICO, Consent 

  28. ICO, Public Task 

  29. Law Commission, Data Sharing Between Public Bodies (2014) 

  30. Technically, this is an early data share of performance data files that will then go on to be processed into the NPD. 

  31. Department for Education, Data sharing and approval panel - Terms of Reference (2019) 

  32. ICO, Royal Free - Google DeepMind trial failed to comply with data protection law (2017) 

  33. Subsequent Education Acts gave the department the powers to collect data from schools and local authorities. 

  34. Patients have a right to access their GP record, but in some practices they would need to make a formal request rather than have immediate access upon enrolment. 

  35. Including, for example, the loss of computer discs containing the details of 25m child benefit recipients by HMRC in 2007. 

  36. DHSC subsequently released a framework and set of principles designed to ensure the value of these research projects is realised for patients and the public. 

  37. ICO, Royal Free - Google DeepMind trial failed to comply with data protection law (2017) 

  38. Minutes of the DSAP meetings may be obtained via Freedom of Information Requests. 

  39. defenddigitalme, About the campaign 

  40. Open Data Institute and YouGov, [Attitudes towards data ethics](https://theodi.org/article/nearly-9-in-10-people-think-its-important-that-organisations-use-personal-data-ethically/) (2019). 

  41. Ipsos MORI, Public attitudes to the use and sharing of their data (2014) 

  42. Deloitte and Reform, The State of the State 2017-2018 

  43. There is a wealth of academic literature on the topic of trust. For the purposes of this paper we have adopted a high-level approach, but recognise that a second-order analysis, as set out by Mariarosaria Taddeo (2010), could be more comprehensive. Other literature CDEI has reviewed includes: Understanding Patient Data, Involve & Carnegie Trust, Data for Public Benefit: Balancing the risks and benefits of data sharing (2018), UK; Infocomm Media Development Authority of Singapore & Personal Data Protection Commission, Trusted Data Sharing Framework (2019), Singapore; Data Futures Partnership, A Path to Social Licence: Guidelines for Trusted Data Use (2017), New Zealand; Open Data Institute, What organisations need in order to share more data: our research (2018), UK; Institute for Government, Sharing and safeguarding data in government (2018), UK. 

  44. Note that individuals have the right to data portability under the GDPR. 

  45. CDEI would seek to build on the insights and expertise developed in recent work led by the Open Data Institute on data trusts.