Standard

Information Commissioner's Office: Registration Inbox AI

Published 6 July 2022

name tier and category description entry (please enter all the required information in this column)
Name Tier 1 - Overview Colloquial name used to identify the algorithmic tool. Machine learning algorithm to categorise emails sent to the ICO’s registration inbox
Description Tier 1 - Overview Give a basic overview of the purpose of the algorithmic tool.

Explain how you’re using the algorithmic tool, including:

* how your tool works
* how your tool is incorporated into your decision making process

Explain why you’re using the algorithmic tool, including:

* what problem you’re aiming to solve using the tool, and how it’s solving the problem
* your justification or rationale for using the tool
* how people can find out more about the tool or ask a question - including offline options and a contact email address of the responsible organisation, team or contact person
This algorithmic tool helps to categorise emails sent to the ICO’s registration inbox and sends out auto-replies in specific cases. The registration inbox generally receives queries from organisations or sole traders who are registered, or are looking to register, with the ICO. Every organisation or sole trader who processes personal information needs to pay a data protection fee to the ICO, unless they are exempt. Typical queries may be about registering with the ICO, how to make a payment, or how to change or update details about a registration.

The ICO’s registration team often manages high volumes of queries. Assigning members of the team to respond to every request takes time and can lead to high caseloads for Data Protection Fees Officers (DP Fees Officers).

To reduce the burden on caseloads and ensure we respond in a timely manner to customers, the ICO, alongside selected third parties, has developed an algorithmic tool that categorises emails sent to the inbox and sends out auto-replies in specific cases. The algorithm takes into account the content of the email being sent to the inbox and detects whether it is a request about changing a business address. In cases where it detects this kind of request, the algorithm sends out an auto-reply that directs the customer to a new online service and points out further information required to process a change request. Only emails identified as a change of address request with 80% certainty will be sent an email containing the link to the change of address form.
URL of the website Tier 1 - Overview If available, provide the URL reference to a page with further information about the algorithmic tool and its use. This facilitates users searching more in-depth information about the practical use or technical details. This could, for instance, be a local government page, a link to a GitHub repository or a departmental landing page with additional information. n/a
Contact email Tier 1 - Overview Provide the email address of the organisation, team or contact person for this entry. For additional information, read the ICO’s privacy notice, contact the ICO’s data protection officer at dpo@ico.org.uk, or call our main helpline on 0303 123 1113.
1.1 Organisation/ department Tier 2 - Owner and responsibility Provide the full name of the organisation, department or public sector body that carries responsibility for use of the algorithmic tool. For example, ‘Department for Transport’. Information Commissioner’s Office (ICO)
1.2 Team Tier 2 - Owner and responsibility Provide the full name of the team that carries responsibility for use of the algorithmic tool. Business Services
1.3 Senior responsible owner Tier 2 - Owner and responsibility Provide the role title of the senior responsible owner for the algorithmic tool. Project Manager for the Strategic Change and Transformation department
1.4 Supplier or developer of the algorithmic tool Tier 2 - Owner and responsibility Provide the name of any external organisation or person that has been contracted to develop the whole or parts of the algorithmic tool. The algorithm was developed by ICS.AI and Microsoft.
1.5 External supplier identifier Tier 2 - Owner and responsibility If available, provide the Companies House number of the external organisation that has been contracted to develop the whole or parts of the algorithmic tool. You can get a company’s Companies House number by finding company information or using the Companies House API. ICS.AI: 1134680
1.6 External supplier role Tier 2 - Owner and responsibility Give a short description of the role the external supplier assumed with regards to the development of the algorithmic tool. Members of ICS.AI were contracted to develop the algorithm for the ICO. It is not shared with any other organisation.
1.7 Terms of access to data for external supplier Tier 2 - Owner and responsibility Details the terms of access to (government) data applied to the external supplier. ICS.AI acts as a data processor, which means that it cannot do anything with information we provide unless we have instructed it to do so. It will not share information with any organisation apart from us. It will hold the information securely and retain it for the period we instruct.
2.1 Scope Tier 2 - Description Describe the purpose of the tool in terms of what it’s been designed for and what it’s not been designed for. This can include a list of potential purposes that the tool was not designed to fulfil but which could constitute possible common misconceptions in the future. This algorithmic tool has been designed to inspect emails sent to the ICO’s registration inbox and send out auto-replies to requests made about changing addresses. The tool has not been designed to automatically change addresses on the requester’s behalf. The tool has not been designed to categorise other types of requests sent to the inbox.
2.2 Benefit Tier 2 - Description Describe the key benefits that the algorithmic tool is expected to deliver, and an expanded justification on why the tool is being used. Reading an email, understanding the request being made and adequately responding takes time. If more emails were received than could be adequately answered, a backlog would develop. This would lead to delayed responses and reduced customer satisfaction. In a significant proportion of emails received, a simple redirection to an online service is all that is required. However, sifting these types of emails out would also require time if done by a human. The algorithm helps to sift out some of these emails, which it can then respond to automatically. This frees up capacity for DP Fees Officers in the registration team, who can consequently spend more time on more complex requests.

Key benefits:

* Improves efficiency in answering requests sent to the registration inbox
* Helps to avoid backlog of emails
* Provides case officers with more time to respond to more complex requests
2.3 Alternatives considered Tier 2 - Description Provide, where applicable, a list of non-algorithmic alternatives considered, or a description of how the decision process was conducted previously. This process was previously done manually by reading and responding to emails.
2.4 Type of model Tier 2 - Description Indicate which types of methods or models the algorithm is using. For example, expert system, deep neural network and so on. The classification model uses a Naïve Bayes classifier to determine the context of a request.
2.5 Frequency of usage Tier 2 - Description Provide information on how regularly the algorithmic tool is being used. For example the number of decisions made per month, the number of citizens interacting with the tool, and so on. The tool classifies approximately 23,000 emails a month.
2.6 Phase Tier 2 - Description Describe in which of the following stages or phases the tool is currently situated: idea, design, development, production, retired. This field includes date and time stamps of creation and any updates. The tool has been in production since 19 May 2021.
2.7 Maintenance Tier 2 - Description Give details on the maintenance schedule and frequency of any reviews. For example, specific details on when and how a person reviews or checks the automated decision. The tool is reviewed monthly to understand where it may be missing some categories of request.
2.8 System architecture Tier 2 - Description If available, provide the URL reference to documentation about the system architecture. For example, a link to a GitHub repository image or additional documentation about the system architecture. See a diagram of the system’s architecture.
3.1 Process integration Tier 2 - Oversight Explain how the algorithmic tool is integrated into the decision-making process and what influence the algorithmic tool has on the decision-making process. Give a more detailed and extensive description of the wider decision-making process into which the algorithmic tool is embedded. The algorithmic tool does not make any decisions, but instead provides links in instances where it has calculated the customer has contacted the ICO about an address change, giving the customer the opportunity to self-serve.
3.2 Provided information Tier 2 - Oversight Describe how much and what information the algorithmic tool provides to the decision maker. n/a as the process is fully automated
3.3 Human decisions Tier 2 - Oversight Describe the decisions that people take in the overall process, including human review options. There is no manual intervention in the process - the links are provided to the customer in a fully automated manner.
3.4 Required training Tier 2 - Oversight Describe the required training those deploying or using the algorithmic tool must undertake, if applicable. For example, the person responsible for the management of the tool had to complete data science training. n/a - no additional training required
3.5 Appeals and review Tier 2 - Oversight Provide details on the mechanisms that are in place for review or appeal of the decision available to the general public. n/a - No need for review or appeal as no decision is being made. Incorrectly classified emails would receive the default response which is an acknowledgement.
4.1 Source data name Tier 2 - Information on data If applicable, provide the name of the datasets used. n/a - no specific name of dataset
4.2 Source data Tier 2 - Information on data Gives an overview of the data used to train and run the algorithmic tool. It will also specify whether data is used for training, testing, or operating. It should include which categories of data - for example ‘age’ or ‘address’ - which were used to train the model and which are used as input data for making a prediction. The model was trained on a dataset that was collected from emails being sent to the ICO’s registration inbox. We provided information about this purpose in our privacy notice on our website. Data collected includes:

* Email address
* Subject title
* Contents of the email, which may contain information relating to registration queries, such as the trader’s address, trading name, registration contacts or payment categories, or any other information the sender included in the email body.

Email header information is removed and not processed by the machine learning text classification service.
4.3 Source data URL Tier 2 - Information on data If available, provide a URL to the dataset. n/a - No URL available
4.4 Data collection Tier 2 - Information on data Gives information on the data collection process, including the original purpose of data collection. The model was trained on a dataset that was collected from emails being sent to the ICO’s registration inbox.
4.5 Data sharing agreements Tier 2 - Information on data Provides further information on data sharing agreements in place. n/a - No further data sharing agreements in place
4.6 Data access and storage Tier 2 - Information on data Provide details on who has or will have access to this data, how long it’s stored, under what circumstances and by whom. Data access for the training data was restricted to those contracted at ICS.AI by the ICO. ICS.AI completed the work within an ICO-managed environment and did not hold or process any data outside of that environment. All training data was deleted from the Azure Database after the model was created, while the original emails remain in the registration inbox for 12 months.
5.1 Impact assessment name Tier 2 - Risk mitigation and impact assessment Provide the name and a short overview of the impact assessment conducted. Machine Learning – Text classification – Data Protection Impact Assessment
5.2 Impact assessment description Tier 2 - Risk mitigation and impact assessment Give a description of the impact assessments conducted. This assessment considered the risks to individual rights and freedoms caused by the introduction of this algorithmic tool and provided details on the organisational and technical measures taken to reduce any risks identified.
5.3 Impact assessment date Tier 2 - Risk mitigation and impact assessment Provide the date on which the impact assessment was conducted. Date completed: 19 May 2021
5.4 Impact assessment link Tier 2 - Risk mitigation and impact assessment If available, provide a link to the impact assessment. n/a
5.5 Risk name Tier 2 - Risk mitigation and impact assessment Provide an overview of the common risks for the algorithmic tool. 1. A customer receives an incorrect response because of the automated email response.

2. Failure to provide transparency information prior to starting the model training process.
5.6 Risk description Tier 2 - Risk mitigation and impact assessment Give a description of the risks identified. 1. A customer receives an incorrect response because of the automated email response.

2. Failure to provide transparency information prior to starting the model training process.
5.7 Risk mitigation Tier 2 - Risk mitigation and impact assessment Provide an overview of how the risks have been mitigated. 1. The classification scope is limited to two responses: a change of address reply, and a generic response stating that we have received the customer’s request and that it will be processed within an estimated timeframe. Incorrectly classified emails would receive the default response, which is an acknowledgement. This will not have an impact on personal data. Only emails identified as a change of address request with 80% certainty will be sent an email containing the link to the change of address form.

2. The ICO’s privacy notice has been updated to inform customers of the additional use of the data for training purposes including machine learning.
The diagram shows the algorithm system’s architecture.
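
The 80% certainty rule described in this record can be illustrated with a short sketch. The Python below is not the ICO's or ICS.AI's implementation: it is a minimal multinomial Naïve Bayes classifier written from scratch, under stated assumptions. The training sentences, the label names, the reply names and the functions `train`, `posterior` and `choose_reply` are all hypothetical. Treating the threshold as "at least 0.80" is also an assumption, since the record does not say whether the comparison is inclusive.

```python
import math
from collections import Counter

CHANGE_OF_ADDRESS = "change_of_address"
OTHER = "other"
THRESHOLD = 0.80  # the "80% certainty" figure from the record; ">=" is an assumption

# Hypothetical training examples, invented for illustration only.
TRAINING = [
    ("please change our business address", CHANGE_OF_ADDRESS),
    ("we have moved office update address", CHANGE_OF_ADDRESS),
    ("new address for our registration", CHANGE_OF_ADDRESS),
    ("how do I pay the data protection fee", OTHER),
    ("payment failed when renewing registration", OTHER),
    ("do I need to register with the ico", OTHER),
]


def tokenize(text):
    """Lower-case the text and split it on whitespace."""
    return text.lower().split()


def train(examples):
    """Count words per label, documents per label, and the shared vocabulary."""
    word_counts = {}
    doc_counts = Counter()
    vocab = set()
    for text, label in examples:
        doc_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for word in tokenize(text):
            counts[word] += 1
            vocab.add(word)
    return word_counts, doc_counts, vocab


def posterior(text, model):
    """Return P(label | text) for every label, using add-one smoothing."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    log_scores = {}
    for label in doc_counts:
        counts = word_counts[label]
        denominator = sum(counts.values()) + len(vocab)
        score = math.log(doc_counts[label] / total_docs)  # class prior
        for word in tokenize(text):
            if word in vocab:  # words never seen in training are ignored
                score += math.log((counts[word] + 1) / denominator)
        log_scores[label] = score
    # Convert log scores to probabilities that sum to 1.
    highest = max(log_scores.values())
    weights = {label: math.exp(s - highest) for label, s in log_scores.items()}
    total = sum(weights.values())
    return {label: w / total for label, w in weights.items()}


def choose_reply(text, model, threshold=THRESHOLD):
    """Send the change of address link only when the threshold is met."""
    certainty = posterior(text, model)[CHANGE_OF_ADDRESS]
    if certainty >= threshold:
        return "change_of_address_link"
    return "default_acknowledgement"


model = train(TRAINING)
print(choose_reply("Please change our address", model))
print(choose_reply("How do I pay the fee", model))
```

With this toy data, an email containing only the word "registration" is slightly more likely than not to be a change of address request, but it falls below the threshold and so receives the default acknowledgement. This mirrors the record's statement that emails not meeting the 80% certainty level get the default response.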