Guidance

Development Phase Activity Booklet (text only)

Published 5 June 2025

These materials were produced by The Alan Turing Institute through an extended engagement with personnel at the UK Ministry of Justice. 

Ethics Process 

Introduction to the SAFE-D Framework 

  • Read the Introduction Booklet and familiarise yourself with the Project Lifecycle Model. 

SAFE-D Principles Booklet 

  • Read the SAFE-D Principles Booklet and familiarise yourself with the principles and their core attributes.  

Design Phase Activity Booklet 

  • SAFE-D Identification Workshop Exercise 
  • Litmus Test 
  • Stakeholder Engagement Worksheet 
  • Additional activities where relevant 

Development Phase Activity Booklet 

  • SAFE-D Reflection Workshop Exercise 
  • Development Phase Questionnaire  
  • Stakeholder Engagement Worksheet 
  • Additional activities where relevant 

Deployment Phase Activity Booklet 

  • SAFE-D Assurance Workshop Exercise 
  • Deployment Phase Questionnaire 
  • Stakeholder Engagement Worksheet 
  • Additional activities where relevant 

Model Development Phase 

Model Development 

Covers technical tasks such as training, testing, and validating the machine learning model to ensure it is suitable for its intended purpose. 

Lower-level lifecycle stages and their ethical significance

Model Selection & Training 

This stage involves choosing one or more algorithms for training a model. Factors influencing this decision include: 

  • Access to computational resources 
  • Predictive performance of the model 
  • Characteristics of the data 

Ethical significance: 

  • While there are technical reasons for responsibly selecting and training a model, a key ethical consideration is the model’s interpretability and explainability. 
  • Generally, more complex models (such as convolutional neural networks) are harder to interpret than simpler ones (such as linear regression); the sketch after this list illustrates the trade-off. 
  • The choice of algorithm should align with the model’s intended use and the system it will be part of. 
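To make this trade-off concrete, the sketch below is a minimal illustration, assuming Python with scikit-learn and a purely synthetic dataset: it trains a simple logistic regression alongside a more complex gradient boosting model, compares their accuracy, and shows how the linear model’s coefficients double as a direct global explanation that the ensemble cannot offer. 

    # Minimal sketch: comparing a simple, interpretable model with a more
    # complex one on the same task. Assumes Python with scikit-learn; the
    # data is synthetic and purely illustrative.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    simple = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    complex_model = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

    print("Logistic regression accuracy:", simple.score(X_test, y_test))
    print("Gradient boosting accuracy:  ", complex_model.score(X_test, y_test))

    # The linear model's coefficients can be read directly as a global
    # explanation; the ensemble offers no equally direct account of its
    # decisions. If the accuracy gap is small, the simpler model may be
    # the more responsible choice.
    print("Coefficients (global explanation):", simple.coef_.round(2))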

Model Testing & Validation 

Model testing and validation involves evaluating a model using various metrics, including its accuracy on new data that was not part of the original training set. 

Ethical significance: 

  • When a dataset is divided into training and testing sets (internal validation) or when a model is assessed with completely new data (external validation), there are opportunities to evaluate more than just performance. 
  • Testing a model’s generalisability to new contexts helps ensure it is both sustainable and fair, meaning it should maintain similar accuracy and performance levels even when validated externally; the sketch below illustrates one such check. 
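A minimal sketch of this kind of check, assuming Python with scikit-learn and a hypothetical binary ‘group’ attribute invented purely for illustration: 

    # Minimal sketch: internal validation (train/test split) that reports
    # performance per subgroup, not just overall accuracy. The 'group'
    # attribute is hypothetical and for illustration only.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=2000, n_features=8, random_state=1)
    group = np.random.default_rng(1).integers(0, 2, size=len(y))

    X_tr, X_te, y_tr, y_te, g_tr, g_te = train_test_split(
        X, y, group, test_size=0.25, random_state=1
    )

    model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    print("Overall test accuracy:", model.score(X_te, y_te))

    # Per-group accuracy: a large gap flags a potential fairness issue
    # that a single aggregate figure would hide.
    for g in (0, 1):
        mask = g_te == g
        print(f"Group {g} accuracy:", model.score(X_te[mask], y_te[mask]))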

Establishing Monitoring Tests 

Effective monitoring of a model’s performance in its runtime environment relies on setting up metrics to check if the model is functioning within its intended parameters. 

Ethical significance: 

  • While tests typically measure accuracy and global interpretability, it’s also important to assess system-level performance, such as efficiency and resource usage; a minimal sketch of such checks follows this list. 
  • This stage draws on the expertise of the team responsible for the model, along with input from systems engineers, software engineers, and end users. 
  • It is also a crucial point for establishing collective responsibility throughout the project’s lifecycle, and it relies on clear and accessible communication among all team members. 
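As a minimal sketch of what such monitoring tests might look like in code, the following (Python; the metric names and threshold values are hypothetical placeholders, to be agreed with systems engineers, software engineers, and end users) checks runtime measurements against pre-agreed thresholds: 

    # Minimal sketch: runtime monitoring checks against pre-agreed
    # thresholds. All metric names and values are hypothetical.
    THRESHOLDS = {
        "accuracy": 0.85,        # minimum acceptable accuracy (illustrative)
        "latency_seconds": 0.5,  # maximum acceptable response time (illustrative)
    }

    def check_model_health(metrics: dict) -> list:
        """Return human-readable alerts for any breached threshold."""
        alerts = []
        if metrics["accuracy"] < THRESHOLDS["accuracy"]:
            alerts.append(f"Accuracy {metrics['accuracy']:.2f} is below threshold")
        if metrics["latency_seconds"] > THRESHOLDS["latency_seconds"]:
            alerts.append(f"Latency {metrics['latency_seconds']:.2f}s is above threshold")
        return alerts

    # Example run with hypothetical runtime measurements; alerts would be
    # routed to the team responsible for the model.
    observed = {"accuracy": 0.82, "latency_seconds": 0.3}
    for alert in check_model_health(observed):
        print("ALERT:", alert)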

Model Documentation 

This task involves documenting both formal and informal aspects of the model and its development process, including: 

  • Data sources and summary statistics 
  • The model used  
  • Evaluation metrics 

Ethical significance: 

Clear and accessible documentation is vital for responsible project governance for several reasons: 

  • It ensures results can be reproduced and supports values like public accessibility in open research 
  • It promotes accountability and transparency in decision-making 
  • It helps individuals affected by the technology seek redress for any harms that may occur during its design, development, or deployment 
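One lightweight way to support these aims is to capture the documentation as structured, machine-readable data alongside the model artefact. The sketch below is a minimal illustration in Python; the field names and all values are hypothetical placeholders rather than a prescribed schema. 

    # Minimal sketch: a machine-readable model record covering the three
    # items listed above (data sources and summary statistics, the model
    # used, evaluation metrics). All values are hypothetical.
    import json

    model_record = {
        "data_sources": {
            "description": "Hypothetical administrative dataset",
            "summary_statistics": {"n_rows": 20000, "n_features": 12},
        },
        "model": {
            "type": "LogisticRegression",
            "version": "1.0",
            "training_date": "2025-06-05",
        },
        "evaluation_metrics": {
            "accuracy": 0.88,
            "per_group_accuracy": {"group_0": 0.87, "group_1": 0.89},
        },
        "decisions": [
            "Chose a linear model over an ensemble for interpretability",
        ],
    }

    # Writing the record alongside the model artefact supports
    # reproducibility, accountability, and redress.
    with open("model_record.json", "w") as f:
        json.dump(model_record, f, indent=2)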

Development Phase SAFE-D Reflection 

Workshop Exercise: SAFE-D Development Checkpoint 

Activity Overview 

Now that you’re in the Project Development Phase, it’s important to reflect on the SAFE-D Principles and assess how your project aligns with them. Use the following steps and prompts to guide your discussion and documentation. 

Steps: 

1) Review previous work 

  • Look back at the answers and action strategies you developed during the Design Phase workshop activities to refresh your memory  

2) Use the Miro board 

  • Track any changes, progress, and new questions that have arisen in the Development Phase 
  • You can create a new section on the board, or add post-its in a different colour to your original answers 

3) Evaluate SAFE-D goals & objectives 

  • Revisit the objectives you set during the SAFE-D Specification exercise. Assess whether these strategies are working effectively or if they need adjustments 

Prompt questions for discussion: 

  • What changes have occurred since the Design Phase that impact the SAFE-D Principles? 
  • How has the project progressed in relation to the ethical principles? Are there any areas where you feel you have made significant progress? 
  • Have any new questions or concerns emerged regarding the SAFE-D Principles as the project develops? 
  • Are the strategies you implemented in the Design Phase proving effective? What challenges have you encountered? 
  • Have you received any feedback from stakeholders that may influence your assessment of the SAFE-D Principles? 

Documentation 

Make sure to document your reflections and discussions in a clear format. This will not only provide a record of your progress but also facilitate ongoing communication among team members and stakeholders. 

Importance of Reflection 

Reflecting on the SAFE-D Principles at this stage is crucial for maintaining ethical integrity throughout the project. By actively assessing your project’s alignment with these principles, you can identify potential risks and ensure that ethical considerations are prioritised as you move forward. 

Development Phase Questionnaire 

The Development Phase Questionnaire is the next formal checkpoint in the ethics process.  

Steps: 

Use previous insights 

  • Leverage the insights and discussions from your SAFE-D workshops and Litmus Test to inform your responses in the assessment 

Assign team members 

  • Consider nominating different team members to collaborate on various sections of the assessment. This allows for diverse input and thorough exploration of each principle 

Complete the questionnaire 

  • The Development Phase Questionnaire consists of a series of questions organised around the SAFE-D principles. Each principle has a set of questions that require a simple Yes/No answer, along with justification or evidence 

Justification and evidence 

  • Ensure that for each answer, you provide sufficient justification or evidence. This could include references to workshop insights, project documentation, or stakeholder feedback 

Track changes and decisions 

  • Record any significant changes and decisions made during development, together with their justifications. Remember that the purpose of this assessment is not to hinder innovation but to help identify and mitigate risks early in the project lifecycle 

Document the assessment 

  • After completing the assessment, compile and document the responses. This will serve as a useful reference for future stages of the project and facilitate transparency with stakeholders 

Sustainability 

2.1a: Is the use case free from any risks to fundamental rights and freedoms? [Safety] 

2.1b: Are there sufficient security measures and access controls in place? [Security] 

2.1c: Have you tested the model for robustness to data drift and adversarial attacks? Please answer N/A if this is a one-off analysis or the model will not be used with new data [Robustness] 

2.1d: Are the outputs of the model consistent and aligned to expectations? [Reliability] 

2.1e: Is the performance within acceptable thresholds? [Accuracy & performance] 
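Relating to question 2.1c, the sketch below shows one simple form a data drift check might take: a two-sample Kolmogorov-Smirnov test comparing a feature’s distribution at training time with its distribution in production. It assumes Python with numpy and scipy; the synthetic data and the 0.05 significance level are illustrative assumptions, not recommended standards. 

    # Minimal sketch: screening one feature for data drift with a
    # two-sample Kolmogorov-Smirnov test. Data and threshold are
    # illustrative only.
    import numpy as np
    from scipy.stats import ks_2samp

    rng = np.random.default_rng(0)
    training_feature = rng.normal(loc=0.0, scale=1.0, size=5000)  # at training time
    live_feature = rng.normal(loc=0.3, scale=1.0, size=5000)      # in production (shifted)

    result = ks_2samp(training_feature, live_feature)
    if result.pvalue < 0.05:
        print(f"Possible data drift (KS statistic={result.statistic:.3f}, "
              f"p={result.pvalue:.4f})")
    else:
        print("No evidence of drift on this feature")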

Accountability 

2.2a: Are the escalation and approval pathways for this project clear, i.e. who is approving this model? [Answerability] 

2.2b: Are the roles and responsibilities of the project team and wider stakeholders clearly defined? [Answerability] 

2.2c: Have you liaised with the data owner(s) on any limitations, quality issues or bias in the data set? [Clear data provenance & lineage] 

2.2d: Do you know who the data owners are and have clear data documentation of all features? [Clear data provenance & lineage] 

2.2e: Is the documentation on the model accessible and easy to understand? [Accessibility] 

2.2f: Have you checked that any engineered or selected features are appropriate with the relevant domain experts? [Auditability] 

2.2g: Have you signed off the performance and error rates with the appropriate stakeholder(s)? [Auditability] 

2.2h: Are all decisions made documented with justifications? (For example, any data input methods and why they were chosen) [Auditability] 

2.2i: Are all relevant actions and analyses traceable and reproducible? [Traceability] 

Fairness 

2.3a: Is the project data collection strategy returning a representative sample of the population? [Non-discrimination] 

2.3b: Is there a sufficient sample in any subgroup of interest for this analysis? [Non-discrimination] 

2.3c: Have you considered including diverse voices and opinions in the design and development process, e.g. through engaging with civil society and affected communities? [Diversity & inclusivity] 

2.3d: Have any features added by developers that could be affected by personal judgement been documented and mitigated? [Bias mitigation] 

2.3e: Have you identified and mitigated potential proxies in your data, i.e., features in your data that may be associated with unjustifiable sources of differences in outcome, e.g., postcode with race, income with gender? [Bias mitigation] 

2.3f: Has the model been calibrated across different metrics for each group and mechanism? [Bias mitigation] 

2.3g: Is the model free of all potential feedback loops that may worsen existing inequalities? [Equality] 
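Relating to question 2.3e, the sketch below shows a first-pass screen for proxies: measuring each feature’s correlation with a protected attribute. It assumes Python with numpy; the data and the 0.3 flagging threshold are hypothetical, and correlation alone is a starting point rather than a complete proxy analysis. 

    # Minimal sketch: flagging features that correlate with a protected
    # attribute as potential proxies. All data and thresholds are
    # hypothetical and for illustration only.
    import numpy as np

    rng = np.random.default_rng(2)
    protected = rng.integers(0, 2, size=1000).astype(float)  # hypothetical attribute

    features = {
        "income": protected * 0.8 + rng.normal(size=1000),  # deliberately correlated
        "tenure_years": rng.normal(size=1000),              # independent
    }

    for name, values in features.items():
        r = np.corrcoef(values, protected)[0, 1]
        flag = "POTENTIAL PROXY" if abs(r) > 0.3 else "ok"
        print(f"{name}: correlation with protected attribute = {r:+.2f} ({flag})")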

Explainability 

2.4a: Have you considered all the stakeholders that might need an explanation, whether global (model level) or local (individual level)? [Implementation & user training] 

2.4b: Are the users sufficiently informed and knowledgeable to understand the explanation? [Implementation & user training] 

2.4c: Can you clearly communicate how each feature was calculated or collected? [Interpretability] 

2.4d: Are you able to create a clear global explanation of the model and clear local explanations for each prediction? [Interpretability] 

2.4e: Has the explainability of the model remained consistent as the size of the data set and the number of features has increased? [Interpretability] 

2.4f: Where a complex model is selected, is the rationale documented as to why a simpler model cannot be used? [Responsible model selection] 

2.4g: Is the explanation accessible and easy to understand for each stakeholder? [Accessible rationale explanation] 

2.4h: Is the explanation reproducible? [Reproducibility] 
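Relating to question 2.4d, the sketch below produces a global and a local explanation for a linear model, where both follow directly from the model’s coefficients. It assumes Python with scikit-learn; the feature names and data are hypothetical, and more complex models would need dedicated explanation tooling instead. 

    # Minimal sketch: a global explanation (one weight per feature) and a
    # local explanation (per-feature contributions to one prediction) for
    # a linear model. Feature names and data are hypothetical.
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression

    feature_names = ["age", "tenure_years", "prior_contacts"]  # hypothetical
    X, y = make_classification(n_samples=500, n_features=3, n_informative=3,
                               n_redundant=0, random_state=3)
    model = LogisticRegression(max_iter=1000).fit(X, y)

    # Global explanation: one weight per feature across the whole model.
    for name, coef in zip(feature_names, model.coef_[0]):
        print(f"global weight for {name}: {coef:+.2f}")

    # Local explanation: each feature's contribution to a single
    # prediction (weight multiplied by the individual's feature value).
    individual = X[0]
    for name, c in zip(feature_names, model.coef_[0] * individual):
        print(f"local contribution of {name}: {c:+.2f}")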

Data Responsibility 

2.5a: If using historical data, are all data sets in use timely and applicable for deriving future insights? [Timeliness & recency] 

2.5b: Have you ensured adherence to existing laws and regulations, including data protection law, privacy standards, and public sector duties? [Legal & organisational compliance] 

2.5c: Have all the potential risks, assumptions, and limitations of this use case been reviewed and signed off? [Legal & organisational compliance] 

2.5d: Are the recorded features free from human judgement? If not, have the affected features been documented and mitigated? [Source integrity & measurement accuracy] 

2.5e: Are there clear policies in place for data storage, sharing, documentation, and processing? [Responsible data management] 

2.5f: If any data sets are linked and shared, have data protection principles been considered? [Responsible data management] 

2.5g: Do the available data set(s) have the adequate sample size, representativeness, and contextual relevance for the use case at hand? If not, are the limitations documented? [Adequacy of quality & quantity] 
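Relating to question 2.5g, the sketch below compares subgroup counts in a dataset against known population proportions with a chi-square goodness-of-fit test. It assumes Python with scipy; the counts, proportions, and 0.05 significance level are hypothetical illustrations. 

    # Minimal sketch: testing whether sample subgroup counts match known
    # population proportions. All figures are hypothetical.
    from scipy.stats import chisquare

    observed_counts = [640, 280, 80]        # subgroup counts in the dataset
    population_shares = [0.60, 0.30, 0.10]  # known population proportions
    total = sum(observed_counts)
    expected_counts = [share * total for share in population_shares]

    result = chisquare(observed_counts, f_exp=expected_counts)
    if result.pvalue < 0.05:
        print(f"Sample may not be representative (chi2={result.statistic:.2f}, "
              f"p={result.pvalue:.4f})")
    else:
        print("No evidence the sample deviates from population proportions")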

Contact Us

DataEthics@justice.gov.uk