Guidance

Outcomes from 2023 implementation plan

Updated 23 January 2024

Appendix: Detailed implementation plan with outcomes from 2023 plan

Tools

Analyst leaders will:

| Action | 2023 activities | Status | Success criteria / metrics | Notes |
| --- | --- | --- | --- | --- |
| Work with security and IT teams to give analysts access to the right tools | Ensure that the RAP MVP is supported on DfT computers and platforms | Completed | The tools required for RAP Minimum Viable Product (Appendix B) are available on recommended analytical platforms by the end of 2023. | R and Python IDEs and notebooks are available to users in a more streamlined way, with appropriate and up-to-date packages. |
|  | Ensure that all analysts have access to appropriate R and Python platforms | Completed | Clear pathways are available for analysts to obtain access to appropriate R and Python platforms which meet the RAP MVP standards. | Guidance documents explain the process for requesting any coding tools. R access is granted via super user in an automated request process. |
|  | Streamline analyst access to version control software such as GitHub | Completed | Analysts can request GitHub Enterprise licences directly through their IT Focal Points for rapid access. | This is now an automated process which allows analysts to receive appropriate permissions and a GitHub invite within minutes of a service request being raised. |
| Work with security and IT teams to develop platforms that are easy for analysts to access, flexible and responsive to the needs of analysts | Ensure that analyst use of key RAP tools is supported by technical resource | Completed | Feedback process is in place for analysts to report issues and make new feature requests and is appropriately documented. Written documentation of known issues and fixes is created and maintained for users to refer to. | All teams can make requests via their CRAN representatives, and documentation is maintained as part of a CRAN wiki. |
|  | Write DfT-wide coding guidance on which analytical tools to use and when | Completed | Central coding guidance is available to outline in plain language what coding tools are available and their appropriate usage. | Guidance can be found in this chapter of the strategic reproducible analysis handbook. |
|  | Ensure coding platforms continue to meet analytical user needs | Completed | Feedback process is documented for analysts to report issues and new feature requests. An appropriate process for reporting these to Digital and ensuring work is completed in a timely manner is in place. | Super users for each coding tool are listed in the CRAN wiki, and language-specific channels are available in the CRAN network. Experts can then guide users into the most appropriate solution for their problem. |
| Work with security, IT, and data teams to make sure that the data analysts need are available in the right place and are easy to access | Engagement between analysts and DDaT colleagues as part of Transport Data plans | Completed | Data and analysis teams continue to feed into beta testing phase of Transport Data planning. | Beta testing phase of this work has been completed successfully. |

Analysts will:

| Action | 2023 activities | Status | Success criteria / metrics | Notes |
| --- | --- | --- | --- | --- |
| Use open-source tools where appropriate | Develop coding guidance which prioritises open source tools | Completed | Central coding guidance is available to outline in plain language the advantages of open source coding tools, their appropriate use, and availability within DfT. | Guidance can be found in this chapter of the strategic reproducible analysis handbook. |
|  | Transform 5 analysis workflows to RAP workflows using open source tools | Completed | At least 5 existing analysis workflows are converted to using open source tools. | See table below for details of completed projects. |
|  | Ensure existing guidance emphasises the importance of version control tools such as Git/GitHub | Completed | Where existing RAP guidance emphasises version control as a nice to have, or suggests version control methodologies other than Git/GitHub, update this to reflect a ‘Git as Default’ stance. | All guidance (for example, the R Cookbook) has been updated to reflect this stance. |
| Open source their code | Develop coding guidance on open sourcing code | Completed | Central coding guidance is available to explain the advantages and risks of open sourcing code, and how to open source code safely and effectively. | Guidance can be found in this chapter of the strategic reproducible analysis handbook. |
|  | Offer GitHub Technical Lead training to ensure the capability to responsibly open source code | Completed | GitHub Technical Lead training is run at least once in 2023, and ongoing community support is available, to ensure the capability to responsibly open source. | Two cohorts of GitHub Technical Lead training have been run in 2023, and this training is now available alongside Intermediate GitHub training to build capability in version control tools further. |
| Work with data engineers and architects to make sure that source data are versioned and stored so that analysis can be reproduced | Engagement between analysts and DDaT colleagues as part of Transport Data plans | Completed | Data and analysis teams continue to feed into beta testing phase of Transport Data planning. | Beta testing phase of this work has been completed successfully. |

Capability

Analyst leaders will:

| Action | 2023 activities | Status | Success criteria / metrics | Notes |
| --- | --- | --- | --- | --- |
| Ensure their analysts build RAP learning and development time into work plans | StatsAID team to develop tools/guidance to record efficiency and quality improvement as an outcome of RAP products. | Not started | Tools and guidance in place to help analyst leaders record metrics around RAP efficiency and quality, and use these figures with confidence in work planning. | NA |
|  | Encourage analysts to devote learning and development time to developing essential RAP/coding skills | Completed | Guidance on minimum essential RAP and coding skills is developed for analysts. | A “core coding skills” descriptor has been developed with support from the analyst community, and forms the basis of a learning pathway. |
| Help their teams to work with DDaT professionals to share knowledge. | DDaT and statistician/data analyst forums to continue | Completed | Monthly analyst/digital forums continue to be held in 2023 to ensure collaboration between divisions. | These monthly meetings were held regularly throughout 2023 with attendance from Digital and Analysts. |
|  | Coffee and Coding activities are open to both DDaT and Analysts | Completed | Coffee and Coding distribution lists include Digital colleagues for both sharing invitations and for calls for presentations. | NA |

Analyst managers will:

| Action | 2023 activities | Status | Success criteria / metrics | Notes |
| --- | --- | --- | --- | --- |
| Build extra time into projects to adopt new skills and practices where appropriate | Will engage with supporting teams where appropriate to ensure capability building in new coding/RAP projects | In progress | Statistical team managers have a good understanding of the resources available from the StatsAID team to facilitate capability building in RAP projects. | The StatsAID team have developed promotional material for the support they can offer to teams, and circulated it across statistics teams. This will be repeated more broadly in subsequent years. |
| Learn the skills they need to manage software | C&C will develop training for analyst managers to ensure they have an understanding of key analytical tools | In progress | Training for analyst managers is developed. This training is run at least once in 2023 to give an understanding of key analytical tools (R, GitHub). At least 75% of statistics team analyst managers report having attended the session, and feel more confident in their ability to manage software. | The scoping phase of this project indicated a far broader range of coding topics that analyst managers were interested in developing skills in. This formed the basis of guidance documentation and initial training covering “developing coding capability in analyst teams”. Further sessions based on this guidance will be developed in 2024. |

The DfT RAP community will:

| Action | 2023 activities | Status | Success criteria / metrics | Notes |
| --- | --- | --- | --- | --- |
| Deliver mentoring and peer review schemes in their organisation and share good practice across government | Implement a code reviewing network across DfT | Completed | A code reviewing network has been established and analysts are aware of its existence and function. | A code reviewing network has been created and advertised in live sessions and via Teams posts. |
|  | Offer coding review training opportunities | Completed | C&C have made coding reviewing workshops and/or other training available to analysts in 2023. | Code reviewing workshops have run 8 times across 2023. Alongside this, work is beginning to extend support to cover both Python-focussed code reviewing, and code QA as distinct from code reviewing. |

Analysts will:

| Action | 2023 activities | Status | Success criteria / metrics | Notes |
| --- | --- | --- | --- | --- |
| Learn the skills they need to implement RAP principles | C&C continue to offer a range of learning opportunities, including formats that allow analysts to access training on demand | Completed | C&C will run learning opportunities on an at least monthly basis throughout 2023. At least 75% of these learning opportunities will later be available on demand through sharing resources and recordings of sessions. Attendance at these sessions will be monitored to ensure they meet community needs. | More than 10 learning and development courses have been developed, covering topics across R, GitHub, BigQuery, good practice, and PowerBI. On average, at least 4 training sessions on a variety of topics have run each month throughout 2023. All content is now available in on-demand format. |
|  | Develop suggested training programme for analysts to undertake before starting RAP projects | Completed | C&C will record a list of existing training resources applicable to new RAP projects for both beginner and refresher levels. | There is a record of all training sessions that run, as well as a calendar of events to allow analysts to effectively plan their time. |

Culture

DfT will:

| Action | 2023 activities | Status | Success criteria / metrics | Notes |
| --- | --- | --- | --- | --- |
| Choose leaders responsible for promoting RAP and monitoring progress towards this strategy within organisations | Our senior sponsor for delivering Reproducible Analytical Pipelines is Gemma Brand, Head of Profession for Statistics. | Completed | Senior leader responsible for RAP chosen | Our senior sponsor for delivering Reproducible Analytical Pipelines is Gemma Brand, Head of Profession for Statistics. |
| Form multidisciplinary teams that have the skills to make great analytical products, with some members specialised in developing analysis as software | No activities planned for 2023. | Not started | No activities planned for 2023 | NA |

DfT Analyst leaders will:

| Action | 2023 activities | Status | Success criteria / metrics | Notes |
| --- | --- | --- | --- | --- |
| Promote a ‘RAP by default’ approach for all appropriate analysis | DfT senior leaders understand the importance of ‘RAP by default’ for their teams | Completed | Conversations held with DfT analytical senior leaders to gauge their existing knowledge of RAP and its use and benefits. Support and guidance is developed to address any common misunderstandings. | Session held with senior analytical leaders from across DfT discussing the RAP strategy and its application to their divisions. Specific areas where more guidance and support are required were highlighted, and built into future work plans. |
|  | Develop RAP for ad hoc analysis guidance | Completed | Central coding guidance is available to outline the utility of RAP approaches in ad hoc analysis, and clear explanations of which aspects of RAP are appropriate for differing types of analysis. | Guidance can be found in this chapter of the strategic reproducible analysis handbook. |
| Write and implement strategic plans to develop new analyses with RAP principles, and to redevelop existing products with RAP principles | No activities planned for 2023. | Not started | No activities planned for 2023 | NA |
| Lead their RAP champions to advise analysis teams on how to implement RAP | Ensure that all DfT divisions have a nominated local RAP champion. | Completed | This approach will be piloted in statistics, with all divisions having at least one local RAP champion, and members of that division are aware of them. Other professions will be encouraged to nominate a local RAP champion too. | All divisions in statistics have nominated a local CRAN representative, with many other analyst teams beyond statistics choosing to do so. |
| Help teams to incorporate RAP development into workplans | The StatsAID team will provide central mentoring support and guidance for teams wanting to incorporate RAP into workplans | In progress | A central mentoring, support and guidance offer is in place and teams are aware of this offer. | The StatsAID team have developed promotional material for the support they can offer to teams, and circulated it across statistics teams. This will be repeated more broadly in subsequent years. |
|  | Support teams to make use of GitHub features such as labels, project boards and teams to monitor, explore and promote ongoing and complete RAP projects across the department | Completed | The RAP champions will develop and promote guidelines and processes for using GitHub as a monitoring, showcasing and prioritisation tool for RAP projects. | Central repository tagging guidance is available, and the GitHub API has been used to create automatic monitoring reports based on this. |
| Identify the most valuable projects by looking at how much capability the team already has and how risky and time-consuming the existing process is | Develop prioritising RAP guidelines for analytical leaders | Completed | Central coding guidance is available to outline key considerations when prioritising RAP projects and analytical leaders are aware of these. | Guidance can be found in this chapter of the strategic reproducible analysis handbook. |

DfT RAP champions will:

| Action | 2023 activities | Status | Success criteria / metrics | Notes |
| --- | --- | --- | --- | --- |
| Support leaders in their organisation in delivering this strategy by acting as mentors, advocates and reviewers | No activities planned for 2023 | Not started | No activities planned for 2023. | NA |
| Manage peer review schemes in their organisation to facilitate mutual learning and quality assurance | Implement a code reviewing network across DfT | Completed | A code reviewing network has been established and analysts are aware of its existence and function. | A code reviewing network has been created and advertised in live sessions and via Teams posts. |

DfT Analyst managers will:

| Action | 2023 activities | Status | Success criteria / metrics | Notes |
| --- | --- | --- | --- | --- |
| Evaluate RAP projects within organisations to understand and demonstrate the benefits of RAP | Develop prioritising RAP guidelines for analytical leaders | Completed | Central coding guidance is available to outline key considerations when prioritising RAP projects and analytical leaders are aware of these. | Guidance can be found in this chapter of the strategic reproducible analysis handbook |
| Mandate their teams use RAP principles whenever possible | Analytical managers will ensure that they are aware of best practice resources available within DfT and will promote them to their teams | Completed | Analytical managers and analysts within statistics are aware of the location and contents of all best practice resources. | Best practice resources have been centralised and signposted through the Analytical Leader coding pathway. |

DfT Analysts will:

| Action | 2023 activities | Status | Success criteria / metrics | Notes |
| --- | --- | --- | --- | --- |
| Engage with users of their analysis to demonstrate the value of RAP principles and build motivation for development | Determine most appropriate way to engage with users about RAP | Completed | Statistical Dissemination team to decide on and publicise most appropriate way to engage with users on this (for example, TSUG, Twitter) | Annual updates will be shared on GOV.UK via the RAP strategy, and senior analysts encouraged to share this content via LinkedIn. |
| Deliver their analysis using RAP | Will contribute to at least 5 RAP projects in 2023 | Completed | 5 RAP projects successfully completed in 2023. | See table below for details of completed projects. |

Appendix: list of 2023 RAP projects and outcomes at DfT

| Project | Description | Outcome |
| --- | --- | --- |
| Road traffic ATC data ingest | Automated process to collect and transform near real-time ATC data from contractors' FTP server to BigQuery, using Cloud Run and associated services. | All ATC data is now being collected and stored in BigQuery using a bespoke data engineering pipeline developed in collaboration with the Traffic Surveys team. This includes data processing audit logs of what happens when and why. |
| National Travel Survey | Automate creation of accessible publication tables. | All National Travel Survey tables, including National Travel Attitudes Study tables and those which feed into other publications such as disability statistics, have been made accessible, and production is automated using SQL and R. Code for the tables and recent statistical releases is available on GitHub. |
| Congestion and road safety statistics | Transfer of data ingest, analysis and some quality checks to BigQuery as part of GCP transfer beta project. | Processing time changed from weeks to less than one day. QA checks now allow graphical outputs and make trends easier to see for data quality and publication purposes. We can get back to data providers with questions sooner to allow for greater revision time. |
| Taxi and light rail statistics | End-to-end automation of data ingest, validation and analysis process in R, and implementation of version and quality control of code in GitHub. | Taxi statistics: data validation, processing, table production and chart production mostly automated. Code on GitHub and version controlled. Light rail statistics: data validation, table production and chart production automated. Code on GitHub. Further development planned to improve code efficiency and version control. |
| National Highways and Transport (NHT) survey | Automation of data ingest and analysis. | Data ingest automated, data analysis partially automated with some further development planned. |
| Active travel statistics | A programme of coding improvements including production of accessible tables, HTML bulletin content, and analysis of NTS data in R, pipelining of data in GCP, and version and quality control of code in GitHub. | Successfully automated processes for data processing, validation, chart production and creation of accessible tables. All code is now version and quality controlled on GitHub. |
| Rail statistics | Completing a large RAP project in R to automate data preparation, quality assurance, and visualisations of data used in annual Rail Passenger Numbers publication. | Whole process automated in R. Now moving to use GCP and GitHub alongside R to enable whole team ownership. Work is now under way to improve the scope and performance of the automated quality assurance process. Future plans also include the development of an R Shiny dashboard. |
| Aviation statistics | Modernisation of existing coded processes, including migration and refactoring of R code, implementation of version and quality control of code in GitHub, and improvements to data storage and processing using GCP and R. | The code used to run our regular data processing and to produce our internal monitoring materials has been migrated to Cloud-R and GitHub, improving reliability and auditability. |
| Road traffic statistics | Merging existing daily and quarterly processes into a single coded process for cleaning and aggregating data. This will include development of SQL and R/R Shiny code to replace all existing processes, and make use of GitHub version and quality control. | A new process has been developed, using SQL, R and R Shiny, with GitHub version and quality control. The new process was used for the quarterly publications released during 2023 and will be used for the [daily publication in 2024]. |
| National Highways real-time data project | Building on previous success of data processing in BigQuery, further development work will refactor code to improve efficiency, data cleaning and coverage. | BigQuery code has been quality assured and updated to aid future use of this dataset. |
| Port freight statistics | Automate data visualisations, release commentary and quality assurance of tables in the quarterly port freight publication. This will include developing new code in both SQL and R, to improve the timeliness and quality of data checks and release production. | Automation of commentary and visualisations is complete, reducing the time taken to produce the quarterly port freight publication. Imputation code has also been moved into R to speed up processing and updating each quarter. |
| Shipping fleet statistics | Automate data visualisations and release commentary in the shipping fleet publication. | Automation of commentary and visualisations is complete, reducing the time taken to produce the Shipping Fleet publication and reducing the number of typing errors found at the quality checks stage. |
| People analytics | Continuing to improve processes across data storage, analysis and publication. This includes moving data storage from legacy Access and Excel-based systems into GCP, further developing code-based solutions for analysis, and publication of data in accessible ODS and HTML formats. | We have redesigned our process for producing the Equality Monitoring tables, both: changing the data source to a more stable version used to produce Annual Civil Service statistics, thereby saving stakeholders time producing bespoke data sources and removing dependence on unstable MS Access systems; and building a new data cleaning process in R with greater reliability and improved transparency through GitHub documentation. |