Guidance
Data scientist: skills they need
Updated 2 January 2019
© Crown copyright 2019
This publication is licensed under the terms of the Open Government Licence v3.0 except where otherwise stated. To view this licence, visit nationalarchives.gov.uk/doc/open-government-licence/version/3 or write to the Information Policy Team, The National Archives, Kew, London TW9 4DU, or email: psi@nationalarchives.gsi.gov.uk.
Where we have identified any third party copyright information you will need to obtain permission from the copyright holders concerned.
This publication is available at https://www.gov.uk/government/publications/data-scientist-skills-they-need/data-scientist-skills-they-need
This content is part of the Digital, Data and Technology (DDaT) Capability Framework which describes the DDaT roles in government and the skills needed to do them.
1. What a data scientist does
A data scientist is proficient in data science.
Data scientists:
- have recognised technical ability in a number of data science specialisms and provide detailed technical advice on their area of expertise
- promote and present data science work both within and outside of the organisation
- engage with stakeholders and champion the value of data science work
- line manage and mentor junior data scientists
- manage small project teams
2. What skills they need
A data scientist needs specific technical skills.
All roles have essential skills, and some have desirable skills.
Each skill has one of 4 skill levels associated with it:
- Expert
- Practitioner
- Working
- Awareness
2.1 Essential skills
Skill | Description of the skill | Skill level | What the skill level means |
---|---|---|---|
Applied maths, statistics and scientific practices | Understands how algorithms are designed, optimised and applied at scale. Can select and use appropriate statistical methods for sampling, distribution assessment, bias and error. Understands problem structuring methods and can evaluate when each method is appropriate. Applies scientific methods through experimental design, exploratory data analysis and hypothesis testing to reach robust conclusions. | Practitioner | Understands and can help teams apply a range of practices. Develops deeper expertise on a narrower range of specialisms. Starts to apply emerging theory to practical situations. |
Data engineering and manipulation | Works with other technologists and analysts to integrate and separate data feeds in order to map, produce, transform and test new scalable data products that meet user needs. Has a demonstrable understanding of how to expose data from systems (for example, through APIs), link data from multiple systems and deliver streaming services. Works with other technologists and analysts to understand and make use of different types of data models. Understands and can make use of different data engineering tools for repeatable data processing and is able to compare between different data models. Understands how to build scalable machine learning pipelines and combine feature engineering with optimisation methods to improve the data product performance. | Practitioner | Can work with data engineers to map, produce, transform and test new data feeds for data owners and consumers, selecting the most appropriate tools and technologies. Can lead ad hoc data exploration in a wide variety of data serialisation and storage formats, from across the business, for data consumers. |
Data science innovation | Recognises and exploits business opportunities to ensure more efficient and effective ways to use data science. Explores ways of utilising new data science tools and techniques to tackle business and organisational challenges. Demonstrates strong intellectual curiosity with an interdisciplinary approach, drawing on innovation in academia and industry. | Practitioner | Displays strong intellectual curiosity and proactively explores areas of innovation in both government and industry. Can identify the business value for innovation within their organisation. |
Developing data science capability | Continuously develops data science knowledge, utilising multiple sources. Shares data science practices across departments and in industry, promoting professional development and use of best practice across all capabilities identified for data scientists. Focuses on recruitment and induction of data scientists. | Practitioner | Promotes and monitors continuous professional development within their organisation. Develops knowledge of cutting-edge techniques and shares knowledge. Propagates data science capability across the organisation. |
Domain expertise | Understands the context of the business, its processes, data and priorities. Applies data science techniques to present, communicate and disseminate data science products effectively, appropriately and with high impact. Uses the most appropriate medium to visualise data to tell compelling and actionable stories relevant for business goals. Maintains a user focus to design solutions that meet user needs, taking account of agreed cross-government ethics standards. | Practitioner | Can create data science products which are proportionate to the business benefit and achieve significant impact. Presents analysis and visualisations in clear ways to communicate complex messages. Develops data science communication skills within their team. Supports the evolution of data governance and takes responsibility for applying the data science ethical framework in their business area. |
Programming and build (data science) | Uses a range of coding practices to build scalable data products that can be used by strategic or operational users and can be further integrated into business systems. Works with technologists to design, create, test and document these data products. Works in accordance with agreed software development standards, including security, accessibility and version control. | Practitioner | Designs, codes, tests, corrects and documents moderate to complex programs and scripts from agreed specifications and subsequent iterations, using agreed standards and tools. Collaborates with others to review specifications where appropriate. |
Understanding analysis across the life cycle (data science) | Understands the different phases of product delivery and is able to plan and run the analysis for these. Able to contribute to decision-making throughout the lifecycle. Works in collaboration with user researchers, developers and other roles throughout the lifecycle. Understands the value of analysis, how to contribute with impact and which data sources, analytical techniques and tools can be used at each point throughout the lifecycle. | Practitioner | Understands and can help teams apply a range of techniques to analyse data and provide insight. Is proactive and can present compelling findings that inform wider decisions. Starts to apply innovative approaches to resolve problems. |
3. Civil Service Success Profiles Framework
The Civil Service uses the Success Profiles Framework to assess candidates during recruitment.
It is a flexible framework, used to assess a range of experiences, abilities, strengths, behaviours and technical/professional skills required for different roles.
Find out more about Success Profiles.
4. Other roles in data science
There are 4 other role levels in data science: