Digital and technology skills

Operations engineering

Running production systems and helping development teams build secure software that's easy to operate and scale.

Operations engineering involves expertise in areas such as infrastructure, configuration management, monitoring, deployment, operating systems and end-user device management.

Some relevant roles: operations engineers, systems administrators, web operations engineers, technical architects

Digital impact on ops

This involves understanding:

  • how operations engineering supports front-end digital services
  • how digital skills in the Essentials for digital specialists skills group apply to operations (eg being familiar with the government digital transformation agenda, user-centred design, agile delivery and open standards)

Hosting and cloud

Understanding hosting and cloud technologies

Having a strong understanding of all fundamental elements of hosting and cloud technologies, including:

  • design
  • planning
  • security and compliance
  • integration
  • provisioning
  • cloud storage
  • virtualisation

Choosing a cloud hosting service

This involves understanding:

  • cloud hosting services and the types available, in particular Platform, Infrastructure and Software as a Service (PaaS, IaaS and SaaS)
  • the government’s ‘Cloud First’ policy
  • why PaaS, IaaS and SaaS should be considered before other kinds of solutions

Maintaining cloud services

Creating cloud service environments

Providing, building and deploying SaaS, IaaS, and PaaS environments.

Managing cloud services

Understanding how to manage the capacity of the services and how capacity decisions affect cost.
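The link between capacity and cost can be illustrated with a small calculation. This is a minimal sketch in Python: the traffic figures, the 25% headroom, the hourly rate and the 730-hour month are all assumptions chosen for illustration, not real supplier prices.

```python
import math


def instances_needed(peak_rps: float, rps_per_instance: float, headroom: float = 0.25) -> int:
    """Instances required to serve peak traffic with some spare headroom.

    All inputs are hypothetical planning figures, not measurements.
    """
    if rps_per_instance <= 0:
        raise ValueError("rps_per_instance must be positive")
    return math.ceil(peak_rps * (1 + headroom) / rps_per_instance)


def monthly_cost(instances: int, hourly_rate: float, hours: int = 730) -> float:
    """Approximate monthly cost of running the instances all the time."""
    return round(instances * hourly_rate * hours, 2)


# Example: 900 requests per second at peak, 200 per instance, 0.10 per hour.
count = instances_needed(peak_rps=900, rps_per_instance=200)
print(count, monthly_cost(count, hourly_rate=0.10))
```

Changing the headroom or the per-instance throughput shows how quickly a capacity decision feeds through to the monthly bill.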

Deployment pipelines

Understanding the end-to-end deployment pipeline: how it works and how its elements work together. This has implications for configuration management and for automating the build, test and release processes.
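One way to picture the pipeline is as an ordered list of stages, each of which must pass before the next runs. The sketch below is illustrative only: the stage names and the `run_pipeline` helper are hypothetical, and a real pipeline would call build tools and test runners rather than simple functions.

```python
from typing import Callable, List, Tuple

# A stage is a name plus a step that returns True on success.
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> List[str]:
    """Run stages in order, stop at the first failure, return the stages that passed."""
    passed = []
    for name, step in stages:
        if not step():
            break  # later stages never run once one stage fails
        passed.append(name)
    return passed


# Hypothetical stages: the test stage fails, so release is never attempted.
stages = [
    ("build", lambda: True),
    ("test", lambda: False),
    ("release", lambda: True),
]
print(run_pipeline(stages))
```

Because a failure stops everything after it, a problem in the build or test stage cannot reach release.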

Automated deployment

This involves:

  • understanding the deployment process through which code goes from the version control system to production
  • automating that deployment process
  • using and promoting the ‘little and often’ principle of deployment

Automating deployment requires you to understand the end-to-end deployment process in full. It also means that code is tested and bugs are fixed before release, so that releases become frequent, low-risk and almost boring events.

Configuration management

Maintaining environmental consistency

Recognising the importance of maintaining consistency between development and deployment environments.

Using consistent configuration tools

Using the same configuration management tools for the deployment and production environments, so that a version that works in test does not then fail in production.
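The kind of mismatch this guards against can be shown by comparing the settings of two environments. The following is a rough sketch: the `config_drift` function and the example settings are hypothetical, and configuration management tools perform this kind of comparison in far more depth.

```python
MISSING = "<missing>"  # marker for a setting that one environment lacks


def config_drift(first: dict, second: dict) -> dict:
    """Return settings whose values differ, as {key: (first_value, second_value)}."""
    drift = {}
    for key in sorted(set(first) | set(second)):
        left = first.get(key, MISSING)
        right = second.get(key, MISSING)
        if left != right:
            drift[key] = (left, right)
    return drift


# Hypothetical settings for two environments.
test_env = {"web_server": "1.24", "timezone": "UTC", "cache": "enabled"}
production_env = {"web_server": "1.22", "timezone": "UTC"}
print(config_drift(test_env, production_env))
```

An empty result means the two environments agree on every setting that was compared.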

Considering open source configuration tools

Considering the use of open source configuration management tools (eg CFEngine, Chef, Puppet).

Creating flexible systems

Breaking down restrictive manual processes (eg over-restrictive change management) in order to build agile and flexible software systems.


Planning for the transition of services between environments and/or suppliers and acting on that plan.

Service integration

This involves:

Setting up a shared sandbox

Setting up a shared sandbox testing environment as part of the deployment pipeline. This ensures that everyone working on the design, development or maintenance of a service has a clear, easily accessible place to review the latest version of the software.

Load testing

Conducting load testing, including simulating certain types of Denial of Service attack (eg Distributed Denial of Service attacks), so you can ensure that sites and applications work under realistic load (traffic) conditions.
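The basic shape of a load test is many concurrent requests with the response times recorded. The sketch below uses only the Python standard library and a stand-in `fake_request` function in place of a real HTTP call, so every name and figure in it is an assumption. Dedicated load testing tools should be used against real services, and only with permission.

```python
import math
import time
from concurrent.futures import ThreadPoolExecutor


def fake_request() -> float:
    """Stand-in for a real request; returns the elapsed time in seconds."""
    start = time.perf_counter()
    time.sleep(0.01)  # simulated server work
    return time.perf_counter() - start


def load_test(request, total: int, concurrency: int) -> dict:
    """Issue `total` requests using `concurrency` worker threads and summarise latency."""
    if total <= 0 or concurrency <= 0:
        raise ValueError("total and concurrency must be positive")
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = sorted(pool.map(lambda _: request(), range(total)))
    # Nearest-rank 95th percentile of the sorted latencies.
    p95_index = max(0, math.ceil(0.95 * len(latencies)) - 1)
    return {
        "requests": len(latencies),
        "p95_seconds": latencies[p95_index],
        "max_seconds": latencies[-1],
    }


print(load_test(fake_request, total=20, concurrency=5))
```

Raising `concurrency` while watching the 95th percentile gives a first impression of how a service behaves as traffic grows.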

Commit stage

This involves:

  • checking into a version control system
  • understanding and setting up checks at the commit stage that catch compile errors and test failures (this ensures that code is ready to be released to the shared sandbox environment)
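The two checks named above, compile errors and test failures, can be sketched as a single gate. This is an illustration under stated assumptions: `commit_stage` and the example tests are hypothetical, the code under test is Python source held in a string, and a real commit stage would run on an isolated build agent triggered by the version control system.

```python
def commit_stage(source: str, tests: dict) -> list:
    """Return a list of problems; an empty list means the commit can move on."""
    try:
        code = compile(source, "<commit>", "exec")
    except SyntaxError as err:
        # A compile error stops the stage before any test runs.
        return [f"compile error: {err.msg} (line {err.lineno})"]

    namespace = {}
    exec(code, namespace)  # only acceptable here because the source is a trusted example

    problems = []
    for name, check in tests.items():
        try:
            passed = check(namespace)
        except Exception:
            passed = False  # a test that raises counts as a failure
        if not passed:
            problems.append(f"test failed: {name}")
    return problems


# Hypothetical commit and tests.
source = "def add(a, b):\n    return a + b\n"
tests = {"adds two numbers": lambda ns: ns["add"](2, 3) == 5}
print(commit_stage(source, tests))
```

Only a commit that returns no problems would be passed on to the shared sandbox environment.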

Service capability reviews

Carrying out service capability reviews to ensure that services are meeting their key performance indicators, for example for response times and availability.
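A review of this kind comes down to comparing measured figures with agreed targets. The sketch below assumes two hypothetical indicators, availability and 95th percentile response time, with made-up targets and measurements. The indicator names and thresholds are not taken from any government standard.

```python
def availability_percent(total_minutes: int, downtime_minutes: int) -> float:
    """Percentage of the period during which the service was available."""
    if total_minutes <= 0:
        raise ValueError("total_minutes must be positive")
    return 100.0 * (total_minutes - downtime_minutes) / total_minutes


def review_kpis(measured: dict, targets: dict) -> dict:
    """Map each indicator to True if its target was met.

    Each target is (kind, threshold): "min" means the measured value must be
    at least the threshold, "max" means it must be at most the threshold.
    """
    results = {}
    for name, (kind, threshold) in targets.items():
        value = measured[name]
        results[name] = value >= threshold if kind == "min" else value <= threshold
    return results


# Hypothetical 30-day period (43,200 minutes) with 30 minutes of downtime.
measured = {
    "availability_percent": availability_percent(43200, 30),
    "p95_response_ms": 620,
}
targets = {
    "availability_percent": ("min", 99.9),
    "p95_response_ms": ("max", 500),
}
print(review_kpis(measured, targets))
```

In this example the availability target is met but the response time target is not, which is the kind of finding a review would follow up.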

Matching user needs to devices

This involves:

  • articulating user needs in relation to end-user devices
  • having the technical understanding of a variety of products in order to match the needs of users to a range of appropriate devices

Email and collaboration platforms

This involves knowing:

  • the range of email and collaboration platforms available to government (eg Google Apps, Office 365 and Exchange)
  • the benefits/risks of each when choosing solutions

Telephony and data

Understanding changes in telephony and the market shift away from fixed lines towards WiFi and mobile technologies. This is particularly important in the context of enabling a more mobile civil service workforce.

Learning resources:

In its section on agile, the Service Design Manual includes a subsection on continuous delivery.

Computer Weekly published a useful article on how to set up development operations.

The Service Design Manual includes a description of the web operations skills necessary for developing secure, maintainable and available systems, as well as a web ops job description. It also assembles a range of user stories for web operations, which is a useful starting point for understanding the scope of infrastructure work, and lists a collection of guidance for operating a service.