Uptime and availability: keeping your service online

From: Technology community (web operations)

You must build and run your digital service in a way that means it’ll work for users when they need it.

This could mean you need it to be available 24 hours a day and 365 days a year.

Achieving the level of uptime you need involves good planning and design.

Design to maximise uptime

Your service should have more possible states than simply ‘on’ or ‘off’. For example you can:

design your components so that they fall back to minimal functions if something goes wrong
introduce a read-only mode where users can look at information but not change it
build in redundancy and avoid single points of failure (for example, having only one vendor could be a single point of failure)
use more than one web server and load-balance between them, so that one server failing doesn’t take the whole service down
use database systems which spread data and queries across a cluster, so that losing one node doesn’t take the database down
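
As a minimal sketch of a service with more states than ‘on’ or ‘off’, the Python below picks a mode from the health of two components and refuses changes while in read-only mode. The mode names, the two health checks (database_writable, search_available) and the status codes are assumptions made for illustration, not part of any particular framework.

```python
from enum import Enum


class Mode(Enum):
    FULL = "full"            # everything works
    READ_ONLY = "read_only"  # users can look at information but not change it
    MINIMAL = "minimal"      # only essential functions are offered


def choose_mode(database_writable: bool, search_available: bool) -> Mode:
    """Pick the richest mode that the current component health allows."""
    if not database_writable:
        return Mode.READ_ONLY
    if not search_available:
        return Mode.MINIMAL
    return Mode.FULL


def handle_request(method: str, mode: Mode) -> int:
    """Return an HTTP status code for a request made in the given mode."""
    is_write = method in {"POST", "PUT", "PATCH", "DELETE"}
    if is_write and mode is Mode.READ_ONLY:
        return 503  # refuse changes, but keep the service up for reading
    return 200


print(choose_mode(database_writable=False, search_available=True))
print(handle_request("GET", Mode.READ_ONLY))
```

A real service would probably also tell users which mode it is in, so they know why they can’t make changes.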

If your service relies on a third party service and this goes down, you can queue information and process it later.
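
One way to queue information while a third party is down might look like the sketch below. The DeferredSender class and the send function passed to it are hypothetical, and it assumes the third party signals failure by raising ConnectionError. The queue here is held in memory, so it would be lost on a restart; a real service would probably need a durable queue.

```python
from collections import deque
from typing import Callable, Deque


class DeferredSender:
    """Send items to a third party, queueing them while it is unavailable."""

    def __init__(self, send: Callable[[dict], None]) -> None:
        self._send = send
        self._pending: Deque[dict] = deque()

    def submit(self, item: dict) -> bool:
        """Try to send now. Returns True if sent, False if queued for later."""
        if self._pending:
            # Keep ordering: don't send ahead of items already waiting.
            self._pending.append(item)
            return False
        try:
            self._send(item)
            return True
        except ConnectionError:
            self._pending.append(item)
            return False

    def retry_pending(self) -> int:
        """Send queued items in order, stopping at the first failure."""
        sent = 0
        while self._pending:
            try:
                self._send(self._pending[0])
            except ConnectionError:
                break  # still down, so try again later
            self._pending.popleft()
            sent += 1
        return sent

    @property
    def pending_count(self) -> int:
        return len(self._pending)
```

You could call retry_pending on a timer, or when a health check shows the third party is back.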

Issues that can affect uptime

Underlying infrastructure availability

Your service’s availability is dependent on the availability of many systems, potentially with multiple suppliers. You may not have a relationship with all of these suppliers. This can become complicated.

Example

Your application is maintained by your development team. The application relies on an application server or database provided by another team in your department. The server or database runs on an infrastructure as a service platform provided via a contract with a commercial supplier. The infrastructure as a service platform relies on network connectivity and power from utilities that you have no direct contract with.

You should understand your dependencies, the things they depend on in turn, and the uptime each of them is expected to achieve.
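
To see why a chain of dependencies matters, the sketch below multiplies the availability of each layer to estimate the availability of the whole service. It assumes every layer must be up for the service to work and that failures are independent of each other. The availability figures are made up for illustration and are not taken from any real supplier.

```python
from math import prod


def combined_availability(availabilities) -> float:
    """Availability of a chain where every link must be up at once."""
    return prod(availabilities)


# Illustrative figures only (fractions of time each layer is available).
chain = {
    "application": 0.999,
    "application server or database": 0.999,
    "infrastructure as a service platform": 0.9995,
    "network connectivity and power": 0.9999,
}

overall = combined_availability(chain.values())
print(f"Overall availability: {overall:.2%}")
```

With these figures the overall result is about 99.74%, which is lower than any single layer on its own.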

Scheduled maintenance

Some services don’t count pre-arranged maintenance periods as downtime.

For example, a service could claim 100% uptime even though it shuts down every Monday evening for maintenance.

You shouldn’t hide uptime problems behind multiple maintenance periods. You can classify downtime as planned (scheduled maintenance) or unplanned (other problems), if your service genuinely needs scheduled maintenance.
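
The sketch below shows how the same week can produce 2 different uptime figures, depending on whether planned maintenance counts as downtime. The function name and the 2 hour maintenance window are assumptions for illustration; the Monday evening example does not give a duration.

```python
def uptime_figures(period_minutes: float,
                   planned_minutes: float,
                   unplanned_minutes: float) -> tuple:
    """Return (headline, actual) availability as fractions between 0 and 1.

    headline: planned maintenance is excluded from the measured period
    actual:   all downtime counts, whether planned or unplanned
    """
    scheduled_minutes = period_minutes - planned_minutes
    headline = (scheduled_minutes - unplanned_minutes) / scheduled_minutes
    actual = (period_minutes - planned_minutes - unplanned_minutes) / period_minutes
    return headline, actual


week = 7 * 24 * 60  # minutes in a week
headline, actual = uptime_figures(week, planned_minutes=120, unplanned_minutes=0)
print(f"Headline uptime: {headline:.2%}")
print(f"Actual uptime:   {actual:.2%}")
```

Under these assumptions the headline figure is 100%, while users could actually reach the service for about 98.81% of the week.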

Suppliers and contracts

You shouldn’t underestimate the impact that contracts you agree with suppliers (of products or services) can have on your service’s availability.

You need to fully understand the terms in any contracts you agree, for example:

service level agreements
uptime guarantees

Suppliers may miss uptime guarantees or service level agreement response times.

Although they may offer you money or service credits as compensation, you should consider whether this really offsets the effect of the downtime on your users.

If you’re regularly getting credits for uptime problems, consider whether you’re really getting the offered uptime or service level agreement response from your supplier.
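
One way to check whether you’re getting the offered uptime is to convert the guarantee into minutes of allowed downtime and compare it with the outages you’ve recorded. The sketch below assumes a 30 day measurement period and a guarantee expressed as a percentage; your contract may define both differently, and the function names are hypothetical.

```python
def allowed_downtime_minutes(uptime_percent: float, period_days: int = 30) -> float:
    """Minutes of downtime permitted by an uptime guarantee over the period."""
    period_minutes = period_days * 24 * 60
    return period_minutes * (100 - uptime_percent) / 100


def guarantee_met(uptime_percent: float, outage_minutes, period_days: int = 30) -> bool:
    """True if the recorded outages fit within the guaranteed uptime."""
    allowed = allowed_downtime_minutes(uptime_percent, period_days)
    return sum(outage_minutes) <= allowed


# A 99.9% guarantee over 30 days allows roughly 43 minutes of downtime.
print(round(allowed_downtime_minutes(99.9), 1))
print(guarantee_met(99.9, outage_minutes=[20, 30]))
```

Here 2 outages of 20 and 30 minutes add up to more than the allowance, so the guarantee would not have been met for that period.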

Decide on out-of-hours support

If your service fails outside of normal office hours, like evenings and weekends, it’ll be down for a long time unless you’ve got someone responsible for out-of-hours support.

Carry out user research to find out whether your users are likely to use your service outside normal office hours. If they are, you should:

put someone in your team on call to deal with any problems
have dedicated 24/7 support

Tell users about downtime

Deliberate downtime

You may decide that you can’t afford to guarantee your service will be up at all times. If you do this, tell your users when it’ll be down and explain why.

Unanticipated downtime

You should have a status page that you can update when there’s downtime you didn’t anticipate.

Case studies and examples

Find out what happens when something goes wrong on GOV.UK, from the Inside GOV.UK blog.

You may also find the Monitoring the status of your service guide useful.

Published 23 May 2016
