Guidance

API technical and data standards

The following web-based application programming interface (API) standards guidance will help your organisation deliver the best possible services to users.

Publish your APIs over the internet by default. Email Government Digital Service (GDS) if you think your APIs should not be published over public infrastructure.

Follow the Technology Code of Practice

To satisfy the requirements of the Technology Code of Practice, make sure your APIs:

  • follow the Open Standards Principles of open access, consensus-based open process and royalty-free licensing
  • scale so they can maintain service level objectives and agreements when demand increases
  • are stable so they can maintain service level objectives and agreements when changed or dealing with unexpected events
  • adhere to UK government security policies and guidelines

Use REST

Follow the industry standard and where appropriate build APIs that are RESTful, which use HTTP verb requests to manipulate data.

When responding to requests, you should use HTTP verbs for their specified purpose. You should use links or CURIEs (hypermedia) to refer to related resources. This makes it easier to find those resources. For example, you might return a “person” object which links to a resource representing their company in the following way:

{
  "name": "Bob Person",
  "company": "https://your.api/company/bobscompany"
}

JSON-LD is a specification which uses hypermedia to enable linked data in this way.

One of the advantages of REST is that it gives you a framework for communicating error states.
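
For example, you can map error states onto standard HTTP status codes. The following is a minimal sketch in Python, not a definitive implementation. The error names and the error_response helper are hypothetical and only there for illustration:

```python
import json
from http import HTTPStatus

# Hypothetical mapping of application error states to standard HTTP status codes.
ERROR_STATUS = {
    "not_found": HTTPStatus.NOT_FOUND,
    "invalid_request": HTTPStatus.BAD_REQUEST,
    "rate_limited": HTTPStatus.TOO_MANY_REQUESTS,
}


def error_response(error: str, detail: str) -> tuple[int, str]:
    """Return an (HTTP status code, JSON body) pair for an error state."""
    # Anything not in the mapping is reported as a generic server error.
    status = ERROR_STATUS.get(error, HTTPStatus.INTERNAL_SERVER_ERROR)
    body = json.dumps({"error": error, "detail": detail})
    return int(status), body
```

A client can then rely on the status code alone to tell a missing resource from a malformed request, and read the JSON body for detail.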

In some cases, it may not be applicable to build a REST API, for example, when you are building an API to stream data.

Use HTTPS

You should use HTTPS when creating APIs.

Adding HTTPS will secure connections to your API, preserve user privacy, ensure data integrity, and authenticate the server providing the API. The Service Manual provides more guidance on HTTPS.

Secure APIs using Transport Layer Security (TLS) v1.2. Do not use Secure Sockets Layer (SSL) or TLS v1.0.
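
For example, a client written in Python could refuse older protocol versions using the standard library ssl module. This is a sketch only, and the function name make_client_context is an assumption made for illustration:

```python
import ssl


def make_client_context() -> ssl.SSLContext:
    """Build a client-side TLS context that refuses anything older than TLS 1.2."""
    # create_default_context() verifies certificates and hostnames by default.
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context
```

You can pass the resulting context to standard library clients, for example urllib.request.urlopen(url, context=make_client_context()).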

There are multiple free and low-cost vendors that offer TLS certificates.

Make sure potential API users can establish trust in your certificates. Make sure you have a robust process for timely certificate renewal and revocation.

Use Uniform Resource Identifiers (URIs) to identify certain data

When your API returns data in response to an HTTP call, you should use URIs in the payload to identify certain data. Where appropriate, you should use CURIEs, for example, in Registers.

Use JSON

Your first choice for all web APIs should be JSON where possible.

Only use another representation to build something in exceptional cases, like when you:

  • need to connect to a legacy system, for example, one that only uses XML
  • will receive clear advantages from complying with a broadly adopted standard (for example, SAML)

We recommend you should:

  • create responses as a JSON object and not an array - arrays can limit the ability to include metadata about results and limit the API’s ability to add additional top-level keys in the future
  • avoid unpredictable object keys such as those derived from data as this adds friction for clients
  • use a consistent case for object keys - choose under_score or CamelCase and stick to it
  • document and implement a consistent ordering for JSON arrays to prevent errors caused by inaccurate assumptions
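
The recommendations above could look like the following sketch, which wraps results in a top-level object with metadata alongside them. The key names (results, result_count, total_available) are hypothetical and not part of any standard:

```python
import json


def build_response(results: list[dict], total: int) -> str:
    """Wrap results in a top-level JSON object so metadata can sit alongside them."""
    envelope = {
        "results": results,             # the array lives under a predictable key
        "result_count": len(results),   # metadata about this response
        "total_available": total,       # new top-level keys can be added later
    }
    return json.dumps(envelope)
```

Because the top level is an object, you can later add keys such as pagination links without breaking clients that only read results.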

Use Unicode for encoding

The Unicode Transformation Format (UTF-8) standard is mandatory for use in government when encoding text or other textual representations of data.
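
For example, a minimal Python sketch that serialises a JSON payload as UTF-8 bytes. The function name encode_payload is an assumption for illustration:

```python
import json


def encode_payload(payload: dict) -> bytes:
    """Serialise a payload to JSON and encode it as UTF-8 bytes."""
    # ensure_ascii=False keeps characters such as "â" as-is instead of \u escapes.
    return json.dumps(payload, ensure_ascii=False).encode("utf-8")
```

When serving the result, you would typically also declare the encoding, for example with a Content-Type header of application/json; charset=utf-8.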

Use of IP address whitelisting

You should not whitelist the IP addresses of the APIs you consume. This is because APIs may be provided using Content Delivery Networks (CDNs) and scalable load balancers, which rely on flexible, rapid allocation of IP addresses and sharing. Instead of whitelisting, you should use an HTTPS egress proxy.

When to authenticate your API

Authentication is required when you want to identify clients for the purposes of:

  • rate limiting/throttling
  • auditing
  • billing
  • authorisation

Your purpose will dictate the security requirements for your authentication solution. For example, if you need to identify users purely for rate limiting, you may not need to refresh user tokens very often as a token in the wrong hands will be unlikely to threaten your service.

Consider whether your API needs more than authentication of an organisation token, for example, when dealing with sensitive information such as medical data.

When publishing open data, do not require authentication. This maximises the use of your API.

Authorise users of your API

The authorisation server is usually managed by the organisation implementing the API. Use OAuth2, the open authorisation framework, to authorise the client application. This service gives each registered application an OAuth2 Bearer Token, which can be used to make API requests on the application’s own behalf (the token may or may not be tied to a specific individual).

There are significant benefits to the simplicity of OAuth2 tokens. However, there may be more sensitive cases where it would be appropriate to explore the use of JSON Web Tokens (JWTs).

When authorising a user to directly manipulate an API you may choose to use OpenID Connect (OIDC), which builds on top of OAuth2.

Good practice for tokens and permissions

When using OAuth2 for authorisation you should refresh access tokens regularly. Failure to do so can lead to vulnerabilities.

Make sure the tokens you provide have the narrowest permissions possible. Narrowing the permissions means there’s a much lower risk to your API if the tokens are lost by users or compromised.

Make sure your organisation can force a reissue of a token if there is a reason to suspect it has been compromised.
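
The following Python sketch shows one way a service might check that a token is still valid and carries the narrow permission a request needs. The AccessToken class, its fields and the scope names are hypothetical, and a real OAuth2 deployment would validate tokens through its authorisation server:

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AccessToken:
    """Hypothetical in-memory view of an issued token."""
    scopes: frozenset
    expires_at: datetime


def is_authorised(token: AccessToken, required_scope: str, now: datetime) -> bool:
    """Allow a request only if the token is unexpired and holds the required scope."""
    return now < token.expires_at and required_scope in token.scopes
```

Short expiry times force regular refreshes, and a token limited to a scope such as read:companies cannot be used to write data if it is compromised.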

Monitor unusual activity

Your API security is only as good as your day-to-day security processes.

Monitor APIs for unusual behaviour just like you’d closely monitor any website. Look out for changes in IP addresses or users using APIs at unusual times of the day.

Represent time and date

The government mandates using the ISO 8601 standard to represent date and time in your payload response. This helps people read the time correctly.

Use a consistent date format. For dates, this looks like 2017-08-09. For dates and times, use the form 2017-08-09T13:58:07Z.
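
For example, a short Python sketch that produces both forms. The function names are assumptions for illustration, and format_timestamp expects a timezone-aware datetime:

```python
from datetime import date, datetime, timezone


def format_date(value: date) -> str:
    """Format a date as ISO 8601, for example 2017-08-09."""
    return value.isoformat()


def format_timestamp(value: datetime) -> str:
    """Convert an aware datetime to UTC and format it with a trailing Z."""
    return value.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
```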

Represent a physical location

The European Union mandates using the ETRS89 standard for the geographical scope of Europe. You can also use WGS 84 or other coordinate reference systems (CRS) for European location data in addition to this.

You must use the World Geodetic System 1984 (WGS 84) standard for the rest of the world. You can also use other coordinate reference systems for the rest of the world in addition to this.

You should use GeoJSON for the exchange of location information.
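
As an illustration, the following Python sketch builds a GeoJSON Feature for a single point. The function name point_feature and the property names are hypothetical:

```python
def point_feature(longitude: float, latitude: float, properties: dict) -> dict:
    """Build a GeoJSON Feature with a Point geometry using WGS 84 coordinates."""
    return {
        "type": "Feature",
        "geometry": {
            "type": "Point",
            # GeoJSON orders coordinates as [longitude, latitude].
            "coordinates": [longitude, latitude],
        },
        "properties": properties,
    }
```

Note the coordinate order, which is the reverse of the latitude-first order many people expect.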

Iterate your API

When iterating your API to add new or improved functionality, you should minimise disruption for your users so that they don’t incur unnecessary costs.

To minimise disruption for users, you should:

  • make backwards compatible changes where possible - specify that parsers should ignore properties they don’t expect or understand, so that changes stay backwards compatible (this allows you to add fields to update functionality without requiring changes to the client application)
  • use a version number as part of the URL when making backwards incompatible changes
  • make a new endpoint available for significant changes
  • provide notices for deprecated endpoints
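
A client that ignores unexpected properties could be sketched in Python as follows. The KNOWN_FIELDS set and the parse_person function are hypothetical:

```python
import json

# Hypothetical set of fields this client version understands.
KNOWN_FIELDS = {"name", "company"}


def parse_person(raw: str) -> dict:
    """Keep only the fields this client understands and ignore anything else."""
    data = json.loads(raw)
    return {key: value for key, value in data.items() if key in KNOWN_FIELDS}
```

If the API later adds a field, this client keeps working without changes.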

When you need to make a backwards incompatible change

When you need to make a backwards incompatible change you should consider:

  • incrementing a version number in the URL (start with /v1/ and increment with whole numbers)
  • supporting both old and new endpoints in parallel for a suitable time period before discontinuing the old one
  • telling users of your API how to validate data, for example, let them know when a field is not going to be present so they can make sure their validation rules will treat that field as optional

Sometimes you’ll need to make a larger change and simplify a complex object structure by folding data from multiple objects together. In this case, make a new object available at a new endpoint, for example, combine data about users and accounts from /v1/users/123 and /v1/accounts/123 and produce /v1/consolidated-account/123.

Set clear deprecation policies

Set clear API deprecation policies so you’re not supporting old client applications forever.

State how long users have to upgrade, and how you’ll notify them of these deadlines. For example, at GDS, we usually contact developers directly but we also announce deprecation in HTTP responses using a ‘Warning’ header.
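
As a rough sketch, a ‘Warning’ header could be built in Python like this. It assumes the header syntax from RFC 7234 (a warn-code, an agent and a quoted message), uses code 299 for a miscellaneous persistent warning, and the function name deprecation_headers is hypothetical:

```python
def deprecation_headers(message: str) -> dict[str, str]:
    """Return a Warning header announcing that an endpoint is deprecated."""
    # Escape backslashes and double quotes so the message is a valid quoted string.
    escaped = message.replace("\\", "\\\\").replace('"', '\\"')
    # "-" stands in for the name of the agent adding the warning.
    return {"Warning": f'299 - "{escaped}"'}
```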

Performance testing and scalability

For highly cacheable open data access APIs, a well-configured Content Delivery Network (CDN) may provide sufficient scalability.

For APIs that don’t have those characteristics, you should set quota expectations for your users in terms of capacity and rate available. Start small, according to user needs, and respond to requests to increase capacity by making sure your API can meet the quotas you have set.

Make sure users can test your full API up to the limits you have set.

Enforce the quotas you have set, even when you have excess capacity. This makes sure that your users will get a consistent experience when you don’t have excess capacity, and will design and build to handle your API quota.
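
One simple way to enforce a quota is a fixed-window counter per client. The Python sketch below is an illustration under that assumption, not a recommended production design. The class name and parameters are hypothetical, and it never evicts old windows:

```python
import time
from collections import defaultdict


class FixedWindowQuota:
    """Allow each client at most `limit` requests per window of `window_seconds`."""

    def __init__(self, limit: int, window_seconds: int, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the behaviour can be tested
        self._counts = defaultdict(int)

    def allow(self, client_id: str) -> bool:
        """Record a request and return whether it is within the quota."""
        window = int(self.clock() // self.window_seconds)
        key = (client_id, window)
        if self._counts[key] >= self.limit:
            return False
        self._counts[key] += 1
        return True
```

The limit applies regardless of spare capacity, so users see the same behaviour whether or not the service is busy.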

As with user-facing services, you should test the capacity of your APIs in a fully representative environment to help make sure you can meet demand.

Compliance testing

You should provide your development team with the ability to test your API using sample test data, if applicable. Testing your API should not involve using production systems and production data.

API users should also be able to test your API using a sandbox.

Design data fields

When designing your data fields, you should consider how you may need to iterate them. For example, if you need to collect personal information as part of your dataset you may decide on the following payload:

{
  "person": {
    "name": "Alice Wonderland",
    "dob": "1999-01-01",
    "married": true
  }
}

In this case, before using the structure, you need to consider whether:

  • the design can cope with names from cultures which don’t have first and last names
  • the abbreviation “DOB” makes sense or whether it’s better to spell out the field to date of birth
  • DOB makes sense when combined with DOD (date of death) or DOJ (date of joining)

You should also make sure you provide all the relevant options. For example, the “married” field is a boolean, but there are likely to be more than 2 states you wish to record: “married”, “unmarried”, “divorced”, “widowed”, “estranged”, “annulled” and so on.
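
For example, you could replace the boolean with a field that takes one of a documented set of values. The following Python sketch is one possible shape. The MaritalStatus enumeration and the field names date_of_birth and marital_status are hypothetical, and the list of states is not exhaustive:

```python
from enum import Enum


class MaritalStatus(Enum):
    """Hypothetical set of allowed values, which can grow over time."""
    MARRIED = "married"
    UNMARRIED = "unmarried"
    DIVORCED = "divorced"
    WIDOWED = "widowed"


person = {
    "name": "Alice Wonderland",
    "date_of_birth": "1999-01-01",
    "marital_status": MaritalStatus.MARRIED.value,
}
```

Adding a new state later is then a backwards compatible change, which a boolean does not allow.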

Respond to data requests

Configure APIs to respond to ‘requests’ for data rather than ‘sending’ or ‘pushing’ data. This makes sure the API user only receives the information they require.

When responding, your API must answer the request fully and specifically. For example, an API should respond to the request “is this user a UK citizen?” with a boolean. The answer should not return any more details than is required and should rely on the client application to correctly interpret it.
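
A minimal Python sketch of this idea follows. The record structure, the uk_citizen field and the function name are assumptions made for illustration:

```python
def citizenship_response(record: dict) -> dict:
    """Answer the citizenship question with a boolean and nothing else."""
    # Other fields in the record, such as name or address, are deliberately not returned.
    return {"is_uk_citizen": bool(record["uk_citizen"])}
```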

Download a whole dataset in bulk

You should allow users to download whole datasets unless they contain classified information. This gives users:

  • the ability to analyse the dataset locally
  • support when performing a task that requires access to the whole dataset (for example, plotting a graph on school catchment areas in England)

Make data available in CSV formats as well as JSON when you want to publish bulk data. This makes sure users can use a wide range of tools, including off-the-shelf software, to import and analyse this data.
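
For example, the same records you serve as JSON could be written out as CSV with the Python standard library. This is a sketch, and the function name to_csv and the field names are hypothetical:

```python
import csv
import io


def to_csv(records: list[dict], fieldnames: list[str]) -> str:
    """Write records as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()
```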

Publish bulk data on data.gov.uk and ensure there is a prominent link to it.

Users should be able to index their local copy of data using their choice of database technology and then perform a query to meet their needs. This means that future API downtime won’t affect them because they already have all the data they need.

Using a record-by-record data API query to perform the same action would be suboptimal, both for the user and for the API. This is because:

  • rate limits would slow down access, or may even stop the whole dataset from downloading entirely
  • if the dataset is being updated at the same time as the record-by-record download, users may get inconsistent records

Keep your local dataset copy up to date

Don’t encourage users to keep large datasets up to date by re-downloading them because this approach is wasteful and impractical. Instead, let users download incremental lists of changes to a dataset. This allows them to keep their own local copy up to date and saves them having to re-download the whole dataset repeatedly.

There isn’t a recommended standard for this pattern, so users can try different approaches such as:

  • encoding data in Atom/RSS feeds
  • using emergent patterns, such as event streams used by products such as Apache Kafka
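
Whichever approach is used, the client side of the pattern might look like the following Python sketch. It assumes a hypothetical change list in which every entry has a sequence number, an action, a record id and, for updates, the new record:

```python
def changes_since(change_log: list[dict], last_seen: int) -> list[dict]:
    """Return changes newer than last_seen, oldest first."""
    newer = [change for change in change_log if change["sequence"] > last_seen]
    return sorted(newer, key=lambda change: change["sequence"])


def apply_changes(local: dict, changes: list[dict]) -> dict:
    """Apply changes in order to a local copy keyed by record id."""
    for change in changes:
        if change["action"] == "delete":
            local.pop(change["id"], None)
        else:
            local[change["id"]] = change["record"]
    return local
```

The user stores the highest sequence number they have applied and asks only for later changes next time.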

Audit logging of requests for personal data

If your API serves personal or sensitive data, you must log when the data was provided and to whom. This will help you respond to data subject access requests and help you detect fraud or misuse.
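
A minimal Python sketch of such a log entry follows. The logger name, field names and functions are hypothetical, and a real service would also need to protect and retain the log appropriately:

```python
import json
import logging
from datetime import datetime, timezone

audit_logger = logging.getLogger("api.audit")


def audit_entry(client_id: str, resource: str, when: datetime) -> str:
    """Describe who was given which resource, and when, as a JSON string."""
    return json.dumps(
        {
            "client_id": client_id,
            "resource": resource,
            "provided_at": when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        },
        sort_keys=True,
    )


def log_access(client_id: str, resource: str) -> None:
    """Write an audit entry for data provided now."""
    audit_logger.info(audit_entry(client_id, resource, datetime.now(timezone.utc)))
```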

Document your API

To document your API start by:

  • using the OpenAPI Specification where appropriate for generating documentation
  • providing sample code
  • explaining how developers can use your API

You should also include:

You should always make sure your documentation is clear, and communicate when changes are made.

Published 7 February 2018