Tag content with international language codes
Updated 16 March 2022
© Crown copyright 2022
This publication is licensed under the terms of the Open Government Licence v3.0 except where otherwise stated. To view this licence, visit nationalarchives.gov.uk/doc/open-government-licence/version/3 or write to the Information Policy Team, The National Archives, Kew, London TW9 4DU, or email: psi@nationalarchives.gov.uk.
Where we have identified any third party copyright information you will need to obtain permission from the copyright holders concerned.
This publication is available at https://www.gov.uk/government/publications/open-standards-for-government/language-tags
Use the ISO 639-1:2002 language codes standard to add consistent, internationally recognised language codes to your data.
1. Summary of the standard’s use for government
The ISO 639-1 standard uses 2 letter codes to represent the names of more than 180 internationally recognised languages. It does not represent languages that are exclusively for machines. Use this standard to make sure you reference languages in a consistent way across your datasets.
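As a rough illustration of referencing languages consistently in a dataset, the following Python sketch normalises a value and checks it against a small sample of ISO 639-1 codes. The names `SAMPLE_CODES` and `normalise_language_code` are hypothetical, and the table is deliberately incomplete. A real check would use the full published code list.

```python
# Illustrative subset only: a real check would load the full ISO 639-1 list.
SAMPLE_CODES = {
    "cy": "Welsh",
    "de": "German",
    "en": "English",
    "fr": "French",
    "gd": "Scottish Gaelic",
}


def normalise_language_code(raw: str) -> str:
    """Return the lowercase 2 letter code, or raise ValueError if it is not usable."""
    code = raw.strip().lower()
    # Reject anything that is not exactly two ASCII letters.
    if len(code) != 2 or not (code.isascii() and code.isalpha()):
        raise ValueError(f"not a 2 letter code: {raw!r}")
    # Reject codes that are well formed but missing from the sample table.
    if code not in SAMPLE_CODES:
        raise ValueError(f"code not in the sample list: {raw!r}")
    return code
```

Storing only the normalised form means values such as `CY`, `cy` and ` cy ` end up as the same tag.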
The government chooses standards using the open standards approval process and the Open Standards Board has final approval. Read more about the process for language codes.
2. How this standard meets user needs
Use this standard for consistent language tagging. For example, consistent tags help different systems correctly identify languages when they exchange text using cross-platform character encoding.
When systems can consistently identify which languages you’ve used, it can help users who:
- want to trade with UK businesses
- plan to travel to the UK from abroad
- live in the UK but do not speak English
The government also has a legal requirement to translate information into Welsh. This standard provides a consistent way of tagging this information so it’s easier to find.
Using this standard means:
- users can find information in the language they need
- services and content have consistent language tags
- screen readers can identify which language the content is in
3. How to use the standard
When you’re publishing content in multiple languages you must use this standard’s 2 letter codes in the tags or metadata.
You must use language tags in the relevant HTML and XML document metadata.
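One way to check that an HTML document carries language tags is sketched below, using only the Python standard library. The `LangCollector` class and the sample page are assumptions made for this example, not part of the standard or of any GOV.UK tooling.

```python
from html.parser import HTMLParser


class LangCollector(HTMLParser):
    """Record every (tag, lang value) pair found in an HTML document."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs with lowercased names.
        for name, value in attrs:
            if name == "lang":
                self.found.append((tag, value))


# Hypothetical page: English overall, with one passage in Welsh.
PAGE = """<!DOCTYPE html>
<html lang="en">
  <head><title>Example</title></head>
  <body>
    <p>Welcome</p>
    <p lang="cy">Croeso</p>
  </body>
</html>"""

collector = LangCollector()
collector.feed(PAGE)
print(collector.found)
```

The first entry shows the language declared for the whole page, and later entries show passages that override it.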
This standard does not cover:
- standard methods for attaching language tags to other formats, such as JSON
- methods of presenting a user with text, in particular, HTTP language negotiation or the URL suffix scheme currently used by GOV.UK
Use the World Wide Web Consortium (W3C) guidance to:
- show you how to annotate language on the web, for both HTML and XML formats
- declare the language of a web page or a portion of a web page using the HTML ‘lang’ attribute
- declare the language of a body of text using the ‘xml:lang’ attribute
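For XML, the effect of the ‘xml:lang’ attribute can be sketched in Python with the standard library. The sample document and the `effective_languages` helper are hypothetical. The sketch assumes that an element without its own ‘xml:lang’ takes the language of its nearest ancestor that has one.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the reserved 'xml:' prefix under this namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical document: English by default, with one Welsh title.
DOC = """<notices xml:lang="en">
  <title>Public notice</title>
  <title xml:lang="cy">Hysbysiad cyhoeddus</title>
</notices>"""


def effective_languages(element, inherited=None):
    """Yield (tag, text, language), inheriting xml:lang from ancestors."""
    language = element.get(XML_LANG, inherited)
    # Skip elements whose text is empty or only whitespace.
    if element.text and element.text.strip():
        yield element.tag, element.text.strip(), language
    for child in element:
        yield from effective_languages(child, language)


root = ET.fromstring(DOC)
for tag, text, language in effective_languages(root):
    print(language, text)
```

Declaring the language once on the root element and overriding it only where the language changes keeps the markup short.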
You can also get a list of:
- 2 and 3 letter language tags on the Internet Assigned Numbers Authority (IANA) website
- extended language tags with script subtags in RFC 5646, if you need to add information to the language tag
- tags and their country names on the US Library of Congress website
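To show how an extended tag with a script subtag is put together, here is a simplified Python sketch that splits a tag into language, script and region. The `split_language_tag` function and its pattern are assumptions for illustration. RFC 5646 also defines variants, extensions and private use subtags, which this sketch deliberately rejects.

```python
import re

# Simplified pattern: language, optional 4 letter script, optional region.
# This is NOT a full RFC 5646 parser.
TAG_PATTERN = re.compile(
    r"(?P<language>[A-Za-z]{2,3})"
    r"(?:-(?P<script>[A-Za-z]{4}))?"
    r"(?:-(?P<region>[A-Za-z]{2}|[0-9]{3}))?"
)


def split_language_tag(tag: str) -> dict:
    """Split a simple tag and apply the conventional casing to each subtag."""
    match = TAG_PATTERN.fullmatch(tag)
    if match is None:
        raise ValueError(f"unsupported or malformed tag: {tag!r}")
    parts = match.groupdict()
    return {
        # Convention: language lowercase, script title case, region uppercase.
        "language": parts["language"].lower(),
        "script": parts["script"].title() if parts["script"] else None,
        "region": parts["region"].upper() if parts["region"] else None,
    }


print(split_language_tag("sr-latn"))     # Serbian written in Latin script
print(split_language_tag("zh-Hant-TW"))  # Chinese, Traditional script, Taiwan
```

A plain 2 letter code such as `cy` passes through with no script or region, so existing ISO 639-1 tags stay valid.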