Guidance

Core Elements of AI

Published 23 October 2020

1. Data Element

We must have access to the right data, for both development and operational use of any AI. This could come from any source, from our own systems and sensors to large public datasets, each with different implications for privacy and security, and different ownership and usage rights. The one thing to remember about data is that it is more difficult than most people realise, and difficult in many different ways. So handling data generally takes longer than you think, and then some. But without data you can’t do anything – it’s that important.

Here are some of the questions you need to ask and things to consider:

1.1 Data requirements and availability

  • What data do we need throughout the lifecycle, from development and training (of the AI) through to operational use?

  • Does that data exist in a usable format? Quite frequently the answer is either a ‘no’, or a ‘yes, but …’

  • If the data isn’t right, are we able to wrangle it into shape?

  • Most importantly, do we understand the data, both structurally and semantically? You’ll often find the data is poorly documented and you have to rely on finding someone who really understands it – this in itself can be a challenge. A quick profiling pass, as sketched below, is a sensible first step.
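
One way to start answering these questions is a short profiling pass over whatever data you can get hold of. The sketch below is a minimal example in Python with pandas; the file name and its contents are hypothetical placeholders.

```python
# A minimal first-look profiling pass with pandas. The file name below is a
# hypothetical placeholder for whatever export you have been given.
import pandas as pd

df = pd.read_csv("sensor_log.csv")   # hypothetical data export

print(df.shape)                      # how much data is there, really?
print(df.dtypes)                     # are the types what we expect?
print(df.isna().mean().sort_values(ascending=False))  # missingness per column
print(df.describe(include="all").T)  # ranges and categories: do they make sense?

# Duplicates and obviously impossible values are common surprises.
print("duplicate rows:", df.duplicated().sum())
```

None of this replaces finding someone who really understands the data, but it quickly shows whether the answer is a ‘no’, a ‘yes’, or a ‘yes, but …’.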

1.2 Data collection

If the data really isn’t available, there are only three options:

  1. Give up – Although this is the easiest option, it’s not guaranteed to fill the end user with joy. Though if collecting data is very hard or costly, then maybe this is the right answer.

  2. Collect the data – An option if you can, and a major task in itself, but at least you will have a good understanding of the data at the end. Or at least you think you will – it’s worth checking that you do.

  3. Generate synthetic data – This might be a good option if you really understand the data. It is challenging, although more and more tools are being created to help with this process (a minimal sketch of the mechanics follows this list).
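
The sketch below shows the mechanics of generating a small synthetic tabular dataset with scikit-learn. It is only an illustration: useful synthetic data requires a good model of the real data’s structure and correlations, which this toy generator does not have.

```python
# A minimal sketch of synthetic tabular data generation using scikit-learn.
# The feature count, class weights and column names are illustrative choices.
import pandas as pd
from sklearn.datasets import make_classification

X, y = make_classification(
    n_samples=5_000,      # how much data we wish we had
    n_features=10,        # stand-ins for real sensor or record fields
    n_informative=6,
    weights=[0.9, 0.1],   # deliberately imbalanced, like many real problems
    random_state=0,
)
synthetic = pd.DataFrame(X, columns=[f"feature_{i}" for i in range(10)])
synthetic["label"] = y
print(synthetic["label"].value_counts(normalize=True))
```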

1.3 Data labelling

A lot of AI algorithms require quality labelled training data to learn what to look for (this approach is known as supervised learning – see the Biscuit Book on AI, data science and machine learning for more detail). A small worked example of learning from labelled data follows the questions below.

So once you’ve got your hands on some data, you may need to think about how to label this data. Questions to think about:

  • What labels do you want to use?
  • How difficult and time-consuming will this be?
  • Who has the right expertise to be able to label data?
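
To make the role of labels concrete, here is a minimal supervised-learning example. It uses a built-in toy dataset where the labels come for free; in practice the labels are usually the expensive part, and their quality limits what the model can learn.

```python
# Minimal supervised learning: labelled data in, a predictive model out.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)  # y is the labelled 'answer' column
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("held-out accuracy:", accuracy_score(y_test, model.predict(X_test)))
```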

1.4 Representative and balanced datasets

  • Is the data sufficiently representative of all cases where we will want to use the AI or Autonomy? In other words, is the data good enough for our purpose?

  • You need to think about how data may change over time. Is a training data set that was collected in one context or period of time going to be representative of the data a system might encounter when it is used in the future?
  • Do we understand where bias exists in our data, whether it matters and, if so, how we can mitigate it? The first thing to recognise is that all data will be biased in some way, so we should understand what the biases are, whether they matter and, if they do, what can be done about them. Ignoring them is a common tactic (not to be recommended!). A quick check of class balance and distribution drift, as sketched below, is a sensible starting point.
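
The sketch below shows two quick checks: how balanced the classes are, and whether a feature’s distribution has drifted between training data and operational data. The DataFrames and column names are hypothetical, and real bias assessment goes well beyond this.

```python
# Quick checks for class balance and distribution drift. The 'train_df' and
# 'live_df' DataFrames and the 'label' / 'speed' columns are hypothetical.
import pandas as pd
from scipy.stats import ks_2samp

def balance_report(df: pd.DataFrame, label_col: str = "label") -> pd.Series:
    """Proportion of each class: a large skew may need re-sampling or re-weighting."""
    return df[label_col].value_counts(normalize=True)

def drift_pvalue(train: pd.Series, operational: pd.Series) -> float:
    """Kolmogorov-Smirnov test: a small p-value suggests the feature's
    distribution has shifted between training and operational use."""
    return ks_2samp(train, operational).pvalue

# Usage, given hypothetical data:
# print(balance_report(train_df))
# print(drift_pvalue(train_df["speed"], live_df["speed"]))
```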

1.5 Data ownership and access

  • Do we have appropriate arrangements to access and use the data? An amazingly frequent blocker. Even if the data exists and is what you need, are you allowed to get at it? Don’t assume you can, or that it won’t take time to acquire.

1.6 Confidential and private data

  • Can we handle sensitive data securely? The essential thing here is having the right methods and secure storage (a minimal sketch of encryption at rest follows). Again, this can take time to set up.
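
One small piece of ‘the right methods’ is encrypting data at rest. The sketch below uses the Python cryptography package’s Fernet interface; key management, access control and accreditation are the genuinely hard parts and are not shown.

```python
# A minimal sketch of symmetric encryption at rest with the 'cryptography'
# package. How the key is generated, stored and controlled is the hard part.
from cryptography.fernet import Fernet

key = Fernet.generate_key()             # in practice, protect and manage this key properly
f = Fernet(key)

token = f.encrypt(b"sensitive record")  # ciphertext that is safer to store
print(f.decrypt(token))                 # original bytes back, given the key
```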

2. Algorithms Element

We must have appropriate algorithms (the approach a computer takes) to do what we want to do – the goal if you like. They determine what needs to be done, with what and when, and they must be right for the task. This is easy to state but not always easy to prove. Algorithms must be sufficiently robust, and able to deal with the complexity and constraints of the environment in which they operate – what works well in one environment might be terrible in another – you can’t assume anything.

Here are some questions you’ll need to ask:

2.1 Algorithm selection

  • Do we have appropriate algorithms and techniques to solve this problem? Do we understand how the algorithm works and its limitations?

  • This could be seen as a trick question: fully understanding how a machine learning algorithm works is near impossible. So really this is about understanding enough about how the algorithm works, and the situations it is likely to be suited for and those it is not. Comparing a simple, well-understood baseline against a more complex candidate, as sketched below, is one practical way to build that understanding.
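
A practical habit is to benchmark a simple, interpretable baseline against the more complex algorithm you are considering. The dataset and models below are illustrative stand-ins; the point is the comparison, not the numbers.

```python
# Compare a simple baseline against a more complex model with cross-validation
# before committing to the complex one. Dataset and models are illustrative.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

for name, model in [
    ("logistic regression (simple baseline)", LogisticRegression(max_iter=5000)),
    ("random forest (more complex)", RandomForestClassifier(n_estimators=200)),
]:
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: mean accuracy {scores.mean():.3f}")
```

If the complex model barely beats the baseline, the extra compute, data and explainability costs may not be worth paying.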

2.2 Implications on, and constraints applied by, other Building Blocks

  • What is the impact of algorithm choice on the other Building Blocks, and what constraints do they impose? Examples are:

    • Do we understand the trade-offs between algorithm complexity and available compute power: does the compute platform have enough grunt to support the algorithm’s demands? (A quick check of model size and prediction latency, as sketched after this list, can make this concrete.)
    • What about the trade-off with data: some algorithms may offer high performance but require extensive (and therefore costly and time-consuming!) data labelling efforts.
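
The sketch below is one rough way to make the compute trade-off concrete: train two candidate models, then compare how much space each takes when serialised and how long each takes to score a batch of data. The dataset and models are stand-ins, and any numbers only mean something on hardware representative of the operational platform.

```python
# Rough model size and scoring latency comparison. Only indicative: run it on
# hardware representative of the operational platform.
import pickle
import time

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)

for name, model in [("logistic regression", LogisticRegression(max_iter=5000)),
                    ("random forest", RandomForestClassifier(n_estimators=500))]:
    model.fit(X, y)
    size_kb = len(pickle.dumps(model)) / 1024
    start = time.perf_counter()
    model.predict(X)
    latency_ms = (time.perf_counter() - start) * 1000
    print(f"{name}: ~{size_kb:.0f} KB serialised, {latency_ms:.1f} ms to score {len(X)} rows")
```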

2.3 Robustness

  • Are the algorithms and techniques able to deal with errors or noise in the input data? Are they likely to crash or go wrong for other reasons? One simple probe, sketched below, is to add noise to test inputs and watch how quickly performance degrades.
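
The sketch below adds increasing amounts of random noise to held-out test data and reports the accuracy at each level. A graceful decline is reassuring; a cliff edge is a warning sign. The dataset, model and noise levels are illustrative choices.

```python
# A simple robustness probe: perturb test inputs with Gaussian noise scaled to
# each feature's spread, and watch how quickly accuracy falls away.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = LogisticRegression(max_iter=5000).fit(X_train, y_train)

rng = np.random.default_rng(0)
for noise_level in [0.0, 0.05, 0.1, 0.2, 0.5]:
    noisy = X_test + rng.normal(0, noise_level * X_test.std(axis=0), X_test.shape)
    print(f"noise {noise_level:.2f}: accuracy {model.score(noisy, y_test):.3f}")
```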

2.4 Usage rights and licensing

  • Do we understand our legal usage rights where we wish to use algorithms produced by others? This is important because more often than not you will not be developing a total solution but utilising code and algorithms developed by others (including Open Source) – the question is whether their licences permit your intended use.

3. Platforms Element

The platform comprises the hardware and software where the AI is developed, and where it is then run. These are usually different platforms – it’s not a sensible approach to conduct development on live systems! – so we need to think about how to move a useful AI application from development to live.

These platforms could be completely standalone, but more often will be part of an interconnected system. We therefore need to ensure the design of the platform is appropriate, with technical and contractual arrangements to enable timely and cost-effective integration and updates, and with sufficient data storage, compute capability and connectivity where required.

Some questions you should ask are:

3.1 Operational platform

  • Where are the AI and its data physically hosted; who is providing the hardware platform?
  • Do we have the right technical and contractual arrangements to deploy, maintain and upgrade through life?

3.2 Development platform

  • What development platform will we use?
  • Is this accessible to all relevant parties (government, industry and academic partners, international partners, etc.)?
  • What is the route from development to deployment? (And how many of us never think about that until it’s too late?) A minimal sketch of one step on that route follows this list.
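
One small but essential step on that route is getting a trained model out of the development environment and into something the operational platform can load. The sketch below uses joblib to serialise and reload a model; the file name is hypothetical, and a real route also needs version control, testing, security review and monitoring around it.

```python
# Train in one environment, serialise the model, load and score it in another.
# The file name is a hypothetical artefact handed to the operational platform.
import joblib
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# --- development environment ---
X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000).fit(X, y)
joblib.dump(model, "model_v1.joblib")

# --- operational environment ---
deployed = joblib.load("model_v1.joblib")
print(deployed.predict(X[:5]))   # same interface, different environment
```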

3.3 Architectures and standards

  • What architecture will be used, and is it open?
  • What interfaces will we use?
  • Can we re-use system components across the organisation?

3.4 Connectivity and compute

  • Is there sufficient compute power, in the right places, or can we access it remotely?
  • How much data needs to move around the system, and is that achievable? (A back-of-the-envelope check, as sketched after this list, is often enough to spot a problem.)
  • Have we considered the merits of AI at the edge (for example, a drone might have enough onboard computing power to act completely independently) versus centralised systems, or novel alternatives?
  • Is the connectivity and compute robust and resilient? What happens if the wifi goes down!?
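
A back-of-the-envelope calculation is often all it takes to see whether raw data can realistically be moved to a central system or whether processing has to happen at the edge. Every figure below is a made-up assumption; substitute your own.

```python
# Back-of-the-envelope bandwidth check: can the link carry the raw sensor data,
# or must it be compressed, sub-sampled or processed at the edge? All of the
# figures here are illustrative assumptions.
frame_bytes = 1920 * 1080 * 3   # one uncompressed colour video frame
frames_per_second = 25
sensors = 4

required_mbps = frame_bytes * frames_per_second * sensors * 8 / 1e6
link_mbps = 50                  # assumed available uplink capacity

print(f"required: {required_mbps:.0f} Mbit/s, available: {link_mbps} Mbit/s")
if required_mbps > link_mbps:
    print("raw data cannot be centralised as-is: compress, sub-sample or process at the edge")
```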

4. Integration Element

We must consider AI and Autonomy within the context of the wider capability it supports. This means working out how an Autonomous System will interact with other systems (a system of systems!), and potential implications of introducing an Autonomous System – both positive and negative.

Importantly, an Autonomous System will need to interact with people in some way, whether that be the operators or a pedestrian who could otherwise be run over by an overexcited autonomous car. It is therefore critical to understand how people and machines will interact to do the intended job and to do so safely and efficiently.

This means not simply assuming an Autonomous System will replace human operators without impacting the wider system. Equally, where people and machines work together it is important that the person still has a valued role and also does not have to spend significant amounts of time correcting what the machine has just done.

Here are some questions you’ll need to ask:

4.1 Systems integration

  • Are we clear how the AI and Autonomy integrate into the wider Autonomous System?
  • And how does that system integrate with other systems (it’s rarely completely isolated)?

4.2 Ergonomics and interface design

  • How will operators interface with the completed system? How well are the needs of people being considered?
  • Is it intuitive, easy and clear to the operator how to achieve the desired outcome?
  • Is the interface similar to other systems?
  • Can we use common components to do this?

4.3 Human-autonomy teaming

  • How does the Autonomous System work with the user or operator, as part of the team, rather than being just a tool that they need to operate?

  • How does the user maintain meaningful control, where necessary?

  • Are there any potentially negative implications of introducing autonomy, e.g. could autonomy applied to one role adversely affect a user’s ability to perform a different role?

  • Will the user get bored? This is important but often overlooked – if the machine does most of the work and leaves only tedious bits to the user, overall system performance could decline as the user becomes bored or simply loses concentration.

4.4 Interoperability

  • How does our system interoperate with others?
  • Does the system need to interact outside of our own enterprise boundary, i.e. with anyone or anything other than trained operators using our own systems?
  • Will it encounter the public, and will it need to interface with other machines outside of our control?