Advai: Operational Boundaries Calibration for AI Systems via Adversarial Robustness Techniques

Case study from Advai.

Background & Description

To enable AI systems to be deployed safely and effectively in enterprise environments, there must be a solid understanding of their fault tolerances, which adversarial stress-testing methods are designed to establish.

Our stress-testing tools identify vulnerabilities across two broad categories of AI failure:

  1. Natural, human-meaningful vulnerabilities encompass failure modes that a human could hypothesise, e.g. a computer vision system struggling with a skewed, foggy, or rotated image (a minimal sketch of this kind of check follows this list).

  2. Adversarial vulnerabilities pinpoint where minor yet unexpected parameter variations can induce failure. These vulnerabilities not only reveal potential attack vectors but also signal broader system fragility. It’s worth noting that the methods for detecting adversarial vulnerabilities can often reveal natural failure modes, too.
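
As an illustration of the first category, the sketch below applies a few human-meaningful perturbations (rotation, a crude fog-like blur, reduced contrast) to an image and checks whether a classifier’s prediction changes. It assumes a PyTorch image classifier; the model and image here are illustrative stand-ins, not Advai’s tooling.

```python
# A minimal sketch of a natural-perturbation check, assuming a PyTorch image
# classifier. `model` and `image` are illustrative stand-ins, not Advai's tooling.
import torch
import torchvision.transforms.functional as TF


def natural_perturbations(image: torch.Tensor):
    """Yield human-meaningful variants of an image tensor (C, H, W) in [0, 1]."""
    yield "rotated_15deg", TF.rotate(image, angle=15.0)
    yield "fog_like_blur", TF.gaussian_blur(image, kernel_size=9, sigma=3.0)
    yield "low_contrast", TF.adjust_contrast(image, contrast_factor=0.4)


def check_natural_robustness(model: torch.nn.Module, image: torch.Tensor) -> dict:
    """Return, per perturbation, whether the top-1 prediction stays unchanged."""
    model.eval()
    with torch.no_grad():
        baseline = model(image.unsqueeze(0)).argmax(dim=1).item()
        return {
            name: model(variant.unsqueeze(0)).argmax(dim=1).item() == baseline
            for name, variant in natural_perturbations(image)
        }
```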

The process begins with “jailbreaking” AI models, a metaphor for stress-testing them to uncover hidden flaws. This involves presenting the system with a range of adversarial inputs to identify the points at which the AI fails or responds in unintended ways. These adversarial inputs are crafted using state-of-the-art techniques that simulate potential real-world attacks or unexpected inputs that the system may encounter.
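
As a concrete, hedged example of how such adversarial inputs can be crafted, the sketch below uses the fast gradient sign method (FGSM), one widely known technique. The case study does not specify which methods Advai applies, so this is purely illustrative and assumes a PyTorch classifier.

```python
# A minimal FGSM-style sketch of crafting an adversarial input for a PyTorch
# classifier. FGSM is one widely used technique chosen here for illustration;
# the case study does not specify which methods Advai actually applies.
import torch
import torch.nn.functional as F


def fgsm_attack(model: torch.nn.Module,
                image: torch.Tensor,    # shape (C, H, W), values in [0, 1]
                label: torch.Tensor,    # true class index, shape (1,)
                epsilon: float = 0.03) -> torch.Tensor:
    """Return a copy of `image` nudged in the direction that maximises the loss."""
    model.eval()
    x = image.clone().detach().unsqueeze(0)
    x.requires_grad_(True)
    loss = F.cross_entropy(model(x), label)
    loss.backward()
    # Step each pixel by +/- epsilon along the sign of its loss gradient.
    adversarial = x + epsilon * x.grad.sign()
    return adversarial.clamp(0.0, 1.0).squeeze(0).detach()
```

Inputs produced this way are typically near-identical to the original to a human observer yet can flip the model’s prediction, which is what makes them useful for probing hidden failure points.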

Advai’s adversarial robustness framework then defines a model’s operational limits – points beyond which a system is likely to fail. This use case captures our approach to calibrating the operational use of AI systems according to their points of failure.
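
A minimal sketch of what calibrating operational limits from points of failure could look like is given below: sweep a perturbation severity, measure accuracy at each level, and record the last severity at which accuracy stays above an agreed tolerance. The severity scale, tolerance and evaluation function are assumptions for illustration, not Advai’s framework.

```python
# A minimal sketch of turning failure points into an operational boundary:
# sweep a perturbation severity, measure accuracy at each level, and stop at
# the first level where accuracy falls below an agreed tolerance. The severity
# scale, tolerance and `evaluate` callable are assumptions for illustration.
from typing import Callable, Dict, Iterable, Tuple


def calibrate_boundary(evaluate: Callable[[float], float],
                       severities: Iterable[float],
                       min_accuracy: float = 0.90) -> Tuple[float, Dict[float, float]]:
    """Return the largest severity the model tolerates, plus the full accuracy profile."""
    profile: Dict[float, float] = {}
    boundary = 0.0
    for severity in sorted(severities):
        accuracy = evaluate(severity)  # e.g. accuracy on a validation set perturbed at this severity
        profile[severity] = accuracy
        if accuracy < min_accuracy:
            break
        boundary = severity
    return boundary, profile
```

For example, if the evaluation function rotated every validation image by up to the given number of degrees, the returned boundary would indicate the largest rotation the model can be trusted with, and deployment guidance could restrict inputs accordingly.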

How this technique applies to the AI White Paper Regulatory Principles

More information on the AI White Paper Regulatory Principles.

Safety, Security & Robustness

Proactive adversarial testing pushes AI systems to their limits, ensuring that safety margins are understood. This contributes to an organisation’s ability to calibrate its use of AI systems within safe and secure parameters.

Appropriate Transparency & Explainability

Pinpointing the precise causes of failure is an exercise in explainability. The adversarial approach teases out errors in AI decision-making, promoting transparency and helping stakeholders understand how AI conclusions are reached.

Fairness

The framework is designed to align model use with organisational objectives. After all, ‘AI failure’ is by nature a deviation from an organisational objective. These objectives naturally include fairness-related criteria, such as preventing biased models and promoting equitable outcomes.

Accountability & Governance

Attacks are designed to discover key points of failure, and this information arms the managers responsible for overseeing those models with the ability to make better deployment decisions. Assigning an individual manager responsibility for defining suitable operational parameters therefore improves governance. The adversarial findings and automated documentation of system use also create an auditable trail.

Why we took this approach

Adversarial robustness testing is the gold standard for stress-testing AI systems in a controlled and empirical manner. It not only exposes potential weaknesses but also confirms the precise conditions under which the AI system can be expected to perform unreliably, guiding the formulation of well-defined operational boundaries.

Benefits to the organisation using the technique

  • Enhanced predictability and reliability of AI systems that are used within their operational scope, leading to increased trust from users and stakeholders.

  • A more objective risk profile that can be communicated across the organisation, helping technical and non-technical stakeholders align on organisational need and model deployment decisions.

  • Empowerment of the organisation to enforce an AI posture that meets industry regulations and ethical standards through informed boundary-setting.

Limitations of the approach

  • While adversarial testing is thorough, it is not exhaustive and might not account for every conceivable scenario, especially under rapidly evolving conditions.

  • The process requires expert knowledge and continuous re-evaluation to keep pace with technological advancements and emerging threat landscapes.

  • Internal expertise is needed to match the failure induced by adversarial methods with the organisation’s appetite for risk in a given use-case.

  • There is a trade-off between the restrictiveness of operational boundaries and the AI’s ability to learn and adapt; overly strict boundaries may inhibit the system’s growth and responsiveness to new data.

Further AI Assurance Information

Published 12 December 2023