COM guidance statement (G11): Guidance on the use of (Q)SAR models to predict genotoxicity

Question 1

Preface

Accepted Answer

The Committee on Mutagenicity of Chemicals in Food, Consumer Products and the Environment (COM) is an expert advisory committee whose terms of reference include providing advice on the principles of genotoxicity testing and assessment. The COM has published the overarching guidance ‘A strategy for testing of chemicals for genotoxicity’. Within this guidance, quantitative structure-activity relationship (QSAR) evaluation is recommended as a preliminary step (Stage 0) to determine the intrinsic chemical and toxicological properties of a test chemical, prior to devising the genotoxicity testing approach (Stages 1 and 2). In the absence of specific guidance from COM on the use of QSAR models to predict genotoxicity, the committee agreed to develop such advice. It was originally considered that this advice would be included within the COM guidance document ‘Genotoxicity assessment of impurities in chemical substances’, as this was where the (Q)SAR approach was most likely to be applied. However, it has instead been published as stand-alone guidance.

In this document, structure-activity relationship (SAR) and QSAR models are collectively referred to as (Q)SAR models or in silico approaches. (Q)SAR evaluation refers to the application of QSAR models and/or knowledge-based SAR models appropriate for genotoxicity evaluation.

This guidance outlines COM recommendations for an evaluation of genotoxicity using (Q)SAR models, including the selection of the (Q)SAR model by assessment and justification of the model and reporting of (Q)SAR predictions.

The information in this document is intended to be a resource for regulatory authorities and assessors, (Q)SAR model developers, and (Q)SAR users.

Question 2

Background

Accepted Answer

SAR and QSAR models are theoretical models that can be used to qualitatively or quantitatively predict a broad range of chemical characteristics, including the physicochemical, biological (for example, an (eco)toxicological endpoint) and environmental properties of compounds, from the knowledge of their chemical structure (see European Chemicals Agency 2008, European Chemicals Agency 2016, OECD 2023).

A SAR is a qualitative relationship that relates a chemical (sub)structure to the presence or absence of a property or activity of interest (European Chemicals Agency 2016).

A QSAR is a mathematical, statistical-based model relating one or more quantitative parameters, derived from the chemical structure, to a quantitative measure of a property or activity (13).

In this guidance document, the chemical for which an endpoint is being predicted by a (Q)SAR model is referred to as the test chemical. In other sources, this test chemical may be called a ‘target chemical’ (13), ‘query compound’ or ‘input structure’. Test chemical selection is the first step in conducting (Q)SAR for evaluating genotoxicity.

A range of regulatory purposes and industry objectives (for example, chemical registration, pharmaceutical impurity analysis, chemical screening and characterization) acknowledge the utility of (Q)SAR models and support the use of (Q)SAR predictions for evaluation of genotoxicity. For example, (Q)SAR is an accepted approach for evaluating mutagenicity under ICH M7 (20) and is recommended for use in REACH chemical regulation (13, 10). However, guidance on specific criteria or considerations for conducting (Q)SAR for the evaluation of genotoxicity is limited.

The objective of this guidance is to outline COM recommended approaches and considerations for selecting (Q)SAR models and reporting (Q)SAR predictions for genotoxicity that meet regulatory standards, when necessary. The guidance also discusses how to further evaluate the (Q)SAR results with the use of expert knowledge.

When considering the use of (Q)SAR results for the evaluation of genotoxicity in the absence of experimental data for a test chemical, the COM recommends that several conditions should be met depending on the objective of the investigation - for example, regulatory submission, pharmaceutical development, chemical hazard screening, impurities analysis (13, 20, 29).

The general conditions are as outlined by the ‘OECD Principles for the Validation, for Regulatory Purposes, of (Quantitative) Structure-Activity Relationship Models’ (27) and include:

i) use of a (Q)SAR model where the scientific validity has been established
ii) the substance should fall within the applicability domain of the (Q)SAR model
iii) the prediction should be fit for the regulatory purpose
iv) the information should be well documented

Although these conditions are intended for regulatory applications, they can generally be applied for most objectives using (Q)SAR for hazard identification. Indeed, ECHA (13) recommends that these principles are adhered to in the case of REACH Annex XI for evaluating any toxicological endpoint, including genotoxicity.

Although the focus of the COM (Q)SAR guidance is to support the evaluation of genotoxicity, the recommendations presented here are underpinned by general guidance for (Q)SARs that contribute to developing a systematic and harmonised framework, including:

OECD Principles for the Validation, for Regulatory Purposes, of (Quantitative) Structure-Activity Relationship Models (27)
(Q)SAR Assessment Framework (29)
Guidance on information requirements and chemical safety assessment. Chapter R.6: QSARs and grouping of chemicals (12)
Practical guide: How to use and report (Q)SARs (13)
Memorandum on the use of in silico methods for assessment of chemical hazard (30)
Read across assessment framework (14)
How to apply ECHA’s practical guide ‘How to use and report (Q)SARs for the assessment of substances under BPR’ (15)

In addition, the general guidance documents listed above are considered alongside specific guidance on the use of (Q)SAR for genotoxicity predictions by ICH M7 (20, 21):

“ICH M7(R2) Guideline on assessment and control of DNA reactive (mutagenic) impurities in pharmaceuticals to limit potential carcinogenic risk EMA/CHMP/ICH/83812/2013. Step 5”

and:

“ICH M7(R2) Questions and Answers on assessment and control of DNA reactive (mutagenic) impurities in pharmaceuticals to limit potential carcinogenic risk EMA/CHMP/ICH/321999/2020. Step 5”

as well as peer-reviewed publications that provide information on the state of the science of (Q)SARs for genotoxicity assessment.

Question 3

Integrating (Q)SAR in an evaluation of genotoxicity

Accepted Answer

Integrating (Q)SAR results into a toxicological evaluation for genotoxicity can benefit from developing a framework for selecting (Q)SAR models and using the (Q)SAR results. In general, (Q)SAR results are integrated with experimental data in a weight of evidence (WoE) approach (13, 17, 26). The combination of experimental and (Q)SAR data for the assessment of genotoxicity can improve the overall robustness of an evaluation and allow for a more complete assessment of reliability with expert knowledge review (17, 26).

A suggested framework for using (Q)SAR to evaluate genotoxicity is presented in Figure 1. The framework includes 3 primary stages:

Problem formulation
(Q)SAR assessment
Regulatory assessment

The first stage consists of key steps of identifying regulatory guidance and selecting the genotoxicity endpoint. The second stage consists of identifying the acceptance of (Q)SAR for the specific guidance in place of experimental data, preparing for the (Q)SAR, and conducting the (Q)SAR. The last stage consists of reporting the (Q)SAR.

The first step in the framework is to identify the appropriate regulatory objective and guidance for respective regulations, which may define whether (Q)SAR data is accepted and for which endpoints, as well as any chemical classes considered unacceptable for the application. This step will underpin the model selection process and the reporting of (Q)SAR predictions. Thus it is important that this step occurs early in the process. This step also dictates the selection of genotoxicity endpoints for analysis.

The subsequent steps of model selection and prediction reporting are described below. Considerations for the use of expert knowledge to evaluate out-of-domain predictions are introduced.

Importantly, this framework allows for adaptation to various regulatory schemes, many of which have been suggested and outlined in the published literature. For example, Kovarich and Cappelli (26) present a general workflow for the in silico safety assessment of pharmaceutical impurities in which (Q)SAR data for bacterial mutagenicity (statistical and expert rule-based as defined in paragraphs 16 to 18) are used to identify Class 3, 4, or 5 mutagenic impurities, according to ICH M7 (20).

Question 4

Selection of a valid (Q)SAR model for genotoxicity

Accepted Answer

Selecting the appropriate (Q)SAR model(s) for genotoxicity requires an assessment and justification of the (Q)SAR model. For certain regulatory purposes, such as noted above for REACH regulatory purposes (REACH Annex XI, 1.3), the validity of the (Q)SAR model is the first condition to be fulfilled to allow the use of a (Q)SAR result (13).

A basic understanding of how models are built, curated, and updated can support the selection of the appropriate (Q)SAR model. The models are generally divided into expert systems and statistical QSAR systems (3, 33).

As introduced above, expert systems are based on the association of structural alerts and toxicological activity defined by rules (SAR). As described by Wichard 2017 and Barber and others 2017, an expert system generally simulates the judgement of a “human who has expert knowledge and experience in a particular field.”

An expert system generally consists of 2 major components, a knowledge base and an inference engine, although not all systems contain the latter (3). The knowledge base contains the accumulated experience necessary for understanding, formulating, and solving the problems, and the inference engine includes the basic rules for the decision-making process. Specifically, in more advanced knowledge-based systems, rules or facts are stored in a knowledge base and retrieved by an inference engine, which considers the strength and direction of each assertion through a process known as reasoning (3). Expert systems may also provide the benefit of allowing related and/or tacit knowledge to be captured, often in the form of rules, drawing additional information from the test set. The expert system provides reasoning and justification for the decision-making process in an output that can be understood, challenged, and judged by the user (3).

In contrast, statistical QSAR systems use a statistical correlation between structural descriptors and toxicological activity (28). Statistical QSAR systems are driven by the experimental data that are provided to the system (3, 33). Thus, ultimately the statistical-based model derives conclusions based on the training data, with the expert leading the system through the choice of the modelling technique and descriptor sets. Regardless of the different outputs, there is still a need for expert scrutiny of the output to ensure that correlations that may be true for the training set are causally linked rather than coincidences for the target chemicals (3). As introduced above (paragraph 2), expert and statistical systems are collectively referred to as (Q)SAR.

Several examples of (Q)SAR models for genotoxicity have been reported by authoritative bodies (13) and in the published literature (16, 18, 19). Many of these programs are widely used and are either commercially or freely available. The user should understand the tool, model, methodology type (SAR and/or QSAR) and genotoxicity endpoint examined to be able to assess the capabilities and limitations of the model for prediction of the genotoxic endpoint (19). Mention of any model in this guidance is not an endorsement of its use by the COM; the COM recommends that each model should be evaluated based on the principles described further below.

Prediction models for in vitro bacterial reverse gene mutation (also known as the Ames test) are regarded as well-developed for the purpose and have been validated or evaluated against external data sets for various classes of chemicals, including:

industrial (19)
pharmaceuticals (7)
impurities (26)
pesticides and pesticide metabolites (6, 8, 16)
mycotoxins (26, 32)

Specifically, most tools demonstrate a predictivity comparable with the intrinsic experimental variability of the Ames data, indicating a satisfactory performance of the models predicting gene mutation in bacteria (8).

To increase sensitivity, predictions of gene mutation in bacteria should be made using 2 (Q)SAR models that complement each other. For example, under ICH M7 an expert-based model is used in conjunction with a statistical-based model for the evaluation of potential mutagenic impurities in pharmaceuticals (20), whereas 2 statistical models can be used for other purposes (for example, EFSA). Some models, such as Caesar by VEGA, are considered hybrid or integrated tools, which combine statistical and rule-based models to reduce the number of false negative predictions. Consensus models also combine the output of 2 or more models (for example, CASE Ultra), but these models are not necessarily of different types.
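As an illustration of how 2 complementary predictions might be reconciled, the following minimal Python sketch applies a conservative rule: a positive call from either model is carried forward, a negative conclusion needs both models to agree, and anything else is referred for expert review. The `Call` categories, the `combine_calls` function and the rule set itself are hypothetical and are not taken from ICH M7, EFSA or any (Q)SAR tool.

```python
from enum import Enum


class Call(Enum):
    """Simplified outcome of a single (Q)SAR model (hypothetical categories)."""
    POSITIVE = "positive"
    NEGATIVE = "negative"
    OUT_OF_DOMAIN = "out of domain"


def combine_calls(expert_call: Call, statistical_call: Call) -> str:
    """Combine an expert rule-based call and a statistical call conservatively.

    Illustrative assumption only: real decision matrices are defined by the
    applicable regulatory guidance and must be documented by the user.
    """
    calls = {expert_call, statistical_call}
    if Call.POSITIVE in calls:
        # A positive from either model is carried forward to favour sensitivity.
        return "positive: follow up with expert review or testing"
    if calls == {Call.NEGATIVE}:
        # Both models agree on a negative, in-domain prediction.
        return "negative in both models: confirm with expert review"
    # At least one model gave no usable prediction and none was positive.
    return "inconclusive: expert review required"


if __name__ == "__main__":
    print(combine_calls(Call.NEGATIVE, Call.OUT_OF_DOMAIN))
```

Treating any single positive as positive favours sensitivity over specificity, which is why every branch of the sketch still ends in expert review.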

OECD (29) has recommended principles for validating (Q)SAR models. This model checklist, the (Q)SAR model reporting format (QMRF), is recommended by OECD (29), ECHA (12, 13), and ICH (21) and includes:

i) a defined endpoint
ii) an unambiguous algorithm
iii) a defined domain of applicability
iv) appropriate measures of goodness-of-fit, robustness and predictivity
v) a mechanistic interpretation, if possible

The selection of a quality (Q)SAR model should include validation under these OECD principles. Recommendations to assist with validating the models with respect to a defined endpoint are described in detail below.
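For principle iv, predictivity of a classification model (mutagenic or non-mutagenic) is commonly summarised from an external validation set as sensitivity, specificity and balanced accuracy. The following Python sketch shows that arithmetic. The function name `performance` and the counts in the usage example are made up for illustration and do not describe any real model.

```python
def performance(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Summarise external-validation performance of a binary (Q)SAR model.

    tp/fn: experimentally positive chemicals predicted positive/negative.
    tn/fp: experimentally negative chemicals predicted negative/positive.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("validation set needs both positive and negative chemicals")
    sensitivity = tp / (tp + fn)  # fraction of true mutagens detected
    specificity = tn / (tn + fp)  # fraction of true non-mutagens cleared
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


if __name__ == "__main__":
    # Hypothetical counts for an external validation set of 200 chemicals.
    print(performance(tp=80, fp=10, tn=90, fn=20))
```

Balanced accuracy is shown alongside the two rates because genotoxicity data sets are often unbalanced, and a raw accuracy figure can hide poor detection of the minority class.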

Question 5

Defined endpoint

Accepted Answer

OECD 2023 (29) reports that the first principle for assessment of the (Q)SAR model is that the model has a defined endpoint. The COM advises that the evaluation of this principle requires an understanding of the validated experimental studies accepted for evaluation of genotoxicity endpoints by regulators. The (Q)SAR user is encouraged to understand the types of experimental studies conducted to create the test data used to develop the model training sets. As summarised by Hasselgren and others (17), gene mutation in vitro is most widely examined with the bacterial reverse mutation assay, whereas chromosomal damage is examined in vitro with the in vitro micronucleus (MNvit) assay or the in vitro chromosomal aberration (CA) assay. In vivo MN and CA assays are also utilized for this endpoint. Regarding aneugenicity, it is determined based on in vitro or in vivo MN data evaluated in such a way that the mechanism of MN formation could be determined. However, it is often not possible to differentiate an aneugen from a clastogen when evaluating most published historical data.

The number of compounds comprising toxicity databases of (Q)SAR models often varies by genotoxicity endpoint. It is recommended that users consider such statistics to support the evaluation of an endpoint by a model.

ICH M7 focuses only on predictions of the bacterial reverse gene mutation endpoint. However, extending the (Q)SAR approach to other genotoxicity endpoints can be recommended when other regulatory guidance allows it, provided that the models fulfil the assessment elements of a Model Checklist and information is provided according to OECD guidance (29) (for example, OECD Principles for (Q)SAR validation).

When considering non-bacterial mutation endpoints to inform on genotoxicity, the COM recommends (Q)SAR users monitor model updates and review published literature to identify potential model limitations (for example, false positive rates) and efforts to develop improved or newer models for consideration of non-bacterial mutation endpoints. If newer models are identified, users should ensure they are compliant with OECD Principles.

When a user implements a (Q)SAR framework that acknowledges OECD recommended assessment elements for models and predictions, potential differences and limitations in model performance for different genotoxicity endpoints will be accounted for, resulting in a balanced assessment to prevent misleading positive (or negative) results from outweighing more reliable data.

Question 6

Reporting the (Q)SAR predictions

Accepted Answer

Even with a valid model, unacceptable predictions of genotoxicity can be obtained under certain conditions (29, 4). For example, the genotoxicity prediction may not be valid if the target chemical falls outside the applicability domain of the (Q)SAR model, the results are inadequate for the regulatory purpose or the risk assessment, or reliable and adequate documentation of the applied method is not provided (13).

The OECD guidance on prediction reporting recommends the (Q)SAR Prediction Reporting Format (QPRF). The OECD prediction checklist includes the following requirements:

i) the model inputs should be correct
ii) the substance should be within the applicability domain of the model
iii) the predictions should be reliable
iv) the outcome should be fit for the regulatory purpose (12, 29)

Reporting the results should include these requirements. Recommendations from the COM to assist with reporting according to these requirements are described in detail below.
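One way to keep the 4 checklist requirements visible when recording a prediction is a small structured record. The Python sketch below is a hypothetical illustration only: the class `PredictionChecklist` and its field names are assumptions and do not reproduce the OECD QPRF template.

```python
from dataclasses import dataclass


@dataclass
class PredictionChecklist:
    """Outcome of the 4 prediction checklist requirements (hypothetical names)."""
    inputs_correct: bool
    within_applicability_domain: bool
    prediction_reliable: bool
    fit_for_regulatory_purpose: bool

    def unmet(self) -> list[str]:
        """Return the requirements that were not satisfied, in checklist order."""
        return [name for name, ok in vars(self).items() if not ok]


if __name__ == "__main__":
    record = PredictionChecklist(
        inputs_correct=True,
        within_applicability_domain=False,
        prediction_reliable=True,
        fit_for_regulatory_purpose=True,
    )
    # Any unmet requirement is a prompt for expert review and documentation.
    print(record.unmet())
```

Listing the unmet requirements, instead of returning a single pass or fail flag, keeps the reason for any follow-up explicit in the report.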

Question 7

Domain of applicability

Accepted Answer

OECD (29) principles require that the model is applicable to the substance under analysis. The (Q)SAR user should verify that the target chemical falls within the applicability domain (AD) of the model. The concept of AD was introduced above to assess the probability of a chemical being covered by the (Q)SAR training set. According to ECHA (13), predictions outside the AD are normally not reliable and their use is difficult to justify; however, several approaches can be considered to support an overall prediction of mutagenicity as described further below.

ECHA (13) presents an approach to evaluate if a substance falls into the AD by checking the following elements:

i) descriptor domain
ii) structural fragment domain
iii) mechanistic and metabolic domains, if possible
iv) analogues in the training set
v) accuracy of model predictions for analogues
vi) considerations for specific substances (for example, multi-constituents, additives, impurities, metabolites, degradation products, ionisable substances, large molecular weight substances, potentially hydrolysable substances, surfactants and isomers)

A detailed description of these elements is beyond the scope of this guidance; however, further discussion is presented by ECHA 2016 (13).
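As a purely illustrative sketch of the first element (descriptor domain), the Python code below checks whether each descriptor of a target chemical lies within the minimum and maximum values seen in a training set. This range check is only one simple way of expressing a descriptor domain; individual (Q)SAR tools define their AD in their own ways. The function names, descriptor names and values are all hypothetical.

```python
def descriptor_ranges(training: list[dict[str, float]]) -> dict[str, tuple[float, float]]:
    """Return the (min, max) of each descriptor across the training set."""
    names = training[0].keys()
    return {
        name: (min(row[name] for row in training), max(row[name] for row in training))
        for name in names
    }


def outside_domain(query: dict[str, float],
                   ranges: dict[str, tuple[float, float]]) -> list[str]:
    """List the descriptors for which the query falls outside the training range."""
    return [name for name, (low, high) in ranges.items()
            if not low <= query[name] <= high]


if __name__ == "__main__":
    # Made-up descriptor values for 3 training chemicals and 1 target chemical.
    training_set = [
        {"logp": 1.2, "mol_weight": 150.0},
        {"logp": 3.4, "mol_weight": 310.0},
        {"logp": 2.1, "mol_weight": 220.0},
    ]
    target = {"logp": 4.0, "mol_weight": 200.0}
    print(outside_domain(target, descriptor_ranges(training_set)))
```

An empty list from `outside_domain` would cover only the descriptor element; the structural, mechanistic and analogue-based elements listed above still need separate consideration.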

The COM recommends a thorough expert knowledge review of the AD in cases in which (Q)SAR software is unable to generate a positive or negative prediction because the target chemical or impurity is outside the applicability domain of the model. As summarised by Amberg and others (1), out-of-domain and/or indeterminate results are often encountered as part of an ICH M7 impurity assessment, and it can be challenging for the pharmaceutical industry and regulatory agencies to support a prediction of mutagenicity in such cases. Regulatory agencies have not provided specific guidance for dealing with out-of-domain predictions when assessing mutagenicity, and the conservative approach of assuming that indeterminate or out-of-domain (Q)SAR results are positive could result in unnecessary additional drug development costs and delays. However, applying expert review has been demonstrated to overturn out-of-domain calls in 75% of instances (Jayasekara and others 2021).

According to ICH M7 (21), an out of domain or non-coverage result from one of the 2 (Q)SAR models requires additional assessment to classify the compound as a Class 5 impurity (No structural alerts, or alerting structure with sufficient data to demonstrate lack of mutagenicity or carcinogenicity). The guidance acknowledges that, given that the relationship between chemical structure and DNA reactivity is well understood, it is unlikely that a structure with mutagenic potential would be associated with an out of domain result. However, expert knowledge review can provide reassurance in assignment of such impurities to Class 5. ICH M7 (20, 21) indicates that expert knowledge review may include the approaches described in Amberg and others (1).

Amberg and others (1) summarise a variety of approaches for handling out-of-domain results that are used across pharmaceutical companies, as well as regulatory agencies, to support an overall prediction. These approaches include applying expert knowledge, applying an additional model, test and/or control of the impurity, and no follow-up.

Regarding expert knowledge review, this may include examining the training set and/or database analogues with detailed experimental data to understand how structural features or physicochemical properties influence the model’s prediction. For analogues, a structurally and/or toxicologically meaningful non-mutagenic analogue or group of analogues from chemical databases related to the target chemical could be used to support an overall prediction for a target chemical. This approach is also considered as read-across.

For rule-based models, expert review would include inspection of the structural features responsible for activation or deactivation of the alert along with an examination of plausible mechanisms, examples, and associated references for any activated alerts (1). Expert review of non-reactive groups could further support an overall prediction for out-of-domain (Q)SAR results (1). This includes a visual assessment of the compound to assure the lack of valid DNA-reactive alerts with plausible mechanisms, while considering knowledge of metabolic activation. Further, there may be additional expert considerations for (Q)SAR systems that include the battery of models for traditional Ames (GC base pairs) and an additional model for the AT base pair reversion site with reliance on the primary GC model when the AT model is out-of-domain.

In addition to expert knowledge review, there may be cases in which running an additional model will provide results within the AD, though this is not required if the user is able to still make a conclusive call on activity.

Question 8

Reliability of prediction

Accepted Answer

OECD (29) lists several aspects to consider when assessing the reliability of a prediction, with the acknowledgement that there may be some overlap with the AD. These include the elements of reproducibility, overall performance of the model, fit within the physicochemical, structural and response spaces of the training set of the model, performance of the model for similar substances, mechanistic and/or metabolic considerations, and consistency of information.

Assessments of reliability of genotoxicity (Q)SAR models per OECD (29) assessment elements are limited. However, discussions in the published literature have focused on the reliability of mutagenicity models compared to models for other genotoxicity endpoints. For example, Benigni and others 2020 (8) reported that (Q)SAR models do not seem to be able to provide reliable genotoxicity predictions for assays/endpoints different from Ames and need to be further developed; this is primarily due to a substantial difference between the performance in the prediction of the Ames test, and that of the other experimental assays.

Specific to SAR, Cronin and others (11) proposed 12 criteria to assess the uncertainty associated with structural alerts, allowing for an assessment of confidence. The criteria are based around the stated purpose, description of the chemistry, toxicology and mechanism, performance and coverage, as well as corroborating and supporting evidence of the alert. Several of these characteristics have overlapping principles with the OECD (29) assessment elements.

The (Q)SAR user is encouraged to understand the elements informing on reliability and COM recommends that application of peer-reviewed published schemes can provide additional support for the reliability of a prediction under the OECD QPRF with the proper justification (29).

Question 9

Consideration of regulatory purpose

Accepted Answer

For a (Q)SAR prediction to be adequate, it should be reliable, as discussed above, and relevant for the regulatory decision being addressed (12, 13, 29). The OECD (29) lists the following assessment elements to verify that an outcome is fit for the regulatory purpose:

i) compliance with additional regulatory requirements
ii) correspondence between predicted property and property required by regulation
iii) decidability within the specific framework

For example, as noted throughout this guidance, ICH M7 requires the use of (Q)SAR with a combination of 2 different models, one rule-based and one statistical-based, to produce a reliable result (20). Thus, the use of an individual prediction would not be fit for purpose (29). This element requires the user to understand requirements for other regulatory purposes (for example, foods, cosmetics, chemicals) as well as potential regional differences (30, 31).

In addition, the property predicted by the (Q)SAR model should match the property required by regulation. For example, specific bacterial strains and presence of metabolic activation may need to be explicitly considered by the model if required by the regulation when evaluating in vitro mutagenicity in a bacterial reverse mutation assay. Further, if the regulation refers to a specific test guideline, the model should include the experimental results obtained following the specified test guideline in its training set. The OECD notes an exception, however, for Ames mutagenicity models whose training sets include historical data not generated using all currently required strains: positive predictions from such models may still be adequate. Likewise, COM recommends that negative predictions may still be accepted where they rely on published negative results from tests conducted using fewer than 5 strains of S. typhimurium, following expert review (for example, based on knowledge of mutagenic mechanism and expected inactivity in other bacterial strains not tested) (9). The use of prediction data from fewer than 5 strains is supported by findings showing that 93% of mutagens are identified using only S. typhimurium strains TA98 and TA100 (19).

The adequacy of the prediction for the purposes of classification and labelling (CLP regulations) and/or risk assessment is very much endpoint-dependent (ECHA 2016). Additional information may be needed to assess the adequacy of the prediction in the context of a regulatory decision. Guidance for reporting (Q)SAR results for REACH and for biocides is provided by ECHA (ECHA 2016, ECHA 2019).

Overall, this element of reporting (Q)SAR predictions is connected to the selection of a valid model and developing an initial framework (Figure 1). The COM recommends that the (Q)SAR user should consider the regulatory purpose early in the use of a (Q)SAR program for predicting genotoxicity.

Question 10

Additional considerations

Accepted Answer

Expert knowledge

The COM recommends that expert knowledge is an essential tool in the application of (Q)SAR for genotoxicity. Expert knowledge can be applied in several ways, such as to evaluate predictions outside the AD (as discussed above), and to reconcile predictions from multiple models. Steps may include a database search and/or a detailed review of structural fragments determining the prediction (2, 33). Expert review also requires the user to understand the model output and the basis for a prediction. For example, the training set examples may be structurally similar to the query but contain alerting structures not present in the query. In these situations, an additional (Q)SAR model may also return a positive prediction for the same incorrect reason, and that prediction would be irrelevant.

It has been reported that combining predictions with expert knowledge review can increase sensitivity and specificity of the prediction obtained (5, 7, 8, 31, 33). The authors acknowledged that the impact of expert review reveals a large area of context-dependent expert knowledge, which has not been routinely formalised in the (Q)SAR models (even in expert systems) and has the potential to substantially improve the prediction systems. Further, although not common, (Q)SAR models can include formalised rationale for predictions (9). Indeed, expert review has been reported as a valuable exercise that can overturn greater than 90% of equivocal and 75% of out-of-domain predictions (Jayasekara and others 2021).

Regarding ICH M7, specifically, it is recommended that the outcome of any computer system-based analysis be reviewed with the use of expert knowledge to provide additional supportive evidence on relevance of any positive, negative, conflicting or inconclusive prediction and provide a rationale to support the final conclusion (20).

As noted above, expert knowledge review might include one or a combination of the following:

i) comparison to structurally similar analogs for which bacterial reverse mutation assay data is available (read-across approach)
ii) expert review of the chemical structure to determine if there is potential for the chemical to react with DNA
iii) (Q)SAR output from an additional validated model of the same methodology (that is, expert rule-based or statistical) that generates a prediction that is within its applicability domain (2, 24, 21)

Expert knowledge review can also be effective in evaluating the impact of structural changes that result from metabolic or degradation processes on the genotoxic potential of substances (8). Benigni and others (8) identified and catalogued parent/metabolite structural differences (beyond known structural alerts) that may or may not change experimental Ames mutagenicity. Expert knowledge analysis of structural alerts permitted the rationalisation of most of the changes of patterns of genotoxicity. The authors reported that the expert evaluation is suitable for integration into WoE and tiered evaluation schemes.

Some (Q)SAR tools provide functionality for expert review within the software to support predictions, and the outcome of these reviews should be documented transparently. These internal expert review tools have been demonstrated to match the expected conclusions from a human expert (9). If a decision matrix is developed or adapted externally from the software to combine and reconcile the output of 2 in silico systems, such as in the case of ICH M7, transparent documentation of the guidelines applied should be included. Such matrices have been presented in the published literature (31, 33).

Question 11

Conclusions

Accepted Answer

A suggested framework for using (Q)SAR to evaluate genotoxicity includes several key components:

i) identify relevant guidance
ii) select endpoints for evaluation
iii) identify need for (Q)SAR
iv) prepare for (Q)SAR (for example, selection of (Q)SAR models)
v) conduct (Q)SAR
vi) report predictions

The identification of the regulatory purpose (if applicable) and relevant guidance for the genotoxicity hazard identification objective is essential to ensuring that the appropriate approach for (Q)SAR model selection and prediction reporting are fit for purpose.

OECD (29) has recommended principles for validating (Q)SAR models. This model checklist is recommended by the OECD (29), ECHA (12, 13) and ICH (21) and includes:

i) a defined endpoint
ii) an unambiguous algorithm
iii) a defined domain of applicability
iv) appropriate measures of goodness-of-fit, robustness and predictivity
v) a mechanistic interpretation, if possible

The performance of (Q)SAR models for mutagenicity (Ames test) is higher than for other genotoxicity endpoints, such as chromosomal damage. Also, ICH M7 does not accept clastogenicity results (for example, chromosomal aberration) to support the evaluation of the mutagenicity of pharmaceutical impurities (21). Understanding the regulatory requirements is essential for selecting the genotoxicity endpoint for evaluation with (Q)SAR.

When considering other genotoxicity endpoints, the COM recommends that (Q)SAR users and other stakeholders monitor model updates and review published literature to identify potential model limitations (for example, misleading positive rates) and efforts to develop improved or newer models for non-Ames endpoints. If newer models are identified, the user and assessor should ensure that they comply with the OECD principles.

Application domains specific to a mutagenicity model may not sufficiently represent certain substance classes, and the model documentation may include a list of these substance classes. The published literature has also noted such limitations specific to mutagenicity models, and it is recommended that the (Q)SAR user determine whether the target chemical is a member of a substance class for which the model is not recommended.

The OECD prediction checklist includes the following requirements:

i) the model inputs should be correct
ii) the substance should be within the applicability domain of the model
iii) the predictions should be reliable
iv) the outcome should be fit for the regulatory purpose (12, 29)
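
The four requirements above can be recorded per prediction so that any unmet item is reported explicitly. The sketch below is a minimal illustration only. The class name `PredictionChecklist`, its field names and the helper `unmet_items` are hypothetical and are not taken from the OECD or ECHA documents. Deciding whether each item is met remains a matter for the (Q)SAR user and expert review.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class PredictionChecklist:
    """One flag per prediction checklist item (field names are assumptions)."""
    inputs_correct: bool
    within_applicability_domain: bool
    prediction_reliable: bool
    fit_for_regulatory_purpose: bool


def unmet_items(checklist: PredictionChecklist) -> List[str]:
    """Return the names of checklist items recorded as not met."""
    return [f.name for f in fields(checklist)
            if not getattr(checklist, f.name)]


if __name__ == "__main__":
    # Hypothetical prediction that falls outside the applicability domain.
    example = PredictionChecklist(
        inputs_correct=True,
        within_applicability_domain=False,
        prediction_reliable=True,
        fit_for_regulatory_purpose=True,
    )
    print(unmet_items(example))
```

An empty list indicates only that no item was recorded as unmet. It does not replace the documented justification for each item.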

The COM recommends the use of a thorough expert knowledge review for all predictions to support a conclusion of positive or negative, or in cases in which (Q)SAR software is unable to generate a positive or negative prediction because the target chemical or impurity is outside the applicability domain (AD) of the model. In addition to expert review, there may be cases in which running an additional model will provide results within the AD. It is recommended that expert review still be applied when additional models are used, because both models may provide the same incorrect prediction.

Figure 1. Framework for using (Q)SAR to evaluate genotoxicity

Question 12

References

Accepted Answer

1. Amberg A, Andaya RV, Anger LT, Barber C, Beilke L, Bercu J, Bower D, Brigo A, Cammerer Z, Cross KP, Custer L, Dobo K, Gerets H, Gervais V, Glowienke S, Gomez S, Van Gompel J, Harvey J, Hasselgren C, Honma M, Johnson C, Jolly R, Kemper R, Kenyon M, Kruhlak N, Leavitt P, Miller S, Muster W, Naven R, Nicolette J, Parenty A, Powley M, Quigley DP, Reddy MV, Sasaki JC, Stavitskaya L, Teasdale A, Trejo-Martin A, Weiner S, Welch DS, White A, Wichard J, Woolley D and Myatt GJ (2019). ‘Principles and procedures for handling out-of-domain and indeterminate results as part of ICH M7 recommended (Q)SAR analyses’ Regulatory Toxicology and Pharmacology 2019: volume 102, pages 53 to 64

2. Barber C, Amberg A, Custer L, Dobo KL, Glowienke S, Van Gompel J, Gutsell S, Harvey J, Honma M, Kenyon MO, Kruhlak N, Muster W, Stavitskaya L, Teasdale A, Vessey J and Wichard J (2015). ‘Establishing best practise in the application of expert review of mutagenicity under ICH M7’ Regulatory Toxicology and Pharmacology 2015: volume 73, issue 1, pages 367 to 377

3. Barber C, Hanser T, Judson P and Williams R (2017). ‘Distinguishing between expert and statistical systems for application under ICH M7’ Regulatory Toxicology and Pharmacology 2017: volume 84, pages 124 to 130

4. Barber C, Heghes C and Johnston L (2024). ‘A framework to support the application of the OECD guidance documents on (Q)SAR model validation and prediction assessment for regulatory decisions’ Computational Toxicology 2024: volume 30

5. Benigni R (2021). ‘In silico assessment of genotoxicity. Combinations of sensitive structural alerts minimize false negative predictions for all genotoxicity endpoints and can single out chemicals for which experimentation can be avoided’ Regulatory Toxicology and Pharmacology 2021: volume 126, page 105042

6. Benigni R, Battistelli CL, Bossa C, Giuliani A, Fioravanzo E, Bassan A, Gatnik MF, Rathman J, Yang C and Tcheremenskaia O (2019). ‘Evaluation of the applicability of existing (Q)SAR models for predicting the genotoxicity of pesticides and similarity analysis related with genotoxicity of pesticides for facilitating of grouping and read across’ European Food Safety Authority Supporting Publication

7. Benigni R and Bossa C (2019). ‘Data-based review of QSARs for predicting genotoxicity: the state of the art’ Mutagenesis 2019: volume 34, issue 1, pages 17 to 23

8. Benigni R, Serafimova R, Parra Morte JM, Battistelli CL, Bossa C, Giuliani A, Fioravanzo E, Bassan A, Gatnik MF, Rathman J, Yang C, Mostrag-Szlichtyng A, Sacher O and Tcheremenskaia O (2020). ‘Evaluation of the applicability of existing (Q)SAR models for predicting the genotoxicity of pesticides and similarity analysis related with genotoxicity of pesticides for facilitating of grouping and read across: an EFSA funded project’ Regulatory Toxicology and Pharmacology 2020: volume 114, page 104658

9. Cayley AN, Foster RS, Brigo A, Muster W, Musso A, Kenyon MO, Parris P, White AT, Cohen-Ohana M, Nudelman R and Glowienke S (2023). ‘Assessing the utility of common arguments used in expert review of in silico predictions as part of ICH M7 assessments’ Regulatory Toxicology and Pharmacology 2023: volume 144, page 105490

10. Chinen K and Malloy T (2022). ‘Multi-Strategy Assessment of Different Uses of QSAR under REACH Analysis of Alternatives to Advance Information Transparency’ International Journal of Environmental Research and Public Health 2022: volume 19, issue 7

11. Cronin MTD, Bauer FJ, Bonnell M, Campos B, Ebbrell DJ, Firman JW, Gutsell S, Hodges G, Patlewicz G, Sapounidou M, Spinu N, Thomas PC and Worth AP (2022). ‘A scheme to evaluate structural alerts to predict toxicity assessing confidence by characterising uncertainties’ Regulatory Toxicology and Pharmacology 2022: volume 135, page 105249

12. European Chemicals Agency (2008). ‘Guidance on information requirements and chemical safety assessment. Chapter R.6: QSARs and grouping of chemicals’

13. European Chemicals Agency (2016). ‘Practical guide: How to use and report (Q)SARs. 3.1’

14. European Chemicals Agency (2017). ‘Read across assessment framework’

15. European Chemicals Agency (2019). ‘How to apply ECHA’s practical guide “How to use and report (Q)SARs for the assessment of substances under BPR”’

16. Fischer BC, Musengi Y, Konig J, Sachse B, Hessel-Pras S, Schafer B, Kneuer C and Herrmann K (2024). ‘Matrine and Oxymatrine: evaluating the gene mutation potential using in silico tools and the bacterial reverse mutation assay (Ames test)’ Mutagenesis 2024: volume 39, issue 1, pages 32 to 42

17. Hasselgren C, Ahlberg E, Akahori Y, Amberg A, Anger LT, Atienzar F, Auerbach S, Beilke L, Bellion P, Benigni R, Bercu J, Booth ED, Bower D, Brigo A, Cammerer Z, Cronin MTD, Crooks I, Cross KP, Custer L, Dobo K, Doktorova T, Faulkner D, Ford KA, Fortin MC, Frericks M, Gad-McDonald SE, Gellatly N, Gerets H, Gervais V, Glowienke S, Van Gompel J, Harvey JS, Hillegass J, Honma M, Hsieh JH, Hsu CW, Barton-Maclaren TS, Johnson C, Jolly R, Jones D, Kemper R, Kenyon MO, Kruhlak NL, Kulkarni SA, Kummerer K, Leavitt P, Masten S, Miller S, Moudgal C, Muster W, Paulino A, Lo Piparo E, Powley M, Quigley DP, Reddy MV, Richarz AN, Schilter B, Snyder RD, Stavitskaya L, Stidl R, Szabo DT, Teasdale A, Tice RR, Trejo-Martin A, Vuorinen A, Wall BA, Watts P, White AT, Wichard J, Witt KL, Woolley A, Woolley D, Zwickl C and Myatt GJ (2019). ‘Genetic toxicology in silico protocol’ Regulatory Toxicology and Pharmacology 2019: volume 107, page 104403

18. Honma M (2020). ‘An assessment of mutagenicity of chemical substances by (quantitative) structure-activity relationship’ Genes and Environment 2020: volume 42, page 23

19. Honma M, Kitazawa A, Cayley A, Williams RV, Barber C, Hanser T, Saiakhov R, Chakravarti S, Myatt GJ, Cross KP, Benfenati E, Raitano G, Mekenyan O, Petkov P, Bossa C, Benigni R, Battistelli CL, Giuliani A, Tcheremenskaia O, DeMeo C, Norinder U, Koga H, Jose C, Jeliazkova N, Kochev N, Paskaleva V, Yang C, Daga PR, Clark RD and Rathman J (2019). ‘Improvement of quantitative structure-activity relationship (QSAR) tools for predicting Ames mutagenicity: outcomes of the Ames/QSAR International Challenge Project’ Mutagenesis 2019: volume 34, issue 1, pages 3 to 16

20. ICH (2023a). ‘ICH M7(R2) Guideline on assessment and control of DNA reactive (mutagenic) impurities in pharmaceuticals to limit potential carcinogenic risk EMA/CHMP/ICH/83812/2013. Step 5.’

21. ICH (2023b). ‘ICH M7(R2) Questions and Answers on assessment and control of DNA reactive (mutagenic) impurities in pharmaceuticals to limit potential carcinogenic risk EMA/CHMP/ICH/321999/2020. Step 5.’

22. Jayasekara PS, Skanchy SK, Kim MT, Kumaran G, Mugabe BE, Woodard LE, Yang J, Zych AJ and Kruhlak NL (2021). ‘Assessing the impact of expert knowledge on ICH M7 (Q)SAR predictions. Is expert review still needed?’ Regulatory Toxicology and Pharmacology 2021: volume 125, page 105006

23. Kovarich S and Cappelli CI (2022). ‘Use of In Silico Methods for Regulatory Toxicological Assessment of Pharmaceutical Impurities’ Methods in Molecular Biology 2022: volume 2,425, pages 537 to 560

24. Landry C, Kim MT, Kruhlak NL, Cross KP, Saiakhov R, Chakravarti S and Stavitskaya L (2019). ‘Transitioning to composite bacterial mutagenicity models in ICH M7 (Q)SAR analyses’ Regulatory Toxicology and Pharmacology 2019: volume 109, page 104488

25. Lemee P, Fessard V and Habauzit D (2023). ‘Prioritization of mycotoxins based on mutagenicity and carcinogenicity evaluation using combined in silico QSAR methods’ Environmental Pollution 2023: volume 323, page 121284

26. Mombelli E, Raitano G and Benfenati E (2022). ‘In Silico Prediction of Chemically Induced Mutagenicity: A Weight of Evidence Approach Integrating Information from QSAR Models and Read-Across Predictions’ Methods in Molecular Biology 2022: volume 2,425, pages 149 to 183

27. OECD (2007). ‘OECD principles for the validation, for regulatory purposes, of (quantitative) structure-activity relationship models’

28. OECD (2017). ‘Guidance on Grouping of Chemicals, 2nd edition’

29. OECD (2023). ‘(Q)SAR Assessment Framework. Guidance for the regulatory assessment of (Quantitative) Structure Activity Relationship models, predictions, and results based on multiple predictions’ E.D.O. Environment Health and Safety

30. SCCS (2023). ‘The SCCS notes of guidance for the testing of cosmetic ingredients and their safety evaluation, 12th revision’

31. Tcheremenskaia O and Benigni R (2021). ‘Toward regulatory acceptance and improving the prediction confidence of in silico approaches: a case study of genotoxicity’ Expert Opinion on Drug Metabolism and Toxicology 2021: volume 17, issue 8, pages 987 to 1,005

32. Tolosa J, Serrano Candelas E, Valles Pardo JL, Goya A, Moncho S, Gozalbes R and Palomino Schatzlein M (2023). ‘MicotoXilico: An Interactive Database to Predict Mutagenicity, Genotoxicity, and Carcinogenicity of Mycotoxins’ Toxins 2023: volume 15, issue 6

33. Wichard JD (2017). ‘In silico prediction of genotoxicity’ Food and Chemical Toxicology 2017: volume 106, pages 595 to 599
