Using Explainable AI in the Production of Biological Age Measures
Standard approaches to generating aging clocks from biological data produce opaque algorithmic combinations of factors. It is unclear how these combinations relate to the underlying mechanisms of damage and dysfunction that produce degenerative aging, and it is therefore hard to use them as a tool to assess ways to modify those mechanisms. Explainable artificial intelligence is a term of art for approaches to machine learning that produce more insight into how the final product actually works, what factors went into its construction, and how it relates to underlying processes. Given that the primary challenge in measuring biological age, such as via epigenetic clocks, is that we don't understand how these clocks relate to specific causes and processes of aging, it seems sensible to make more of an effort to produce aging clocks that are comprehensible from the outset. The work here is a step in that direction.
Existing biological age clocks have three main limitations. First, they necessitate a trade-off between accuracy (ie, predictive performance for chronological age or mortality) and interpretability (ie, understanding each feature's contribution to the prediction). Most of them use linear models that offer interpretability but weaker predictive power for mortality than complex machine-learning models. This choice is natural given that interpretability is a key goal of biological age clocks: identifying biomarkers of biological age can improve our understanding of the ageing process and help develop drugs that target ageing-related dysfunction. Although advanced machine-learning models have created first-generation biological age models using diverse data types such as epigenetic features, blood markers, electrocardiogram features, brain MRI features, and transcriptomic features, these models are hard to interpret and do not have individualised explanations. To build models that are both accurate and interpretable, we turn to the emerging area of explainable artificial intelligence (XAI).
The second limitation is that interpretations of previous biological age clocks might not address important scientific questions. Previous biological age studies primarily explain the model as a whole (global explanation). However, given the substantial variations in ageing processes among individuals, individualised explanations are crucial for comprehending complex ageing mechanisms. We leveraged recent XAI methods to provide principled individualised (local) explanations on the basis of feature attributions. Typically, feature attributions can be difficult to understand for non-machine-learning practitioners because they are usually expressed in units of predicted probability or logits. To make our biological age explanations more accessible, we rescaled our attributions to the age scale in units of years so that the rescaled attributions sum to the biological age acceleration (AgeAccel) of an individual.
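To make the rescaling idea concrete, here is a minimal sketch in Python. It is not the authors' procedure: the excerpt states only that the rescaled attributions sum to an individual's AgeAccel, so this sketch assumes the simplest way to satisfy that property, a proportional rescaling of attributions given in model units. The function name, the feature names, and the numbers are all hypothetical, and the sketch further assumes that the raw attributions have a non-zero sum.

```python
def rescale_attributions_to_years(attributions, age_accel_years):
    """Proportionally rescale per-feature attributions (model units, e.g. logits)
    so that they sum to the individual's age acceleration in years.

    Assumption: proportional rescaling is an illustrative choice only; the
    source text does not specify how the rescaling is actually performed.
    """
    total = sum(attributions.values())
    if abs(total) < 1e-12:
        # A zero total cannot be scaled to a non-zero AgeAccel.
        raise ValueError("attributions sum to zero; cannot rescale proportionally")
    scale = age_accel_years / total  # years per model unit
    return {name: value * scale for name, value in attributions.items()}


# Hypothetical individual: attributions in model units and an AgeAccel of 4 years.
example_attributions = {"hba1c": 0.30, "creatinine": 0.15, "albumin": -0.05}
rescaled = rescale_attributions_to_years(example_attributions, age_accel_years=4.0)

for feature, years in rescaled.items():
    print(f"{feature}: {years:+.2f} years")
print(f"total: {sum(rescaled.values()):.2f} years")
```

Under this assumption, each feature's contribution can be read directly as years added to or subtracted from the individual's biological age relative to chronological age, which is the accessibility benefit the passage describes. Note that if the raw total and AgeAccel have opposite signs, a proportional rescaling flips the sign of every attribution, which is one reason a real implementation may use a different mapping.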
The third limitation of current biological age clocks is their inability to incorporate several age-related outcomes, such as cause-specific mortalities. Their inability to account for these factors restricts our understanding of important features for different ageing processes. This shortcoming is problematic because biological ageing is enormously complex and thought to be driven by many biological processes. Previous studies noted low agreement between biological age clocks in terms of their correlations with each other and associations with ageing traits, implying that they measure different aspects of biological age. To solve this, we developed our biological age clocks by predicting diverse age-related outcomes, such as specific mortalities and morbidities, allowing us to target and specify particular underlying ageing mechanisms that our clocks capture.
Here we introduce ExplaiNAble BioLogical Age (ENABL Age), a new approach to estimate and interpret biological age that combines complex machine learning and XAI methods. We performed a comprehensive validation of ENABL Age using the UK Biobank and National Health and Nutrition Examination Survey (NHANES) datasets, assessing its ability to capture ageing mechanisms and offering concrete examples of its interpretability.