The falling cost of gene sequencing allows genetic data to be incorporated into studies of ever larger populations. At least hundreds of thousands of entire human genomes have been sequenced, and more selective sequencing has been undertaken for millions more. This data is now beginning to show up in epidemiological studies that tackle questions of health, choice, aging, and longevity.
What should we expect to see emerge from this scientific analysis? The extensive existing evidence, drawn from the many association studies carried out in search of gene variants correlated with longevity, makes it fairly clear that a large number of genes contribute to lifespan. Collectively these genes influence the highly complex relationship between the operation of metabolism and the pace of aging, but the contribution to longevity resulting from any one gene is small.
Further, the contribution of a single gene to aging and longevity is usually strongly contingent on environmental factors or the presence of other gene variants. As a result, an association with longevity discovered in one study population is rarely replicated in others. Only a very few genes have exhibited a robust correlation with longevity in multiple studies, and their effect sizes are (with one exception) quite small.
When it comes to the overall interaction between genes and longevity, many lines of evidence lead the scientific community to believe that the genetic contribution to human variation in aging is smaller than the environmental contribution. Those environmental factors include lifestyle choices, burden of infection, and so forth. The study here reinforces that consensus, producing a model that predicts the difference in life expectancy between the best and worst human genomes to be somewhat smaller than the difference between a good lifestyle and a bad lifestyle, as established in other epidemiological studies.
Researchers set out to identify key genetic drivers of lifespan. In the largest genome-wide association study of lifespan to date, they paired genetic data from more than 500,000 participants in the UK Biobank and other cohorts with data on the lifespan of each participant's parents. Rather than studying the effects of one or more selected genes on lifespan, they looked across the whole genome to answer the question in a more open-ended way and identify new avenues to explore in future work.
Because the effect of any given gene is so small, the large sample size was necessary to identify genes relevant to lifespan with enough statistical power. Using this sample, the researchers validated six previously identified associations between genes and aging, such as the APOE gene, which has been tied to risk of neurodegenerative disease. They also discovered 21 new genomic regions that influence lifespan.
They used their results to develop a polygenic risk score for lifespan: a single, personalized genomic score that estimates a person's genetic likelihood of a longer life. Based on weighted contributions from relevant genetic variants, this score allowed the researchers to predict which participants were likely to live longest. "Using a person's genetic information alone, we can identify the 10 percent of people with the most protective genes, who will live an average of five years longer than the least protected 10 percent."
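As a rough illustration of what "weighted contributions from relevant genetic variants" means, a polygenic score is commonly computed as a weighted sum of a person's allele counts. The sketch below assumes that form; the function name, the weights, and the dosages are hypothetical and are not taken from the study.

```python
from typing import Sequence


def polygenic_score(dosages: Sequence[float], weights: Sequence[float]) -> float:
    """Return the weighted sum of allele dosages.

    dosages: copies of the effect allele carried at each variant (0, 1, or 2).
    weights: per-variant effect estimates, one for each variant.
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


# Hypothetical values for illustration only, not results from the study.
example_weights = [0.5, -0.25, 1.0]
example_dosages = [2, 1, 0]
example_score = polygenic_score(example_dosages, example_weights)
```

A higher score here simply means that a person carries more of the alleles that were assigned positive weights; how such weights are estimated and validated is a matter for the study itself.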
Living long and healthy lives is of great interest to us all, yet investigation into the genomic basis of lifespan has been hampered by limited sample sizes, both in terms of gene discovery and identification of longevity pathways. Applying univariate, multivariate, and risk factor-informed genome-wide association to 1,012,240 parental lifespans from European subjects in UK Biobank and an independent replication cohort, we validate previous associations near CDKN2B-AS1, ATXN2/BRAP, FURIN/FES, FOXO3A, 5q33.3/EBF1, ZW10, PSORS1C3, 13q21.31, and provide evidence against associations near CLU, CHRNA4, PROX2, and d3-GHR.
Our combined dataset reveals 21 further loci and shows, using gene set and tissue-specific analyses, that genes expressed in foetal brain cells and adult prefrontal cortex are enriched for genetic variation affecting lifespan, as are gene pathways involving lipoproteins, lipid homeostasis, vesicle-mediated transport, and synaptic function.
We next perform a lookup of disease SNPs and find variants linked to dementia, smoking/lung cancer, and cardiovascular risk explain the largest amount of variation in lifespan. This, and the notable absence of cancer susceptibility SNPs (other than lung cancer) among the top lifespan variants, suggests larger, more common genetic effects on lifespan reflect modern lifestyle-based susceptibilities. Finally, we create polygenic scores for survival in independent sub-cohorts and partition populations, using DNA information alone, into deciles of expectation of life with a difference of more than five years from top to bottom decile.
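The decile comparison described above can be sketched in a few lines: rank individuals by polygenic score, take the top and bottom tenths, and compare their mean lifespans. This is a simplified illustration that assumes lifespans are directly observed for each scored individual. The study itself worked with parental lifespans and survival analysis, and the function name here is hypothetical.

```python
from statistics import mean
from typing import Sequence


def decile_gap(scores: Sequence[float], lifespans: Sequence[float]) -> float:
    """Mean lifespan of the top score decile minus that of the bottom decile."""
    if len(scores) != len(lifespans):
        raise ValueError("scores and lifespans must have the same length")
    n = len(scores)
    if n < 10:
        raise ValueError("need at least 10 individuals to form deciles")
    # Indices of individuals ordered from lowest to highest score.
    order = sorted(range(n), key=lambda i: scores[i])
    k = n // 10  # number of individuals in each decile
    bottom = [lifespans[i] for i in order[:k]]
    top = [lifespans[i] for i in order[-k:]]
    return mean(top) - mean(bottom)
```

With real data the gap would be estimated in a sub-cohort that was not used to derive the score weights, as the abstract notes, so that the comparison is not inflated by overfitting.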