CpG Site Density in the Genome Predicts Species Maximum Life Span

Researchers investigating epigenetic modifications and their relationship to aging have found that the density of CpG sites, where DNA methylation occurs in order to modify the pace of production of specific proteins, correlates with maximum species life span. This is an interesting finding but, as with the epigenetic clock used to assess aging in individuals, it will likely take many research groups and many years to build a firm understanding of why this correlation exists.

Ageing involves the decline of diverse biological functions and the dynamics of this process limit species maximum lifespan. Longevity of individuals is strongly linked to specific alleles in genetic model organisms. Ageing is also associated with several epigenetic changes involving DNA methylation (DNAm). DNAm of cytosine-phosphate-guanosine (CpG) sites involves a covalent modification to cytosine to form 5-methylcytosine. This modification to DNA has the potential to regulate gene expression, including that of genes critical for longevity, without altering the underlying sequence.

The observation that DNAm at promoter CpG sites can accumulate or decline predictably with age, over and above the more random process of epigenetic drift, has enabled the development of "clock-like" biomarkers for age. Individual human age, for example, can be predicted with great accuracy in a range of tissues by an epigenetic clock. Similar epigenetic clocks have been created in a range of mammal and bird species.

Maximum lifespans differ greatly among species, even among fairly closely related species. In vertebrates, species such as the pygmy goby (Eviota sigillata) live for only eight weeks, while the Greenland shark (Somniosus microcephalus) may live for more than 400 years. In mammals, the forest shrew (Myosorex varius) has one of the shortest reported lifespans at 2.1 years, whereas some bowhead whales (Balaena mysticetus) have been reported to be older than 200 years. Despite its profound importance, lifespan is poorly characterised for most wild animals because it is difficult to estimate.

Maximum lifespan is believed to be under genetic control, but so far, no gene variants can account for differences in lifespan among species. Because ageing is characterised by changes in gene expression caused by DNAm, another potential controller of lifespan is genomic changes that accommodate DNAm's effects on regulation of gene expression. Specifically, clusters of high density CpG sites, also known as CpG islands, are highly conserved within promoter sequences and well known for regulating gene expression. CpG sites are also prone to mutation and their function in regulating gene expression may make them prime targets for evolutionary pressures to vary lifespans.

Here, we extend observations of the correlation between promoter CpG density and lifespan in mammals to produce a predictive model for lifespan in all vertebrates. We use reference genomes of animals with known lifespans to identify promoters that can be predictive of lifespan. We combine data from major databases including NCBI Genomes, the Eukaryotic Promoter Database (EPD), Animal Ageing and Longevity Database (AnAge) and TimeTree to build a predictive model that estimates lifespan. Our results show CpG density in selected promoters is highly predictive of lifespan across vertebrates. To our knowledge this is the first study to build a predictive model that estimates the lifespan of vertebrate species from genetic markers.
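
For readers unfamiliar with the quantity being discussed, CpG density can be illustrated with a few lines of Python. This is only a minimal sketch under stated assumptions: the excerpt above does not give the paper's exact definition of density or its modelling method, so the function name `cpg_density`, the choice to normalise by the number of dinucleotide positions, and the toy sequences are all hypothetical and used purely for illustration.

```python
def cpg_density(seq: str) -> float:
    """Return the fraction of dinucleotide positions in seq that are CpG.

    Assumption: density is defined here as (count of "CG") / (length - 1).
    The paper may use a different normalisation.
    """
    seq = seq.upper()
    if len(seq) < 2:
        return 0.0
    # "CG" cannot overlap with itself, so str.count gives the exact count.
    return seq.count("CG") / (len(seq) - 1)


# Hypothetical toy sequences, not real promoter sequences.
cpg_rich = "ACGCGTCGACGCGGCG"
cpg_poor = "ATTTAGCATTAAGGCT"

print(cpg_density(cpg_rich))
print(cpg_density(cpg_poor))
```

In a study of the kind described, a value like this would be computed for each selected promoter in each species' reference genome, and those values would then serve as the input features for a model that predicts lifespan.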

Link: https://doi.org/10.1038/s41598-019-54447-w