For some years, the ability to gather biological data has far outpaced the ability to analyze it usefully. Researchers now catalog the genome, the epigenome, the proteome, the transcriptome, and more, repeated across many thousands of animals and humans. Enormous vaults of data exist in all branches of the life sciences, enough to keep researchers occupied for decades. To speed up analysis and understanding, scientists are increasingly applying modern machine learning tools to life science data. This is still an incremental process, but a faster one.
The blood transcriptome is expected to provide a detailed picture of an organism's physiological state, with potential applications in medical diagnostics and in molecular and epidemiological research. Here we present an analysis of blood specimens from 3,388 adult individuals, together with phenotype characteristics such as disease history, medication status, lifestyle factors, and body mass index (BMI). The size and heterogeneity of these data pose challenges for analytics in terms of dimension reduction, knowledge mining, feature extraction, and data integration.
Self-organizing map (SOM) machine learning was applied to study transcriptional states on a population-wide scale. This method permits a detailed description and visualization of the molecular heterogeneity of transcriptomes and of their association with different phenotypic features.
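To make the idea of a self-organizing map concrete, here is a minimal NumPy sketch of the generic SOM training loop. It is not the authors' pipeline. It assumes, for illustration only, that the map is trained on gene expression profiles (one row per gene, one column per sample), so that each grid node acts as a "metagene" and a sample's portrait is the grid of node values for that sample. The function names `train_som` and `map_to_bmu`, the grid size, the learning schedule, and the synthetic data are all hypothetical choices.

```python
import numpy as np

def train_som(data, grid_shape=(5, 5), n_iter=500, lr0=0.5, sigma0=None, seed=0):
    """Train a small rectangular SOM on the rows of `data` (items x features)."""
    rng = np.random.default_rng(seed)
    rows, cols = grid_shape
    n_features = data.shape[1]
    # Initialize node weights from randomly chosen input rows.
    weights = data[rng.integers(0, len(data), rows * cols)].astype(float)
    weights = weights.reshape(rows, cols, n_features)
    # Grid coordinates of every node, used by the neighborhood function.
    grid = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    if sigma0 is None:
        sigma0 = max(rows, cols) / 2.0
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                   # linearly decaying learning rate
        sigma = max(sigma0 * (1.0 - frac), 0.5)   # shrinking neighborhood radius
        x = data[rng.integers(0, len(data))]      # one random training row
        # Best matching unit: the node whose weights are closest to x.
        dist = np.linalg.norm(weights - x, axis=-1)
        bmu = np.unravel_index(np.argmin(dist), dist.shape)
        # Gaussian neighborhood around the BMU, measured on the grid.
        d2 = np.sum((grid - np.array(bmu)) ** 2, axis=-1)
        h = np.exp(-d2 / (2.0 * sigma ** 2))
        # Pull every node toward x, weighted by its grid distance to the BMU.
        weights += lr * h[..., None] * (x - weights)
    return weights

def map_to_bmu(weights, data):
    """Return the (row, col) grid position of the best matching unit per row."""
    rows, cols, _ = weights.shape
    flat = weights.reshape(rows * cols, -1)
    dists = np.linalg.norm(data[:, None, :] - flat[None, :, :], axis=-1)
    idx = np.argmin(dists, axis=1)
    return np.stack(np.unravel_index(idx, (rows, cols)), axis=1)

# Synthetic toy data (an assumption, not real expression values):
# 200 "genes" measured in 6 "samples", forming two opposing expression patterns.
rng = np.random.default_rng(42)
pattern_a = np.array([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])
genes = np.vstack([
    pattern_a + 0.1 * rng.normal(size=(100, 6)),
    -pattern_a + 0.1 * rng.normal(size=(100, 6)),
])

som = train_som(genes, grid_shape=(6, 6), n_iter=2000)
gene_nodes = map_to_bmu(som, genes)   # grid position assigned to each gene
portrait_sample0 = som[:, :, 0]       # 6x6 "portrait" of the first sample
```

Under these assumptions, genes with similar profiles land on the same or neighboring nodes, so each per-sample portrait shows co-expressed groups as contiguous regions of high or low values. A dedicated SOM package would add batch training, better initialization, and quality metrics that this sketch omits.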
The diversity of transcriptomes is described by personalized SOM portraits, which characterize the samples in terms of modules of co-expressed genes of different functional context. We identified two major blood transcriptome types. Type 1 was found more often in men, the elderly, and overweight people, and showed upregulation of genes associated with inflammation and increased heme metabolism. Type 2 was found predominantly in women, younger, and normal-weight participants, and was associated with activated immune responses and with transcriptional, ribosomal, mitochondrial, and telomere-maintenance cell functions. We find a striking overlap of signatures shared by multiple diseases, aging, and obesity, driven by an underlying common pattern that was associated with the immune response and an increase in inflammatory processes.