DNA methylation-based age prediction using massively parallel sequencing data and multiple machine learning models

Anastasia Aliferi, David Ballard*, Matteo D. Gallidabino, Helen Thurtle, Leon Barron, Denise Syndercombe Court

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

75 Citations (Scopus)
104 Downloads (Pure)


The field of DNA intelligence focuses on retrieving information from DNA evidence that can help narrow down large groups of suspects or define target groups of interest. With recent breakthroughs in the estimation of geographical ancestry and physical appearance, the estimation of chronological age completes this circle of information. Recent studies have identified methylation sites in the human genome that correlate strongly with age and can be used to develop age-estimation algorithms. In this study, 110 whole blood samples from individuals aged 11–93 years were analysed using a DNA methylation quantification assay based on bisulphite conversion and massively parallel sequencing (Illumina MiSeq) of 12 CpG sites. Using these data, 17 different statistical modelling approaches were compared on the basis of root mean square error (RMSE), and a Support Vector Machine with polynomial function (SVMp) model was selected for further testing. For the selected model (RMSE = 4.9 years), the mean absolute error (MAE) of the blind test (n = 33) was 4.1 years, with 52% of the samples predicted with less than 4 years of error and 86% with less than 7 years. Furthermore, the sensitivity of the method was assessed in terms of both methylation quantification accuracy and prediction accuracy, in the first validation of this kind. The described method retained its accuracy down to 10 ng of initial DNA input or ∼2 ng of bisulphite PCR input. Finally, 34 saliva samples were analysed and, following basic normalisation, the chronological age of the donors was predicted with less than 4 years of error for 50% of the samples and with less than 7 years of error for 70%.
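The abstract reports three kinds of accuracy figures: RMSE, MAE, and the share of samples predicted within a given error margin. As a minimal sketch of how such figures can be computed from paired chronological and predicted ages, the Python below uses only the standard library. The function names and the five toy age pairs are illustrative assumptions; they are not the study's code or data, and the sketch does not reproduce the SVMp model itself.

```python
import math


def rmse(true_ages, predicted_ages):
    """Root mean square error, in the same unit as the ages (years)."""
    squared = [(p - t) ** 2 for t, p in zip(true_ages, predicted_ages)]
    return math.sqrt(sum(squared) / len(squared))


def mae(true_ages, predicted_ages):
    """Mean absolute error, in the same unit as the ages (years)."""
    absolute = [abs(p - t) for t, p in zip(true_ages, predicted_ages)]
    return sum(absolute) / len(absolute)


def fraction_within(true_ages, predicted_ages, max_error_years):
    """Share of samples whose absolute error is strictly below the margin."""
    hits = sum(
        1
        for t, p in zip(true_ages, predicted_ages)
        if abs(p - t) < max_error_years
    )
    return hits / len(true_ages)


# Made-up toy values for illustration only; NOT data from the study.
true_ages = [20, 35, 50, 65, 80]
predicted_ages = [23, 33, 56, 64, 72]

print("RMSE (years):", round(rmse(true_ages, predicted_ages), 2))
print("MAE (years):", mae(true_ages, predicted_ages))
print("Share with < 4 years error:", fraction_within(true_ages, predicted_ages, 4))
print("Share with < 7 years error:", fraction_within(true_ages, predicted_ages, 7))
```

Because RMSE squares each error before averaging, a few large misses raise it more than they raise MAE, which is why the two figures can differ for the same set of predictions.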

Original language: English
Pages (from-to): 215-226
Number of pages: 12
Journal: Forensic Science International: Genetics
Early online date: 8 Sept 2018
Publication status: Published - Nov 2018
Externally published: Yes


