Model design for non-parametric phylodynamic inference and applications to pathogen surveillance

The COVID-19 Genomics UK (COG-UK) Consortium, Xavier Didelot, Lily Geidelberg, Erik M. Volz, Matthew Bashton, Darren Smith, Andrew Nelson, Gregory R. Young

Research output: Working paper / Preprint


Inference of effective population size from genomic data can provide unique information about demographic history and, when applied to pathogen genetic data, can also provide insights into epidemiological dynamics. The combination of non-parametric models for population dynamics with molecular clock models that relate genetic data to time has enabled phylodynamic inference based on large sets of time-stamped genetic sequence data. The methodology for non-parametric inference of effective population size is well developed in the Bayesian setting, but here we develop a frequentist approach based on non-parametric latent process models of population size dynamics. We appeal to statistical principles based on out-of-sample prediction accuracy to optimize parameters that control the shape and smoothness of the population size over time. We demonstrate the flexibility and speed of this approach in a series of simulation experiments, and apply the methodology to reconstruct the previously described waves in the seventh pandemic of cholera. We also estimate the impact of non-pharmaceutical interventions for COVID-19 in England using thousands of SARS-CoV-2 sequences. By incorporating a measure of the strength of these interventions over time within the phylodynamic model, we estimate the impact of the first national lockdown in the UK on the epidemic reproduction number.

Original language: English
Publisher: Cold Spring Harbor Laboratory Press
Publication status: Published - 16 Aug 2021
