Addressing pandemic-wide systematic errors in the SARS-CoV-2 phylogeny

Martin Hunt, Angie S. Hinrichs, Daniel Anderson, Lily Karim, Bethany L. Dearlove, Jeff Knaggs, Bede Constantinides, Philip W. Fowler, Gillian Rodger, Teresa Street, Sheila Lumley, Hermione Webster, Theo Sanderson, Christopher Ruis, Benjamin Kotzen, Nicola de Maio, Lucas N. Amenga-Etego, Dominic S. Y. Amuzu, Martin Avaro, Gordon A. Awandare, Reuben Ayivor-Djanie, Timothy Barkham, Matthew Bashton, Elizabeth M. Batty, Yaw Bediako, Denise De Belder, Estefania Benedetti, Andreas Bergthaler, Stefan A. Boers, Josefina Campos, Rosina Afua Ampomah Carr, Yuan Yi Constance Chen, Facundo Cuba, Maria Elena Dattero, Wanwisa Dejnirattisai, Alexander Dilthey, Kwabena Obeng Duedu, Lukas Endler, Ilka Engelmann, Ngiambudulu M. Francisco, Jonas Fuchs, Etienne Z. Gnimpieba, Soraya Groc, Jones Gyamfi, Dennis Heemskerk, Torsten Houwaart, Nei-Yuan Hsiao, Matthew Huska, Martin Hölzer, Arash Iranzadeh, Hanna Jarva, Chandima Jeewandara, Bani Jolly, Rageema Joseph, Ravi Kant, Karrie Ko Kwan Ki, Satu Kurkela, Maija Lappalainen, Marie Lataretu, Jacob Lemieux, Chang Liu, Gathsaurie Neelika Malavige, Tapfumanei Mashe, Juthathip Mongkolsapaya, Brigitte Montes, Jose Arturo Molina Mora, Collins M. Morang’a, Bernard Mvula, Niranjan Nagarajan, Andrew Nelson, Joyce M. Ngoi, Joana Paula da Paixão, Marcus Panning, Tomas Poklepovich, Peter K. Quashie, Diyanath Ranasinghe, Mara Russo, James Emmanuel San, Nicholas D. Sanderson, Vinod Scaria, Gavin Screaton, October Michael Sessions, Tarja Sironen, Abay Sisay, Darren Smith, Teemu Smura, Piyada Supasa, Chayaporn Suphavilai, Jeremy Swann, Houriiyah Tegally, Bryan Tegomoh, Olli Vapalahti, Andreas Walker, Robert J. Wilkinson, Carolyn Williamson, Xavier Zair, IMSSC Laboratory Network Consortium, Tulio de Oliveira, Timothy E.A. Peto, Derrick Crook, Russell Corbett-Detig, Zamin Iqbal*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

The majority of SARS-CoV-2 genomes obtained during the pandemic were derived by amplifying overlapping windows of the genome ('tiled amplicons'), reconstructing their sequences and fitting them together. This leads to systematic errors in genomes unless the software is aware of both the amplicon scheme and the error modes of amplicon sequencing. Additionally, over time, amplicon schemes need to be updated as new mutations in the virus interfere with the primer binding sites at the ends of amplicons. Thus, waves of variants swept the world during the pandemic and were followed by waves of systematic errors in the genomes, which had significant impacts on the inferred phylogenetic tree.

Here we reconstruct the genomes from all public data as of June 2024 using an assembly tool called Viridian (https://github.com/iqbal-lab-org/viridian), developed to rigorously process amplicon sequence data. With these high-quality consensus sequences we provide a global phylogenetic tree of 4,471,579 samples, viewable at https://viridian.taxonium.org. We provide simulation and empirical validation of the methodology, and quantify the improvement in the phylogeny.

Original language: English
Pages (from-to): 653-662
Number of pages: 10
Journal: Nature Methods
Volume: 23
Issue number: 3
Early online date: 9 Feb 2026
Publication status: Published - 1 Mar 2026

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 3 - Good Health and Well-being

Keywords

  • COVID-19/virology
  • Genome, Viral
  • Humans
  • Pandemics
  • Phylogeny
  • SARS-CoV-2/genetics
  • Software
