Lineage replacement and evolution captured by three years of the United Kingdom Covid Infection Survey
Data files
Sep 08, 2023 version files (3.84 GB total)
- consensus.fasta (3.84 GB)
- README.md (2.24 KB)
- SampleInfo.csv (6.05 MB)
- TestingInfo.csv (14.03 KB)
Abstract
The Office for National Statistics COVID-19 Infection Survey (ONS-CIS) is the largest surveillance study of SARS-CoV-2 positivity in the community and collected data on the United Kingdom (UK) epidemic from April 2020 until March 2023 before being paused. Here, we report on the epidemiological and evolutionary dynamics of SARS-CoV-2 determined by analysing the sequenced samples collected by the ONS-CIS during this period. We observed a series of sweeps or partial sweeps, with each sweeping lineage having a distinct growth advantage compared to its predecessors. The sweeps also generated an alternating pattern over time in which most samples either had S-gene target failure (SGTF) or did not (non-SGTF). Evolution was characterised by steadily increasing divergence and diversity within lineages, but with step increases in divergence associated with each sweeping major lineage. This led to a faster overall rate of evolution when measured at the between-lineage level than at the within-lineage level, and to fluctuating levels of diversity. These observations highlight the value of viral sequencing integrated into community surveillance studies to monitor the viral epidemiology and evolution of SARS-CoV-2, and potentially other pathogens, particularly in the current phase of the pandemic, now that routine RT-PCR testing in the community has ended.
Consensus sequences and associated metadata used in the above study.
Description of the data and file structure
consensus.fasta - a FASTA file containing approximately 125,000 consensus sequences, each labelled by its COG-UK ID and collection date.
SampleInfo.csv - a table giving key metadata for each consensus sequence, specifically collection date and Pango lineage.
TestingInfo.csv - a table giving, by week, the number of swabs, positive swabs, positive swabs with Ct<=30, and positive swabs successfully sequenced.
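Because consensus.fasta is several gigabytes, it is usually more practical to stream it than to load it whole. The following is a minimal sketch in Python using only the standard library. The function names `iter_fasta` and `count_records` are illustrative and not part of the dataset, and the exact layout of the header line (how the COG-UK ID and collection date are delimited) is not specified here, so the header is returned unparsed.

```python
from typing import Iterator, TextIO, Tuple


def iter_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from an open FASTA file, one record at a time.

    The header is returned without the leading '>' and is not parsed further,
    because the delimiter between COG-UK ID and collection date is an unknown here.
    """
    header = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence lines may be wrapped; collect them until the next header.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def count_records(path: str) -> int:
    """Count the sequences in a FASTA file without holding them in memory."""
    with open(path) as handle:
        return sum(1 for _ in iter_fasta(handle))
```

Calling `count_records("consensus.fasta")` on the downloaded file should return a figure close to the approximately 125,000 sequences stated above. The headers yielded by `iter_fasta` can then be matched against the rows of SampleInfo.csv once the column names in that file have been checked.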
Sharing/Access information
The sequences have also been deposited in the European Nucleotide Archive (ENA) at EMBL-EBI as part of the COG-UK consortium, under project accession number PRJEB37886 (https://www.ebi.ac.uk/ena/browser/view/PRJEB37886).
Code/Software
Samples were collected as part of the ONS COVID-19 Infection Survey and sequenced as part of the COG-UK consortium.