Methods: Deep-sequencing data of SARS-CoV-2 from public databases and from clinical samples were analyzed to identify and map genetic variants and sub-genomic RNA transcripts across the genome.
Results: Sequence analysis suggests that the three adjacent nucleotide changes that produce the K203/R204 variant arose by homologous recombination from the core sequence (CS) of the leader transcription-regulating sequence (TRS) rather than by stepwise mutation. The resulting sequence changes generate a novel sub-genomic RNA transcript for the C-terminal dimerization domain of nucleocapsid. Deep-sequencing data from 981 clinical samples confirmed the presence of the novel TRS-CS-dimerization domain RNA in individuals infected with viruses carrying the K203/R204 variant. Quantification of sub-genomic RNA indicates that viruses with the K203/R204 variant may also have increased expression of sub-genomic RNA from other open reading frames.
Conclusions: Homologous recombination from the TRS may have occurred since the introduction of SARS-CoV-2 into humans, producing both coding changes and novel sub-genomic RNA transcripts. This suggests that such recombination is a mechanism for diversification and adaptation of the virus within its new host.
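
The Results describe three adjacent substitutions that introduce a copy of the TRS core sequence into the nucleocapsid gene. The sketch below illustrates one way such a gained motif could be detected by comparing two aligned fragments. It is not the authors' pipeline: the motif `ACGAAC` is the commonly cited SARS-CoV-2 TRS core sequence and is not stated in the text above, the two fragments are hypothetical toy strings, and the function names `find_motif` and `gained_sites` are invented for this illustration.

```python
from __future__ import annotations

# Assumption: ACGAAC is the commonly cited SARS-CoV-2 TRS core sequence;
# the abstract itself does not spell the motif out.
TRS_CORE = "ACGAAC"


def find_motif(seq: str, motif: str = TRS_CORE) -> list[int]:
    """Return 0-based start positions of every (possibly overlapping) motif hit."""
    seq = seq.upper()
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


def gained_sites(reference: str, variant: str, motif: str = TRS_CORE) -> list[int]:
    """Return motif positions present in the variant but absent from the reference.

    Assumes the two sequences are already aligned and differ only by
    substitutions, so they must have equal length.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must be aligned and of equal length")
    reference_hits = set(find_motif(reference, motif))
    return [pos for pos in find_motif(variant, motif) if pos not in reference_hits]


# Hypothetical toy fragments that differ by three adjacent substitutions
# (GGG -> AAC at 0-based positions 4-6); they are not taken from the paper's data.
reference_fragment = "TCTAGGGGAACTTCT"
variant_fragment = "TCTAAACGAACTTCT"

if __name__ == "__main__":
    print(gained_sites(reference_fragment, variant_fragment))
```

A real analysis would work from aligned genomes or mapped reads rather than short strings, and would need to handle insertions and deletions, which this sketch deliberately ignores.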