My most recent article was a quick review of what was known about the SARS-CoV-2 Omicron variant, in the post 2021, The Year of the Various SARS-CoV-2 Variants of Concern – from Alpha to Omicron. While researching that article I came across an intriguing preprint describing a potential source for the ins214EPE mutation in Omicron (Venkatakrishnan et al., 2021). One hypothesis in that paper was particularly interesting: that the source of ins214EPE might be another coronavirus, one that seasonally infects humans. This seemed noteworthy to me because Omicron is the first Variant of Concern (VOC)/Variant of Interest (VOI) to contain such an insertion among its characterizing mutations. Both the unique molecular biology hypothesized to be at the heart of the story and the potential impact if such a mechanism is repeated prompted this brief story.
NTD is an interesting mutation target
Most mutations in the SARS-CoV-2 spike glycoprotein in all of the variants up to and including Delta were detected either in the N-terminal domain (NTD) or the receptor-binding domain (RBD), with a smaller scattering in other areas (see our Review of SARS-CoV-2 Variants and references therein). Most cataloged RBD mutations in VOC/VOI affect either antibody recognition or angiotensin-converting enzyme 2 (ACE2) receptor binding, though some affect both. Because of the location of the NTD on the surface of the spike molecule, known NTD mutations are restricted to altering antibody recognition. Many of the high-prevalence mutations in VOC/VOI are within the RBD, and these are almost always single amino acid missense mutations. In the NTD, a mix of single amino acid changes and short deletions has been found. It has been established that NTD deletions are concentrated in specific areas termed recurrent deletion regions (RDRs) (McCarthy et al., 2021). Omicron also carries several NTD deletions, but it was unique in having an insertion of three amino acids immediately after residue 214.
How do insertion mutations occur in SARS-CoV-2?
Members of the betacoronavirus genus of the Coronaviridae family sustain frequent insertion mutations, and these are evolutionary drivers. In particular, insertion mutations in the sarbecovirus subgenus differentiate highly pathogenic strains from lower or non-pathogenic ones. These insertions arise through one of two major mechanisms: template switching by the RNA-dependent RNA polymerase (RdRp), and slippage/duplication of sequence by the same enzyme (Garushyants et al., 2021), as illustrated in Figure 1. Template switching is a normal functional part of the coronavirus life cycle, as it is used in the production of subgenomic RNA (sgRNA). A study early in the pandemic found that indels in SARS-CoV-2 sequences deposited in GISAID up to March of 2021 clustered at specific regions of the genome associated with sites of template switching (Chrisman et al., 2021). The very nature of SARS-CoV-2 molecular biology thus provides a route for the generation of novel insertion mutations!
Figure 1: Possible mechanisms by which the RNA-dependent RNA polymerase (RdRp) generates insertions: (a) polymerase slippage, (b) local duplication, and (c) template switch. Source: Garushyants et al., 2021.
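To make the distinction between these mechanisms a little more concrete, here is a minimal sketch of how one might sort an observed insertion by where its sequence already occurs. It is a toy heuristic of my own, not the method used by Garushyants et al. (2021): the function name `classify_insertion`, the `window` parameter, the three labels, and the toy sequence are all assumptions made for illustration. It also ignores reverse-complement (antigenomic) matches and does not search other viral or host sequences.

```python
def classify_insertion(genome: str, pos: int, insert: str, window: int = 50) -> str:
    """Label an insertion by where (if anywhere) its sequence already occurs.

    genome : reference sequence without the insertion (hypothetical input)
    pos    : 0-based index; the insert sits between genome[:pos] and genome[pos:]
    insert : the inserted nucleotides
    window : number of flanking bases on each side that count as "local"
    """
    flank = genome[max(0, pos - window): pos + window]
    if insert in flank:
        # Same sequence sits right next to the insertion site:
        # consistent with slippage or local duplication.
        return "local_duplication"
    if insert in genome:
        # Same sequence exists elsewhere in the genome:
        # a candidate for template switching within the same genome.
        return "distant_template"
    # No match in this genome: the source, if any, lies elsewhere
    # (another virus, host RNA) or cannot be determined this way.
    return "no_match_in_genome"


# Toy 32-nt sequence, invented purely for demonstration.
TOY = "ATGGCTAGCTTACGGATCCAAGTTTGACCTGA"

print(classify_insertion(TOY, 20, "GGATCC", window=10))   # local_duplication
print(classify_insertion(TOY, 28, "GCTAGC", window=5))    # distant_template
print(classify_insertion(TOY, 12, "CCCCCCCC", window=5))  # no_match_in_genome
```

Exact substring matching keeps the sketch short; real insertion analyses would need to tolerate mismatches and consider both strands.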
Interesting hypothesis or worrying precedent?
In parallel, Venkatakrishnan et al. (2021) made an interesting early observation that the Omicron insertion ins214EPE might have been acquired by template switching involving heterologous viral or human host sequences in infected cells. They found that a sequence identical to the EPE insertion is present in the seasonal coronavirus HCoV-229E, which commonly infects humans. A valid hypothesis is thus that the genomic RNA of an Omicron predecessor underwent template switching with an antigenomic HCoV-229E RNA in a human cell co-infected with both viruses. This sort of co-infection and recombination among coronaviruses has precedent (Venkatakrishnan et al., 2021). The authors also point out that the same sequence could have originated from the host genome: there are many human genome fragments with an EPE coding sequence, some of which are transcribed. These data and this hypothesis fit nicely with another study published just as Omicron was discovered (Gerdol et al., 2021). Gerdol and co-authors (2021) discovered that, much like the deletion sites in the NTD, there is a hotspot for insertions in the spike NTD, characterized by acquisitions of 1-8 additional codons between Val213 and Leu216. They termed this recurrent insertion region 1 (RIR1). Thus, the highly antigenic NTD has multiple sites prone to both deletions and insertions, and this likely provides a potent way for SARS-CoV-2 to evolve away from neutralizing antibody recognition.
Figure 2: A possible mechanism of template switching leading to the generation of ins214EPE in Omicron (A), and the nucleotide sequence corresponding to the Omicron insert aligned with a homologous sequence from HCoV-229E (B). Source: Venkatakrishnan et al. (2021).
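The RIR1 definition quoted above (1-8 additional codons acquired between Val213 and Leu216) is simple enough to express as a filter, which is roughly what a monitoring script might do when scanning spike insertion calls. The sketch below is only an illustration under my own assumptions: the `Insertion` record, the `in_rir1` function, and the convention that an insertion is recorded by the residue it follows are hypothetical, and apart from ins214EPE the example records are invented.

```python
from dataclasses import dataclass

# Boundaries taken from the description of RIR1 in the text:
# insertions of 1-8 codons falling between Val213 and Leu216.
RIR1_FIRST_RESIDUE = 213
RIR1_LAST_RESIDUE = 216
MIN_CODONS, MAX_CODONS = 1, 8


@dataclass(frozen=True)
class Insertion:
    after_residue: int  # spike residue that the insertion immediately follows
    inserted_aa: str    # inserted amino acids, one-letter code


def in_rir1(ins: Insertion) -> bool:
    """True if the insertion sits between residues 213 and 216 and is 1-8 codons long."""
    between = RIR1_FIRST_RESIDUE <= ins.after_residue < RIR1_LAST_RESIDUE
    right_size = MIN_CODONS <= len(ins.inserted_aa) <= MAX_CODONS
    return between and right_size


omicron = Insertion(after_residue=214, inserted_aa="EPE")       # ins214EPE
elsewhere = Insertion(after_residue=500, inserted_aa="AA")      # invented example
too_long = Insertion(after_residue=214, inserted_aa="A" * 9)    # invented example

print(in_rir1(omicron))    # True
print(in_rir1(elsewhere))  # False
print(in_rir1(too_long))   # False
```

Recording the insertion by the residue it follows mirrors the ins214EPE naming; a pipeline working in nucleotide coordinates would need its own conversion step.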
Whatever the source and genesis, now that a SARS-CoV-2 VOC has arisen with an inserted sequence in a known hotspot for insertions, it is reasonable to suggest that this could happen again. Such insertions provide a previously underappreciated reservoir of possible genomic changes to a deadly human pathogen, and as such warrant careful monitoring. In fact, one of the changes from ancestral sequences that defines SARS-CoV-2 is the PRRA tetrapeptide insertion into the spike glycoprotein, which introduces a polybasic furin cleavage site (Garushyants et al., 2021). This added furin cleavage site is crucial to the pathogenicity of SARS-CoV-2 relative to other betacoronaviruses, and loss of the site attenuates pathogenesis.
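As a small illustration of how a four-residue insertion can create a new functional site, the sketch below checks a short spike fragment for the commonly cited minimal furin recognition pattern, R-X-X-R, with and without the PRRA residues. The fragment, the function name `has_furin_like_motif`, and the use of a bare regular expression are my own simplifying assumptions; real cleavage prediction depends on considerably more sequence and structural context than this pattern.

```python
import re

# Minimal furin-like pattern: Arg, any two residues, Arg (R-X-X-R).
# This is a deliberately crude stand-in for a real cleavage-site predictor.
FURIN_MINIMAL = re.compile(r"R..R")


def has_furin_like_motif(peptide: str) -> bool:
    """Return True if the peptide contains an R-X-X-R stretch."""
    return FURIN_MINIMAL.search(peptide) is not None


# Short illustrative fragment around the S1/S2 junction (assumed here for
# demonstration), and the same fragment with the PRRA residues removed.
WITH_INSERT = "QTNSPRRARSVA"
WITHOUT_INSERT = WITH_INSERT.replace("PRRA", "", 1)  # "QTNSRSVA"

print(has_furin_like_motif(WITH_INSERT))     # True
print(has_furin_like_motif(WITHOUT_INSERT))  # False
```

The point of the comparison is only that the inserted residues, together with the arginine that follows them, complete a polybasic stretch that is absent without the insertion.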
Chrisman et al., Indels in SARS-CoV-2 occur at template-switching hotspots. (2021) Biodata Min, Mar 20;14(1):20. doi: 10.1186/s13040-021-00251-0.
Garushyants et al., Template switching and duplications in SARS-CoV-2 genomes give rise to insertion variants that merit monitoring. (2021) Commun Biol, Nov 30;4(1):1343. doi: 10.1038/s42003-021-02858-9.
Gerdol et al., Emergence of a recurrent insertion in the N-terminal domain of the SARS-CoV-2 spike glycoprotein. (2022) Virus Res, Mar;310:198674. doi: 10.1016/j.virusres.2022.198674.
McCarthy et al., Recurrent deletions in the SARS-CoV-2 spike glycoprotein drive antibody escape. (2021) Science, Mar 12;371(6534):1139-1142.