Adaptive evolution of the SARS-CoV-2 virus

A critical factor in the fight to end the pandemic is being able to curb the adaptive evolution of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Adaptive evolution occurs when selection favors the fixation of genetic variants that control traits capable of increasing the infectivity of the virus. Understanding trends in the evolution of SARS-CoV-2 is critical to controlling the COVID-19 pandemic.

The idea is that, starting from an “Adam” of the SARS-CoV-2 virus that gave rise to the entire pandemic, the virus has been changing little by little. These changes, called mutations, can allow the virus to evade the host immune system to some degree and become more infectious.

Vaccines help mitigate this process by preparing the body for a possible invasion of SARS-CoV-2 viral particles. The struggle, therefore, is to limit the infectivity of the new variants of the virus that will surely continue to emerge. Because so many viral particles are circulating in the world, human beings are effectively the virus's own laboratory: if a variant arises that spreads more efficiently, it is likely to replace the others within a short time, as is in fact happening.

Viruses and their behavior in cells

By definition, viruses are macromolecular entities (large molecules) capable of replicating only inside cells. Lacking metabolic machinery of their own, their existence outside cells resembles inert matter.

Rather than being described as alive or dead, viral particles are considered active inside the cell and inactive outside it.

One feature is common to all viruses: they can be defined as infectious agents, capable of transmitting an infection. Viruses can cause slow, severe, or persistent infections, among others. All of them can cause disease, which can be passed from one individual to another: an individual carries the virus in their body and spreads it.

Within a cell, a virus takes advantage of the cell's synthesis machinery to produce its own components. The instructions carried in the genetic material of the virus are read without question by the ribosomes. Once the components have been synthesized, they are sent to the endoplasmic reticulum, which works like a packaging center, much like an Amazon fulfillment center: it assembles the different parts into new viral particles.

The SARS-CoV-2 virus

From the very beginning, thanks to the work of Chinese virologists, the RNA sequence of the original SARS-CoV-2 genome was available. It comes from the isolate Wuhan-Hu-1, whose complete sequence of 29,903 ribonucleotides of single-stranded RNA was deposited in GenBank (MN908947.3), and whose encoded amino acid sequence for glycoprotein S (spike, S) corresponds to QHD43416.1.

Glycoprotein S is one of the keys to this virus, since several vaccines are based on it. Its position in the genome is marked in Fig. 1. The figure also shows the recognition sites of several restriction enzymes. Restriction enzymes are like scissors that recognize specific nucleotide sequences and cut at them.

Figure 1. Map of the RNA genome of the SARS-CoV-2 coronavirus converted to DNA, as stored in the GenBank reference database (MN908947.3). The relative positions of the sequence encoding glycoprotein S and of various restriction enzyme sites are indicated (in blue). Tool used: SnapGene.
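The “scissors” idea can be illustrated with a minimal Python sketch that looks for the recognition sequences of restriction enzymes in a DNA string. The two enzymes used here (EcoRI, which recognizes GAATTC, and BamHI, which recognizes GGATCC) are chosen only as familiar examples and are not necessarily the ones drawn in the figure; the sequence `toy_dna` and the function name `find_sites` are made up for illustration.

```python
import re

# Recognition sequences of two well-known restriction enzymes (DNA alphabet).
# Assumption: these enzymes are illustrative, not taken from the figure.
RECOGNITION_SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC"}


def find_sites(dna, site):
    """Return the 1-based start position of every occurrence of `site` in `dna`."""
    # The lookahead (?=...) also reports occurrences that overlap each other.
    pattern = f"(?={re.escape(site)})"
    return [match.start() + 1 for match in re.finditer(pattern, dna.upper())]


# Made-up toy sequence, not part of the viral genome.
toy_dna = "ATGGAATTCCGTAGGATCCTTGAATTCA"

for enzyme, site in RECOGNITION_SITES.items():
    print(enzyme, site, find_sites(toy_dna, site))
```

The same function could be run on the full reference sequence once it has been downloaded from GenBank and converted to the DNA alphabet.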

The genome of the virus is like a rope with knots. Each knot is a nitrogenous base: A (adenine), T (thymine), C (cytosine) or G (guanine) in DNA (deoxyribonucleic acid), with U (uracil) replacing T in RNA (ribonucleic acid). Part of the sequence of the original SARS-CoV-2 virus is shown in Fig. 2. It shows position 266, which holds an ATG triplet (coding for methionine). This is a start codon; that is, it tells the cell “from here on, start linking amino acids into a chain”. The first 265 nucleotides can therefore be said to be “silent”: they are not translated into protein.

Figure 2. The ATG start codon of the SARS-CoV-2 genome is highlighted.
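As a rough sketch of the rope-and-knots picture, the Python snippet below converts an RNA string to the DNA alphabet, locates an ATG triplet, and splits the sequence into the untranslated leader and the part that follows. The sequence `toy_rna` and the function names are invented for illustration. In the real genome the start at position 266 comes from the GenBank annotation; simply taking the first ATG, as done here, is a simplification.

```python
def rna_to_dna(rna):
    """Rewrite an RNA sequence in the DNA alphabet (U becomes T)."""
    return rna.upper().replace("U", "T")


def codon_at(dna, position):
    """Return the three bases that start at a 1-based position."""
    return dna[position - 1 : position + 2]


def split_at_start(dna, start):
    """Split a sequence into (leader, rest) at a 1-based start position."""
    return dna[: start - 1], dna[start - 1 :]


# Made-up toy sequence, not part of the viral genome.
toy_rna = "GAUUACAAUGGAGAGC"
toy_dna = rna_to_dna(toy_rna)

# Simplification: take the first ATG as the start codon.
start = toy_dna.find("ATG") + 1
leader, coding = split_at_start(toy_dna, start)

print("start position:", start)
print("codon at start:", codon_at(toy_dna, start))
print("untranslated leader:", leader, "length", len(leader))
```

With a start at position 266, the same split would give a leader of 265 nucleotides, which matches the count in the text.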

Gene sequence in SARS-CoV-2

If we take a look at the gene sequence, we see that ORF1ab, the largest gene, contains overlapping open reading frames encoding the polyproteins pp1ab and pp1a. The non-structural proteins of SARS-CoV-2 are responsible for viral transcription, replication, and proteolytic processing, and for suppressing host immune responses and host gene expression. One of them, the RNA-dependent RNA polymerase, is a target of antiviral therapies.

The genes of the original Wuhan sequence are as follows; the most notable are those encoding the S and N proteins:


Accession | Start | End | Gene symbol | Strain

The Delta variant has the following genes:


Accession | Start | End | Gene symbol | Strain

A new study on the global and regional adaptive evolution of SARS-CoV-2

One study analyzed more than 300,000 high-quality genomic sequences of SARS-CoV-2 variants available as of January 2021. The results show that the ongoing evolution of SARS-CoV-2 during the pandemic is characterized primarily by purifying selection, but a small set of sites appears to evolve under positive selection.
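Purifying selection tends to remove mutations that change a protein, whereas positive selection favors some of them, so a common first step is to sort codon changes into synonymous (same amino acid) and nonsynonymous (different amino acid). The Python sketch below does this with the standard genetic code. It is only an illustration of the distinction and is not the method used in the study; the function name `classify` and the example codons are assumptions.

```python
from itertools import product

# Standard genetic code, listed in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify(ref_codon, alt_codon):
    """Label a codon change as 'synonymous' or 'nonsynonymous'."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    return "synonymous" if ref_aa == alt_aa else "nonsynonymous"


# Illustrative codon pairs (DNA alphabet), not taken from the study.
print(classify("GAT", "GAC"))  # both codons encode aspartate (D)
print(classify("GAT", "GGT"))  # aspartate (D) becomes glycine (G)
```

An excess of nonsynonymous changes at a site, relative to what chance alone would produce, is one of the signals usually read as a hint of positive selection.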

To investigate the evolution of SARS-CoV-2, the study collected all SARS-CoV-2 genomes available as of January 8, 2021, and built a global phylogenetic tree using a “divide and conquer” approach. Patterns of repeated mutations fixed along the tree were then analyzed to identify sites subject to positive selection.
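The notion of “repeated mutations along the tree” can be pictured with a toy Python sketch: given a list of (branch, mutation) events, count on how many distinct branches each mutation arose and keep those that arose more than once. This is a deliberately simplified stand-in, not the study's actual pipeline. The branch labels, the mutation labels, and the function `recurrent_mutations` are all hypothetical.

```python
from collections import defaultdict


def recurrent_mutations(events, min_branches=2):
    """Count the distinct branches on which each mutation appears.

    `events` is an iterable of (branch_id, mutation) pairs. Only mutations
    seen on at least `min_branches` distinct branches are returned.
    """
    branches_by_mutation = defaultdict(set)
    for branch_id, mutation in events:
        branches_by_mutation[mutation].add(branch_id)
    return {
        mutation: len(branches)
        for mutation, branches in branches_by_mutation.items()
        if len(branches) >= min_branches
    }


# Made-up events with placeholder labels, for illustration only.
toy_events = [
    ("b1", "mut_A"),
    ("b2", "mut_B"),
    ("b3", "mut_B"),
    ("b4", "mut_B"),
    ("b1", "mut_A"),  # duplicate record on the same branch, counted once
    ("b5", "mut_C"),
    ("b6", "mut_C"),
]

print(recurrent_mutations(toy_events))
```

A mutation that keeps arising independently on separate branches is a more plausible candidate for positive selection than one that arose once and was simply inherited.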

These sites form a network of possible epistatic interactions. Epistasis is the interaction between different genes in the expression of a given phenotypic trait, that is, when the expression of one or more genes depends on the expression of another gene. Analysis of these putative adaptive mutations identifies signatures of the evolutionary partitions of SARS-CoV-2. The dynamics of these partitions over the course of the pandemic reveal alternating periods of globalization and regional diversification.

The diversity of viruses within each geographic region has been steadily increasing throughout the pandemic, but analysis of the phylogenetic distances between pairs of regions reveals four distinct periods based on the global partitioning of the tree and the emergence of key mutations.

The initial period of rapid diversification into region-specific phylogenies, which ended in February 2020, was followed by a major extinction and global homogenization event concomitant with the spread of D614G in protein S, which ended in March 2020.
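A label such as D614G reads as “reference amino acid, position in the protein, new amino acid”. The Python sketch below parses that notation and works out which genome positions hold the corresponding codon. It assumes that the S coding sequence starts at position 21563 in MN908947.3; that value comes from the GenBank annotation rather than from this article, and the function names are hypothetical.

```python
import re

# Assumption: start of the S coding sequence in the MN908947.3 annotation.
S_GENE_START = 21563


def parse_substitution(label):
    """Split a label such as 'D614G' into (reference, position, alternative)."""
    match = re.fullmatch(r"([A-Z*])(\d+)([A-Z*])", label)
    if match is None:
        raise ValueError(f"not an amino acid substitution: {label!r}")
    ref_aa, position, alt_aa = match.groups()
    return ref_aa, int(position), alt_aa


def codon_span(gene_start, aa_position):
    """Return the 1-based genome positions of the first and last base of a codon."""
    first = gene_start + 3 * (aa_position - 1)
    return first, first + 2


ref_aa, position, alt_aa = parse_substitution("D614G")
print(ref_aa, position, alt_aa)
print(codon_span(S_GENE_START, position))
```

Under that assumption, codon 614 of S covers genome positions 23402 to 23404, so the change from aspartate (D) to glycine (G) must come from a nucleotide change within that window.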

Finally, starting in July 2020, multiple mutations began to emerge, some of which have since been shown to allow antibody evasion. Their emergence was associated with ongoing regional diversification, which could be indicative of speciation and adaptive evolution.


In this pandemic it seems that we are always behind the virus. Adaptive evolution is one of the trump cards that the virus has to dodge all the efforts we make to prevent its advance.

The adaptive evolution of the virus during a pandemic is a moving target, so inevitably any study published today may be outdated tomorrow. However, several trends emerge that multiple studies show to be general and robust. Although it is difficult to demonstrate positive selection at individual sites, that the adaptive evolution of SARS-CoV-2 involves multiple amino acid replacements seems beyond reasonable doubt.

Unsurprisingly, there are multiple positively selected sites in protein S. More surprisingly, another protein, N, also includes several sites that appear to be strongly selected. Adaptive evolution is one more problem to be overcome in order to put an end to the COVID-19 pandemic once and for all.
