The pandemic caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) (Cevik et al., 2020, Wu et al., 2020, Zhu et al., 2020) has been characterized by the emergence of variants carrying mutations in the surface spike glycoprotein (S-protein), which mediates human host cell entry and is important for immune recognition (Guruprasad, 2021, Harvey et al., 2021, van Dorp et al., 2021). The direct health implications of these variants have given the field of protein evolution a central role, and the public now understands that mutations can lead to more infectious or antibody- or vaccine-resistant variants (Kemp et al., 2021, McCarthy et al., 2021, Otto et al., 2021, van Dorp et al., 2021, van Dorp et al., 2020, Williams and Burgers, 2021). Accordingly, there is substantial interest in using protein evolution and protein modelling methods to rationalize and predict these events.
The S-protein mediates entry into human host cells by binding to the cell-surface receptor angiotensin-converting enzyme 2 (ACE2), and high binding affinity of the S-protein variants towards ACE2 is a prerequisite for infection (Fehr and Perlman, 2015, Letko et al., 2020, Wang et al., 2020). At the same time, however, the S-protein is presented to the human immune system, leading to the development of immunity in the population after infection or vaccination; this presentation is the design rationale behind many vaccines (Forni et al., 2021, Liu et al., 2020). These two binding processes can be viewed as opposed, with one favoring infection and the other limiting it. The fitness of the virus is thus linked to both binding events, as explored further in this study.
Protein evolution occurs through mutations in a folded protein structure (Bajaj and Blundell, 1984, Dasmeh et al., 2013, Liberles et al., 2012, Lobkovsky et al., 2010, Wylie and Shakhnovich, 2011), and thus it is important to provide the structural context of the mutations in order to predict and rationalize their impacts. Fortunately, the few years before the pandemic witnessed major technical breakthroughs in cryo-electron microscopy applied to solving the 3-dimensional structures of macromolecules (Blundell and Chaplin, 2021, Danev et al., 2019, Fernandez-Leiro and Scheres, 2016, Murata and Wolf, 2018). This development made it possible during the pandemic for research groups to publish hundreds of structures of the S-protein, solved for all relevant conformational states either by itself or in complex with ACE2 or a large variety of antibodies, typically at resolutions of 2–4 Å that establish the polypeptide backbone structures well (Mehra and Kepp, 2022a). This means that structure-based evolution models of the SARS-CoV-2 S-protein are in principle feasible.
The present work develops a new method that uses ensembles of these experimental structures, together with computational models that estimate binding affinities, to evaluate mutation effects in a model of virus fitness that treats ACE2 and antibody binding as a competitive binding phenomenon. We propose that the host-virion interaction can be viewed as a situation where the virion seeks to enter the cell by binding its S-protein to ACE2 before the S-protein is bound by circulating antibodies. In this model, the virus fitness becomes directly proportional to the binding affinity towards ACE2 minus the binding affinity towards a representative cocktail of antibodies, and the mutation effect on fitness (the selection coefficient) becomes proportional to the change in this difference relative to the wild-type or reference strain.
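The proportionalities stated above can be written compactly. The notation below is assumed for illustration and is not taken from the text: A denotes a binding affinity on a scale where larger values mean stronger binding, "mut" and "wt" label the mutant and the wild-type (or reference) strain, and the antibody term stands for the cocktail as a whole, without specifying how individual antibodies are combined.

```latex
% Assumed notation: A = binding affinity (larger = stronger binding),
% F = virus fitness, s = selection coefficient of a mutation.
\begin{align}
F &\propto A_{\mathrm{ACE2}} - A_{\mathrm{Ab}}, \\
s &\propto \Delta F
   = \left(A_{\mathrm{ACE2}}^{\mathrm{mut}} - A_{\mathrm{ACE2}}^{\mathrm{wt}}\right)
   - \left(A_{\mathrm{Ab}}^{\mathrm{mut}} - A_{\mathrm{Ab}}^{\mathrm{wt}}\right)
   = \Delta A_{\mathrm{ACE2}} - \Delta A_{\mathrm{Ab}}.
\end{align}
% If affinities are expressed as binding free energies, A = -\Delta G_{\mathrm{bind}},
% and \Delta\Delta G = \Delta G^{\mathrm{mut}} - \Delta G^{\mathrm{wt}}, this becomes
\begin{equation}
s \propto \Delta\Delta G_{\mathrm{Ab}} - \Delta\Delta G_{\mathrm{ACE2}}.
\end{equation}
```

Under this sign convention, a mutation is favored when it strengthens ACE2 binding (negative ΔΔG for ACE2), weakens antibody binding (positive ΔΔG for the antibodies), or both.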
One of the advantages of this “selectivity model” is that it compares differences of differences: the mutant-versus-wild-type change in binding to ACE2 is compared with the corresponding change in binding to antibodies. This protocol cancels systematic errors that are otherwise present in experimental assays and especially in computer models, arising from bias towards some amino acid mutation types (Caldararu et al., 2020, Pucci et al., 2018) and from reliance on a folded wild-type structure for estimating the mutation effects (Caldararu et al., 2021, Caldararu et al., 2020, Christensen and Kepp, 2012, Iqbal et al., 2021, Kepp, 2015, Khan and Vihinen, 2010, Louis and Abriata, 2021, Pucci et al., 2022, Pucci et al., 2018). Our results correlated well with diverse experimental binding/escape datasets for ACE2 and antibodies. Overall, our work demonstrates that simple, intuitively appealing, and computable models to estimate the fitness function of the virus are feasible, although more work is needed to extend this concept to real-world settings.
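The error cancellation in a difference of differences can be illustrated with a small numerical sketch. Everything in it is hypothetical: the function name, the ΔΔG values, the use of a plain mean over the antibody cocktail, and above all the assumption that the predictor's systematic error is an additive offset that is identical for ACE2 and antibody complexes. Cancellation is exact only under that last assumption; a bias that differs between binding partners would cancel only in part.

```python
import math
from statistics import mean


def selectivity_change(ddg_ace2, ddg_antibodies):
    """Fitness-change proxy for one mutation (hypothetical helper).

    Inputs are binding free energy changes, ddG = dG_mut - dG_wt, in kcal/mol,
    where negative values mean stronger binding. The antibody cocktail is
    summarized by a plain mean, which is an assumption of this sketch.
    """
    return -(ddg_ace2 - mean(ddg_antibodies))


# Hypothetical "true" effects of a single mutation.
true_ace2 = -0.4              # slightly stronger ACE2 binding
true_abs = [0.9, 0.3, 0.6]    # weaker binding to three antibodies

# Assumed systematic offset of the predictor for this mutation type.
bias = 0.7
pred_ace2 = true_ace2 + bias
pred_abs = [x + bias for x in true_abs]

true_s = selectivity_change(true_ace2, true_abs)
pred_s = selectivity_change(pred_ace2, pred_abs)

# Each individual prediction is off by the bias, but the difference of
# differences agrees with the true value up to floating-point rounding.
agree = math.isclose(true_s, pred_s, abs_tol=1e-9)
print(f"true proxy = {true_s:.3f}, predicted proxy = {pred_s:.3f}, agree = {agree}")
```

The individual predictions here are each wrong by the full offset, yet the selectivity proxy is unchanged, which is the property the model relies on.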