Mass spectrometry analysis of protein glycosylation and viral infectivity

The recent COVID-19 pandemic and the race to develop vaccines against the causative agent, severe acute respiratory syndrome coronavirus 2, have reminded us of the important role that protein glycosylation plays in the mechanisms of pathogen infection and host immune response. CLI caught up with Dr Sanda and Dr Campos (both at Team Scientific Service Group Biomolecular Mass Spectrometry, Max Planck Institute for Heart and Lung Research, Bad Nauheim, Germany), and Dr Girgis (Department of Bioengineering, Volgenau School of Engineering, George Mason University, Fairfax, Virginia, USA) to find out more about the challenges of analysing protein glycosylation, current techniques, and the developments still needed that may well allow the production of better vaccines and therapeutics.

Why is the study of post-translational modification of viral proteins important?

Post-translational modifications (PTMs) are covalent modifications of polypeptides that occur after they are biosynthesized. By introducing new functional moieties, such as phosphate and carbohydrate conjugates, these PTMs play a fundamental role in regulating the folding, stability, enzymatic activity, subcellular localization, and protein–protein interactions of proteins during cellular growth and differentiation. Therefore, the PTMs of viral proteins as well as the host cell proteome are determining factors in the level of virus–host interactions and the extent of the host immune response.

Glycosylation is one of the most important types of PTMs and is the focus of our studies. It involves the covalent attachment of different types of glycans to specific sites on protein structures, and it can impact protein structure, orientation, binding affinity and metabolism. However, given the structural diversity of glycans, their functions have not yet been fully explored.

Viral envelope proteins are often decorated by glycans that can account for up to half of the molecular weight of these glycoproteins. Despite the numerous types of glycosylation, N- and mucin-type O-linked glycosylation are the most widely studied in viral research. Prominent examples include the Ebola virus glycoprotein, which has a very high glycan content, the heavily glycosylated gp120 glycoprotein in HIV, and the HIV-1 glycoprotein gp160, which is glycosylated by the addition of multiple N-linked glycans.

It is strongly believed that the high levels of glycosylation may serve primarily as a protective shield against the host’s immune system. Also, as the virus hijacks the host cellular machinery, viral surface proteins may incorporate familiar host glycans, which can thereby change the ability of the host to recognize the virus and stimulate the immune response [1].

Although the innate immune system is constantly evolving a range of strategies to combat glycosylated epitopes of serious pathogens, mutations can lead to failures of the immune reactions. Alterations of viral glycoproteins can significantly impact viral infectivity and characteristics (such as the extent of protein glycosylation), which may jeopardize the efficacy of existing vaccines. Moreover, antigen glycosylation complicates the development of vaccines and antibody-based therapies.

Another prominent example that highlights the role played by protein glycosylation in the immune response against infections is the influenza virus. Host-cell dependent glycosylation of hemagglutinin (HA) and neuraminidase (NA), the surface glycoproteins of influenza viruses, has been shown to be critical during influenza infections. HA N-glycosylation affects T-cell activation and cytokine production, and thus promotes immune evasion.

Furthermore, glycans on viral entry proteins are greatly involved in the modulation of receptor binding and entry. Again, we can use influenza viruses as an example: influenza viruses attach to glycans on cellular surface glycoproteins. HA and NA interact with the terminal glycans of the host cell surface glycoproteins. The NA can cleave these glycans to gain access to the epithelial cells, playing a secondary role in helping viruses to enter host cells. On the other hand, NA can also cleave glycans from glycoproteins of the enveloped virus itself and enhance infectivity by preventing aggregation of viral particles.

In recent years, it has inevitably become important to identify common glycosylation patterns for translational applications. This has been particularly important in the study of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) spike (S) protein.

Like many viruses, SARS-CoV-2 launches its cellular invasion through its heavily glycosylated S protein. A group of researchers from the University of Georgia reported 3D models of the glycosylated SARS-CoV-2 spike protein. The results of their molecular dynamics simulation revealed that glycans shield approximately 40% of the underlying protein surface of the S glycoprotein from antibody recognition, with the notable exception of the ACE2 receptor binding domain, suggesting that a vaccine with this epitope may be effective if the virus continues to target the same host receptor [2]. The viral S protein of SARS-CoV-2 has been the primary target for vaccine production and has been used as an immunogen for vaccines to generate neutralizing antibodies.

Glycosylation is a heterogeneous process that depends on many factors, including age, underlying disease and ethnicity; therefore, the glycosylation profile may correlate with the differences in susceptibility to COVID-19 observed among individuals. While overall shielding of the underlying protein surface does not appear to be highly sensitive to glycan microheterogeneity, microheterogeneity could impact the innate immune response by altering the ability of collectins and other lectins of the immune system to effectively bind to the S glycoprotein and neutralize the virus. Also, it may impact the adaptive immune response by altering the amount of viable human leucocyte antigen (HLA) [3].

What is the best way to study viral protein glycosylation and why?

Because glycosylation plays a crucial role in various biological functions, several strategies have been employed to detect, purify and analyse it, including glycan staining and visualization, and glycan crosslinking to agarose or magnetic resin. However, these methodologies rely on antibodies and lectins that bind to glycan structures, and these are often not specific and do not cover the great diversity seen in glycans. Proteomic analysis by high-resolution mass spectrometry (MS) therefore remains the gold standard for this kind of analysis. MS analysis is employed for glycan ionization, fragmentation, and mass identification of the fragments.

The two main MS ionization techniques used for the analysis of biomolecules are electrospray ionization (ESI) and matrix-assisted laser desorption ionization (MALDI). In MALDI analysis, the analytes are ionized with the help of a matrix such as 2,5-dihydroxybenzoic acid (DHB) using a laser beam pulse and then pushed to a time-of-flight (TOF) mass analyser. In ESI analysis, the analytes in solution are aerosolized using high voltage, the droplets are desolvated, and the ions are transferred to the first quadrupole for precursor ion selection. The ESI technique is usually coupled with liquid chromatography, in which the analytes are separated on a reversed-phase silica-packed column.
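To make the precursor ion selection step more concrete, the short Python sketch below estimates the m/z values at which a glycan would be expected to appear in ESI when it carries one or more protons. It is only an illustration under stated assumptions: positive-mode protonation is assumed, the example composition (Hex5HexNAc2, a high-mannose N-glycan) is chosen for illustration and is not taken from the interview, and the function names glycan_mass and mz are hypothetical helpers rather than part of any MS software.

```python
# Monoisotopic residue masses (Da) of common monosaccharide units.
RESIDUE_MASS = {
    "Hex": 162.05282,     # hexose, e.g. mannose or galactose
    "HexNAc": 203.07937,  # N-acetylhexosamine
    "dHex": 146.05791,    # deoxyhexose, e.g. fucose
    "NeuAc": 291.09542,   # N-acetylneuraminic acid
}
WATER = 18.01056   # Da, added once for the free (released) glycan
PROTON = 1.007276  # Da


def glycan_mass(composition):
    """Neutral monoisotopic mass of a free glycan from its composition."""
    residues = sum(RESIDUE_MASS[name] * count
                   for name, count in composition.items())
    return residues + WATER


def mz(neutral_mass, charge):
    """m/z of the protonated ion [M + zH]z+ (positive mode assumed)."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON) / charge


# Illustrative composition only (assumption, not from the interview).
example = {"Hex": 5, "HexNAc": 2}
mass = glycan_mass(example)
for z in (1, 2, 3):
    print(f"z={z}: m/z = {mz(mass, z):.4f}")
```

Because every isomer of a given composition has the same mass, a calculation like this can narrow down the composition but cannot by itself distinguish isomeric structures.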

Glycan fragmentation usually occurs in collision-induced dissociation (CID) fragmentation units, where ions enter a collision cell filled with an inert gas (such as argon or nitrogen) and are subjected to high- or low-energy collisions. The use of both high-energy and low-energy CID can provide intricate structural details that are pivotal for glycan analysis. Also, higher-energy C-trap dissociation (HCD) fragmentation was adopted by Orbitrap instruments for more efficient fragmentation. Furthermore, an electron-transfer/higher-energy collision dissociation (EThcD) fragmentation scheme, which builds on electron-transfer dissociation (ETD), is applied to capture both glycan and peptide fragments in a single spectrum. Multiple ion isolation and fragmentation cycles may be needed for a comprehensive structural elucidation. The final destination of the ions is the high-resolution mass analyser, which is either a TOF mass analyser or an Orbitrap mass analyser. In TOF MS, the mass-to-charge ratio determines the time that it takes for the ions to travel through a flight tube to reach the detector. In Orbitrap mass analysers, ions enter a chamber with a spindle-like electrode that traps ions in an orbital motion around the spindle. The image current generated by the trapped ions is detected and converted to a mass spectrum.
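The dependence of flight time on mass-to-charge ratio in TOF MS can be sketched with a few lines of Python. This is an idealized model, not a description of any particular instrument: it assumes that all ions are accelerated through the same potential and then drift through a field-free tube, and it ignores the time spent in the acceleration region, initial energy spread and reflectron geometry. The accelerating voltage (20 kV) and tube length (1 m) are illustrative assumptions.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19  # coulombs
DALTON = 1.66053906660e-27           # kilograms


def flight_time(mz, accel_voltage=20_000.0, tube_length=1.0):
    """Idealized drift time in seconds for an ion of a given m/z.

    Energy balance: z*e*U = (1/2)*m*v**2, so the drift time over a
    field-free tube of length L is t = L * sqrt(m / (2*z*e*U)).
    accel_voltage (V) and tube_length (m) are assumed example values.
    """
    if mz <= 0:
        raise ValueError("m/z must be positive")
    mass_per_charge = mz * DALTON / ELEMENTARY_CHARGE  # kg per coulomb
    return tube_length * math.sqrt(mass_per_charge / (2.0 * accel_voltage))


for value in (500.0, 1000.0, 4000.0):
    print(f"m/z {value:7.1f}: {flight_time(value) * 1e6:.2f} microseconds")
```

In this model the flight time scales with the square root of m/z, so an ion with four times the m/z of another takes twice as long to reach the detector.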

MS has been successfully implemented in various studies by us and other groups to determine the type (N- or O-) and extent of glycosylation of viral proteins such as the SARS-CoV-2 S protein (see Figure 2 in Campos et al. 2022 [8]; https://analyticalsciencejournals.onlinelibrary.wiley.com/doi/10.1002/pmic.202100322). Detailed knowledge of S protein glycosylation is important not only for vaccine development, but also for understanding its role in receptor binding. Moreover, knowing the exact location of glycosylation may contribute to the development of new therapeutics.

What are the challenges of investigating protein glycosylation by MS?

There are several important challenges in studying protein glycosylation. Although many obstacles are universal in the glycoproteomic field, a few are specific to particular sample matrices, such as viral glycoproteins. Bottom-up glycoproteomic analysis (analysis of glycopeptides produced by proteolytic cleavage with endoproteases) is more difficult than the standard proteomics approach because of poor glycopeptide backbone fragmentation, differences in ionization efficiency, and glycosylation microheterogeneity and macroheterogeneity.

Microheterogeneity (multiple glycans occupying one glycosylation site) and macroheterogeneity (glycosite occupancy anywhere between 0 and 100%) spread the signal of a single site across many species and may therefore dilute the concentration of any specific glycopeptide.
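The two kinds of heterogeneity can be illustrated with a small Python sketch that summarizes one glycosite from chromatographic peak areas. All numbers and glycoform labels below are invented for illustration, and site_summary is a hypothetical helper. The calculation also assumes that all species ionize equally well, which, given the differences in ionization efficiency mentioned above, makes the result an approximation at best.

```python
def site_summary(unoccupied_area, glycoform_areas):
    """Summarize one glycosite from peak areas (arbitrary units).

    Returns (occupancy, shares):
      occupancy: fraction of the site carrying any glycan
                 (macroheterogeneity)
      shares:    fraction of the glycosylated signal contributed by
                 each glycoform (microheterogeneity)
    Assumes equal ionization efficiency for all species.
    """
    glyco_total = sum(glycoform_areas.values())
    total = unoccupied_area + glyco_total
    if total <= 0:
        raise ValueError("no signal for this site")
    occupancy = glyco_total / total
    shares = {}
    if glyco_total > 0:
        shares = {name: area / glyco_total
                  for name, area in glycoform_areas.items()}
    return occupancy, shares


# Hypothetical peak areas for a single glycosite.
areas = {
    "Hex5HexNAc2": 3.0e6,
    "Hex5HexNAc4NeuAc2": 4.0e6,
    "Hex5HexNAc4dHex1NeuAc2": 1.0e6,
}
occupancy, shares = site_summary(2.0e6, areas)
print(f"occupancy: {occupancy:.0%}")
for name, share in shares.items():
    # Fraction of the whole site signal that ends up in this glycoform.
    print(f"{name}: {share:.1%} of glycoforms, "
          f"{share * occupancy:.1%} of total site signal")
```

Even in this small example the most abundant glycoform carries well under half of the total signal for the site, which is the dilution effect described above.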

Another important challenge of glycoproteomic analysis is the resolution of isomeric structures with identical molecular weight. Structure-specific glycoproteomics is a set of techniques that allow the identification or quantification of specific glycostructures at a specific glycosite. Scientists use chromatographic techniques [4] to separate isomeric structures. Separation of ions in the gaseous phase based on ion mobility could also be used to differentiate isobaric ions [5,6]. Exoglycosidase treatment could be used to cleave a glycounit with a specific linkage [7], and limited fragmentation of the glycopeptide in the collision cell could resolve specific isomers of the outer arm structures [5]. In principle, structure-specific glycoproteomic analysis still needs a lot of development. Determination of glycosites occupied by O-glycans could also represent a challenge because, unlike N-glycosylation, O-glycosylation lacks a glycosylation motif.
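The contrast between N- and O-glycosylation can be illustrated in code. N-glycans are attached at the well-known sequon N-X-S/T, where X is any amino acid except proline, so candidate N-glycosites can be listed directly from a protein sequence, whereas no comparably simple pattern exists for O-glycosites. The Python sketch below is a minimal illustration: find_n_sequons is a hypothetical helper, the example sequence is made up, and a matching sequon marks only a potential site whose occupancy still has to be confirmed experimentally.

```python
import re

# N-glycosylation sequon: Asn, then any residue except Pro, then Ser or Thr.
# The lookahead lets overlapping sequons (e.g. "NNTS") all be reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_n_sequons(sequence):
    """Return 1-based positions of Asn residues in N-X-S/T sequons."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence.upper())]


# Made-up peptide sequence for illustration only.
peptide = "MNGTANPSNKTQ"
print(find_n_sequons(peptide))
```

In this made-up peptide, the asparagine followed by proline is skipped, while the asparagines in N-G-T and N-K-T are reported as candidate sites.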

A major challenge in glycoproteomic analysis of viral glycoproteins is low sensitivity. Most glycoproteomic analyses of the SARS-CoV-2 S glycoprotein have been performed on proteins overexpressed in HEK-293 cells. Some studies have described the analysis of the protein isolated from Vero cells, which is more similar to the real virus [8], but analysis of the glycoprotein isolated from patients infected with SARS-CoV-2 is still lacking.

Also, data analysis of O-glycosylation continues to be a major source of variation. A unified technique and reliable software may help standardize the workflow of this complex type of analysis.

How can the methodology and standardization be improved?

Despite the recent advances in the glycoproteomic field, characterization and control of glycosylation profiles during the various stages of protein production are currently hindered by gaps in our understanding of different parameters and culture conditions and by the lack of adequate analytical methodologies. In this regard, we would like to emphasize the importance of establishing a unified, optimized sample preparation protocol.

Given the inherent complexity of protein glycosylation, more efficient workflows and more creative experimentation are urgently needed. More specifically, a more focused intact glycopeptide identification may represent a pragmatic means to study glycoproteomic heterogeneity in disease diagnostics and biomarker discovery.

Furthermore, glycoengineering may open the gate to a new era of manufacturing consistent glycoproteins with enhanced reproducibility, potency and improved functionality in non-mammalian expression systems.

In addition, there is room for improvement in chemical enrichment strategies and the use of proteases with greater site specificity. Meanwhile, the use of creative MS methodologies and more advanced instrumentation may help improve the identification process. Finally, software development with more comprehensive algorithms for data analysis would also be beneficial. Ultimately, these developments may represent a cornerstone for future therapeutic products with enhanced efficacy and lower toxicity.

What do you envisage for the future of mass spectrometric analysis of viral protein glycosylation?

The ultimate goal would be to implement a standardized methodology to analyse specimens from infected people, so that one could examine the levels and the types of glycosylation of the viral proteins and monitor their changes.

This could eventually lead to a better understanding of the mechanisms of viral infection and immune evasion. More importantly, glycosylation profiling could shed light on antigen selection for vaccine production and the development of targeted therapeutics during the course of a pandemic or the various waves of infection. Furthermore, vaccine development based on a well-studied glycosylation profile may result in more efficient therapeutic regimens.

Glycans represent structural features that are not encoded in the eukaryotic genome, yet they play a crucial role in immune recognition and have a huge impact on vaccine design. Vaccine candidates are often expressed in cell lines that do not recapitulate the glycosylation pattern present on native pathogens, and so potentially do not elicit biologically relevant immune responses.

In a nutshell, as a powerful and high-throughput approach to glycosylation analysis, mass spectrometry-based methodologies can be an essential tool for antigen assessment and vaccine design. Glycosylation profiling is key to the development of viral vaccines and therapeutics, and it deserves much more attention from scientists worldwide.

The experts

Dr Diana Campos PhD, Team Scientific Service Group Biomolecular Mass Spectrometry, Max Planck Institute for Heart and Lung Research, Bad Nauheim, Germany. Email: Diana.Campos@mpi-bn.mpg.de

Dr Miloslav Sanda PhD, Associate Professor, Team Scientific Service Group Biomolecular Mass Spectrometry, Max Planck Institute for Heart and Lung Research, Bad Nauheim, Germany. Email: Miloslav.Sanda@mpi-bn.mpg.de

Dr Michael Girgis PhD, Department of Bioengineering, Volgenau School of Engineering, George Mason University, Fairfax, Virginia, USA. Email: myassagi@gmu.edu

References

1. Crispin M, Doores KJ. Targeting host-derived glycans on enveloped viruses for antibody-based vaccine design. Curr Opin Virol 2015;11:63–69 (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4827424/).
2. Grant OC, Montgomery D, Ito K, Woods RJ. Analysis of the SARS-CoV-2 spike protein glycan shield reveals implications for immune recognition. Sci Rep 2020;10(1):14991 (https://www.nature.com/articles/s41598-020-71748-7).
3. Khatri K, Klein JA, White MR et al. Integrated omics and computational glycobiology reveal structural basis for influenza A virus glycan microheterogeneity and host interactions. Mol Cell Proteomics 2016;15(6):1895–1912 (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5083086/).
4. Kozlik P, Goldman R, Sanda M. Hydrophilic interaction liquid chromatography in the separation of glycopeptides and their isomers. Anal Bioanal Chem 2018;410(20):5001–5008 (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6041177/).
5. Sanda M, Benicky J, Goldman R. Low collision energy fragmentation in structure-specific glycoproteomics analysis. Anal Chem 2020;92(12):8262–8267 (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8145010/).
6. Sanda M, Morrison L, Goldman R. N- and O-glycosylation of the SARS-CoV-2 spike protein. Anal Chem 2021;93(4):2003–2009 (https://pubs.acs.org/doi/10.1021/acs.analchem.0c03173).
7. Sanda M, Ahn J, Kozlik P, Goldman R. Analysis of site and structure specific core fucosylation in liver cirrhosis using exoglycosidase-assisted data-independent LC-MS/MS. Sci Rep 2021;11(1):23273 (https://www.nature.com/articles/s41598-021-02838-3).
8. Campos D, Girgis M, Sanda M. Site-specific glycosylation of SARS-CoV-2: big challenges in mass spectrometry analysis. Proteomics 2022;22(15–16):e2100322 (https://analyticalsciencejournals.onlinelibrary.wiley.com/doi/10.1002/pmic.202100322).