

Infectious disease

What do we know about the novel coronavirus’s 29 proteins?

These biomolecules could hold clues to why the virus is so infectious and to how to stop it

by Alla Katsnelson, special to C&EN
April 1, 2020


A transmission electron microscope image shows the crown-like shape of SARS-CoV-2.

Scientists across the globe are racing to understand the novel coronavirus, called SARS-CoV-2, and what makes it so contagious and deadly.

Several members of the coronavirus family infect humans: four cause the common cold and two—SARS-CoV and MERS-CoV—have triggered dangerous epidemics. The novel coronavirus’s closest kin is SARS-CoV, which jumped species from bats to civets to humans to cause the severe acute respiratory syndrome (SARS) epidemic of 2002–2003. That SARS outbreak infected over 8,000 people. The current coronavirus pandemic has infected more than 880,000 people and killed more than 44,000 as of April 1, according to data from Johns Hopkins University.


“From a molecular perspective, figuring out why the virus is so much more transmissible than past viruses is where we should be looking right now,” says Robert Kirchdoerfer, a structural biologist at the University of Wisconsin–Madison who studies how coronaviruses fuse with host cells.

While people infected with the early 2000s SARS virus showed severe symptoms almost right away, people infected with SARS-CoV-2 can spread the virus before they show symptoms or when they are just mildly sick. A study published in Nature reports that patients shed virus most efficiently in the first week of illness, when their symptoms are mild (2020, DOI: 10.1038/s41586-020-2196-x). The US Centers for Disease Control and Prevention now estimates that 25% of people infected with the virus show no symptoms.

“That gives it an advantage to spread,” says Melanie Ott, a virologist at the University of California, San Francisco, and a member of the Quantitative Biosciences Institute’s COVID consortium. There must be something in the virus’s biological makeup that facilitates this silent transmission and that makes its effect on its human hosts so variable. But researchers don’t yet know what it is.

The RNA genome of SARS-CoV-2 has 29,811 nucleotides, encoding 29 proteins, though one may not be expressed. Studying these different components of the virus, as well as how they interact with our cells, is already yielding some clues, but much remains to be learned, Ott says.

Coronavirus proteins
Credit: Adapted from Nature (DOI: 10.1038/s41433-020-0790-7) and bioRxiv (DOI: 10.1101/2020.03.12.988865)/C&EN/Shutterstock

SARS-CoV-2 has four structural proteins (top): the E and M proteins, which form the viral envelope; the N protein (detail not shown), which binds to the virus’s RNA genome; and the S protein, which binds to human receptors. The viral genome consists of more than 29,000 bases and encodes 29 proteins (bottom). The nonstructural proteins get expressed as two long polypeptides, the longer of which gets chopped up by the virus’s main protease. This group of proteins includes the main protease (Nsp5) and RNA polymerase (Nsp12).

The spike

Coronaviruses are named for the crown of protein spikes covering their outer membrane surface. Early work on the novel coronavirus has focused on these spike proteins—also called S proteins—because they are the keys that the virus uses to enter host cells. In both SARS-CoV and SARS-CoV-2, the S protein binds to a receptor called angiotensin converting enzyme 2 (ACE2) to hack its way into host cells.

At the amino acid level, the spike proteins on SARS-CoV-2 are about 80% identical to those on SARS-CoV. “The residues which interact with ACE2 are conserved, compared to [the first] SARS, but the residues in between are different, and there are also some insertions,” says Rolf Hilgenfeld, a structural biologist who studies coronaviruses at the University of Lübeck.

Studies so far suggest that the new virus’s spike proteins bind to ACE2 significantly more strongly than those of SARS-CoV. “That is probably one of the reasons it spreads more easily and is more infectious,” Hilgenfeld says.


A non-peer-reviewed preprint posted on medRxiv on March 27 reports that ACE2 is expressed especially strongly in the lungs of people with lung diseases (2020, DOI: 10.1101/2020.03.21.20040261). That observation may partly explain how these underlying conditions make people more susceptible to infection by the novel coronavirus.

Stronger binding with ACE2 isn’t the only clue to the virus’s power. S proteins are made up of two segments, S1 and S2, that must be cleaved at two sites to expose a peptide that initiates fusion with ACE2 on a host cell. A mutation in the SARS-CoV-2 S protein allows an enzyme called furin, which is made by many types of human cells, to do the first cut. This ability to get human enzymes to do its work primes the virus for fusing with ACE2—providing another possible explanation for this virus’s infectiousness, says Carolyn Machamer, a cell biologist at Johns Hopkins University who studies the basic biology of coronaviruses. “It can get that first clip before it even comes into contact with a receptor.”

Researchers are working to identify the best way to interfere with the S protein’s interaction with ACE2. ACE2 is present in many organs throughout the body and interfering with it may have side effects, so researchers want to avoid hitting the receptor and are instead developing antibodies or peptides that bind and disable specific segments of the S protein.

The other 28

Of the 29 SARS-CoV-2 proteins, four make up the virus’s actual structure, including the S protein. One group of the other 25 coronavirus proteins regulates how the virus assembles copies of itself and how it sneaks past the host immune system. These so-called nonstructural proteins are expressed as two huge polyproteins that are then cleaved into 16 smaller proteins. An enzyme called the main protease, which performs 11 of those cleavages, is also a highly promising drug target. Hilgenfeld and his colleagues recently reported the structure of the main protease and identified an inhibitor that can block it.

Many of the virus’s nonstructural proteins are still poorly understood. “I think we are just going to have to go brute force and probably piece through the genome and study a lot of these proteins individually,” says Anthony Fehr, a biologist at the University of Kansas. These studies will help scientists both understand the underlying biochemistry that gives SARS-CoV-2 its nasty kick and identify other ways to fight the virus.

Fehr’s lab works on a nonstructural protein called Nsp3, a component of which blocks the host’s efforts to fight off the virus. In particular, this protein shuts down host enzymes called PARPs, which prevent viruses from replicating, and interferes with cellular calls for the release of virus-fighting immune proteins called interferons. This protein could be a drug target, he says. “The virus would be dead in the water if it didn’t have a way to counter the interferon response.”

The third group of proteins in the novel coronavirus consists of accessory proteins. Coronaviruses don’t need these proteins to replicate in a test tube, but they do need the molecules to counteract the host’s innate immune system. However, accessory proteins are the least well understood. “I’m interested in how the differences in these small proteins affect pathogenesis,” says Susan Weiss, a microbiologist who studies coronaviruses at the University of Pennsylvania. Her lab has studied accessory proteins in other coronaviruses and plans to determine how mutations in these genes affect SARS-CoV-2’s ability to counteract the host immune response and replicate.
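
The protein counts quoted across this section can be tallied in a few lines. The sketch below is only an illustration: the totals for all proteins (29), structural proteins (4), and nonstructural proteins (16) come from the article, while the accessory count is not stated anywhere in the text and is derived here as the remainder. The function and variable names are hypothetical.

```python
# Counts stated in the article. The accessory figure is NOT quoted in the
# text; it is derived below as an assumption (total minus the other groups).
TOTAL_PROTEINS = 29
STRUCTURAL = ["S", "E", "M", "N"]   # spike, envelope, membrane, nucleocapsid
NONSTRUCTURAL_COUNT = 16            # cleaved from the two polyproteins


def accessory_count(total: int, structural: int, nonstructural: int) -> int:
    """Return the number of proteins left over for the accessory group."""
    remainder = total - structural - nonstructural
    if remainder < 0:
        raise ValueError("group counts exceed the stated total")
    return remainder


groups = {
    "structural": len(STRUCTURAL),
    "nonstructural": NONSTRUCTURAL_COUNT,
    "accessory": accessory_count(
        TOTAL_PROTEINS, len(STRUCTURAL), NONSTRUCTURAL_COUNT
    ),
}

for name, count in groups.items():
    print(f"{name}: {count}")
```

Under these assumptions the accessory group works out to nine proteins, one of which, per the article, may not be expressed.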

Structural biology continues to be a key method for studying the new virus’s proteins. The differences between SARS-CoV-2 and the earlier SARS-CoV “must have their basis in protein structure, but it’s quite difficult to correlate the 3-D structures—if you even have the 3-D structures—with differences in function,” Hilgenfeld says.

However, researchers have other tools to investigate SARS-CoV-2, says Susan Daniel, a biomolecular engineer at Cornell University who studies how coronaviruses enter cells. For example, electron spin resonance can provide information about how peptides involved in viral fusion get buried in the membranes of host cells; circular dichroism spectroscopy can provide insight about how alpha-helical structures within viral proteins change under specific conditions; isothermal titration calorimetry can hint at how viral peptides interact with various ions, which can give clues about their conformations; and nuclear magnetic resonance can provide important structural information not found in crystal structures.

Meanwhile, other labs are rushing to explore how differences in immune system responses might affect how severely ill an infected person may get. For example, Ott’s lab is beginning to infect organoids, which are models of lungs and other organs made of complex cell cultures, with the virus to watch how infections proceed. “The scientific community has come together in an unprecedented way,” Ott says. “Everybody basically has shifted their attention to the virus and is looking for ways to contribute meaningfully.”


This story was updated on April 13, 2020, to correct a statistic about the 2002–03 SARS epidemic. That virus didn't kill more than 8,000 people worldwide; it infected that many. It killed over 700.




Rick (April 2, 2020 10:44 PM)
I wonder since the Coronavirus is made up of protein and fatty material. Could statins or cholesterol medication turn into a vape or if there was away to smoke it or pass it through the lungs ,could it break down the virus and could it slow it down? I was doing research on it but if it did what would happen to the good cells ? I sent this information to a lot of people but no one will give me a response back i just wish i had someone to talk to about it and learn more about it! Hope to hear from you soon and please pass this message to someone that has experienced in this situation ? My name is Ricky O. from North Carolina. Im 42 years old and please write me back. Thanks for reading
Sahil (April 8, 2020 1:47 PM)
Hello Rick,
All life is made of proteins and lipids organized in a synchrony that generates function. "Statins and cholesterol medication" as you put it, works only on a particular type of a special lipid in the cells - cholesterol. If however, you read the article carefully, you will realize that cholesterol is not the most crucial substrate for successful corona-viral attack. There are more straightforward hits to impede viral entry. Moreover, since cholesterol is ubiquitous, it is always better to target something specific to avoid side effects. Keep reading and stay safe!
Judy milett (May 4, 2020 10:54 PM)
Keep researching. Don’t give up. After all, who would have ever thought penicillin was made from green mold on bread?
Dr. Surajit Bhattacharjee, Ph.D (April 7, 2020 2:18 AM)
Respected Sir,
I have already read your information on SARS-COVID-19 and I want to know from you that 'which type of protein(s) are produced by COVID-19 in human body'? I may help you.

With regards
Surajit Bhattacharjee, Ph.D
Human Physiology
Agartala, Tripura-799006
Rh negative factor (April 19, 2020 3:43 PM)
I’m no scientist-but I do know my O negative blood type. As I read thru research of coronavirus-I keep seeing that it attacks protein on the red blood cells. Being o negative-I have a/b antibodies and no protein on my cells. As I read the %of the population between Positive/negative RH factors-I can’t help but see similar stats in the coronavirus. Ik blood types are being study but is the RH factor being looked at? Also-I’ve read all tests are being done on Rhesus monkeys-peaking my curiosity-considering they make up for the positive RH factor....if coronavirus needs the protein to invade/infect-then can the O negative blood types be immune?? Can we be the answer in fighting this? I would love to know who is researching the RH factor in coronavirus pls/Thxu. Stay safe
David J Shaw (May 27, 2020 4:41 AM)
The blood type o weather it's neg or pos was highly resistant to Sars cov 1 and showing the same for Sars cov1 cuz it was highly resistant doesn't mean you couldn't get infected with the blood type 0 POS or neg they are anti A anti B so we don't carry the sugars on the red blood cell from what study's shown for Sars cov1 0 blood type seen a very mild cases if they were to be infected but over all it's showing they are highly resistant to both Sars cov 1 Sars cov2 hopefully that helps u out
Judy milett (May 4, 2020 10:50 PM)
I do not understand all the biological terms but I am going to compare waxed apples to this little bugger. First of all the apples in the supermarket are covered with a lot of wax. To melt the wax pour hot water on it. But yet there is wax residue. Now then let’s use lemon juice poured on the apple and more hot water to dissolve the wax.
Can it be possible that maybe we should have a tablespoon of lemon juice followed with a cup of hot tea. In hopes it will melt the coating off the little bugger that may be in your throat.
Hopefully to stop it before getting into the lungs. I laugh at myself about this idea because it is far-fetched. Number 1, we would have to drink quite often and also brush our teeth, because of the acid from the lemon tea. Don’t be mean to me. Keep your laughter to yourself. I will just sit back and drink lemon tea.
sarbini wono (May 28, 2020 5:37 AM)
whats is deferent protein structure in covid between protein structure in ebola, in other journal say "The Natural Product Eugenol Is an Inhibitor of the Ebola Virus In Vitro. Article in Pharmaceutical Research 36(7) · July 2019 with 186 Reads 
DOI: 10.1007/s11095-019-2629-0. so mars , ebola, convict in evolution process. Can eugenol can do inhibitor for convid ?
Christian (June 28, 2020 10:25 AM)
Everybody here in Europe is talking about the "new" Coronavirus. But what i read so far (e. g. from Professor Nikolai Petrovsky from Flinders University) the "new" Virus has more bindings to human cells than to animal cells. So this Virus cannot be new or just got over from an animal to a human. Is that correct or am I wrong?
Ken Richardson (July 16, 2020 12:28 PM)
Covid 19 will be cornered as soon as we focus on preventions, and people get the message they must read and learn about all the easy ways to kill this protein before is has time to incubate in the head cavities before it make its way to the lungs. I have been making wine for 40 years. I must kill all bad bacterias in the fruit before the chemistry of converting into wine. The use of many acids does the job well. The alcohol in the latter stages will help the surviving bacteria from rapid mutation. If we could put out a list of safe ways to dissolve the Covids outer membrane before it gets into the lungs, it would slow the spread and save lives. I spray the inside and outside of my mask every 15 minutes. I inhale small amount of known membranes dissolvers when I’m in risky areas. I carry a little spray bottle that was designed to clean my glasses. Just learn what you feel is safe to use; then use it often when in public. Prevention is a key for turning the curve. (Ken R)
Judah L. Rosner (August 12, 2020 1:39 PM)
3 basic questions: 1) Has anyone calculated or measured the total molecular weight of the virus? Are the number of spikes & other structural proteins present in fixed amounts or do they vary? Phages like T4 & lambda have fixed numbers.
2) Are there any estimates of the burst size for the virus?
3) In a moderate symptomatic infection, how many cells are infected?
Palma Valverde (September 23, 2020 6:31 PM)
Why is it that "proteins" that, if I understood correctly, make up the Corona spikes are able to penetrate the human cell? How does eating these protein affect the ability of the Virus to use its protein spikes to penetrate human cells?
