Biological Chemistry

Divining The Spliceosome

The missing link between humanity's small genome and huge proteome reveals some secrets

by Sarah Everts
May 4, 2009 | A version of this story appeared in Volume 87, Issue 18

Recognition
Credit: © 2009 Nature Publishing Group
RNA-protein hybrid module of the spliceosome bound to its target, an mRNA intron splice site (pink, top left).

WHEN WE HUMANS got a first glimpse of our genome, we had good reason to question our biological complexity. Many scientists predicted we would possess some 100,000-plus genes, but sequencers finally capped the human genome at 20,000–25,000 genes. The size of the human genome is cause for a little existential malaise, considering that the lowly worm Caenorhabditis elegans also has about 20,000 genes.

Slicing And Dicing
Credit: Mol. Cell
Electron microscopy reveals a rough-grain picture of the spliceosome (three views shown) in one of its catalytic conformations (Mol. Cell 2006, 24, 267).

To further complicate matters, humanity's paltry gene count doesn't add up to the diversity of proteins we produce in all our tissues throughout our entire lives—an estimated 150,000 unique proteins.

It turns out that the missing link between our small genome and huge proteome is the spliceosome, a massive protein-RNA hybrid machine that lurks in the nucleus of every human cell. Researchers are only beginning to understand at the molecular level how this inordinately complicated catalyst slices and dices the messenger RNA transcribed from DNA into myriad different forms before translation into proteins by the ribosome. During the past year, several groups have reported important structures of components of the spliceosome. These milestones give hope that a more complete understanding of the machine may soon be at hand.

The simplistic adage that one gene codes for only one protein is profoundly outdated. The first hint that the path between DNA and proteins is extremely sophisticated came in the late 1970s, when geneticists discovered that the protein-coding sections of DNA called exons were not continuous. Instead, the genome is peppered with seemingly extraneous DNA segments called introns. Now it is clear that DNA containing both exons and introns is transcribed into mRNA and that the spliceosome is responsible for snipping out introns and then joining the remaining exons together before mRNA is translated into protein.

It's the spliceosome's job to tidy up mRNA, but this machine is responsible for more than just housekeeping. It's also the workhorse for alternative splicing, the process by which unique combinations of exons of a given gene can be mixed and matched into a wealth of different mRNAs and, consequently, proteins, giving rise to the complexity of our proteome and to us as organisms.
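
As a rough illustration of the two roles described above, the following Python sketch treats a pre-mRNA as a list of exon and intron segments. The sequences, the `PRE_MRNA` layout, and the helper names `splice` and `isoforms` are all hypothetical, and the rule that the first and last exons are always kept is a simplifying assumption, not a statement about how the cell regulates splicing.

```python
from itertools import combinations

# Hypothetical pre-mRNA as (kind, sequence) segments.
# The sequences are invented for illustration, not taken from a real gene.
PRE_MRNA = [
    ("exon", "AUGGCU"),
    ("intron", "GUAAGUCAG"),
    ("exon", "CCAGAU"),
    ("intron", "GUGAGUUAG"),
    ("exon", "UUCUAA"),
]


def splice(segments, skip=()):
    """Drop every intron and join the exons in order.

    `skip` holds 0-based exon indices to leave out, which models
    one simple kind of alternative splicing (exon skipping).
    """
    exons = [seq for kind, seq in segments if kind == "exon"]
    return "".join(seq for i, seq in enumerate(exons) if i not in skip)


def isoforms(segments):
    """List the mRNAs obtained by skipping any subset of internal exons.

    Assumption: the first and last exons are always retained.
    """
    n_exons = sum(1 for kind, _ in segments if kind == "exon")
    internal = range(1, n_exons - 1)
    result = []
    for k in range(len(internal) + 1):
        for skipped in combinations(internal, k):
            result.append(splice(segments, skip=skipped))
    return result


if __name__ == "__main__":
    for mrna in isoforms(PRE_MRNA):
        print(mrna)
```

Under this toy rule a gene with n internal exons yields 2^n possible mRNAs, which gives a feel for how a modest number of genes could give rise to a much larger set of proteins.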

Most eukaryotes—organisms that compartmentalize their DNA into a nucleus and include everything from yeasts to humans—possess spliceosomes. The mRNA from 90% of humans' genes gets alternatively spliced, says Reinhard Lührmann, a director at the Max Planck Institute for Biophysical Chemistry, in Göttingen, Germany. Alternative splicing outfits a cell with a proteome that suits its role in a particular organ or developmental stage while economizing the amount of DNA required.

Think about it this way: If a gene contains exons that code for all the parts of a pantsuit—from the underwear to the topcoat—then the spliceosome mixes and matches component garments to provide an outfit appropriate for a particular occasion. Perhaps it would emphasize just the shirt and pants for a casual meeting, but would include the whole costume for a fancy dinner.

On average, every human mRNA has about three alternative splicing sites, but the mRNA from some human genes gets spliced into thousands of different arrangements, Lührmann says. If that sounds like a lot, consider that the fruit fly has mRNA from one gene that gets alternatively spliced some 33,000 distinct ways.

Because the spliceosome is responsible for bequeathing humanity all its complexity, imperfect splicing is also at the root of many diseases, from genetic disorders such as spinal muscular atrophy to cancer and even certain forms of blindness. This is why "pharma has gotten interested in the spliceosome as a new drug target," says Melissa J. Moore, a biochemist at the University of Massachusetts Medical School. Some researchers have also proposed that the spliceosome may be at the root of long-term memory and cognition because an above-average amount of splicing happens in brain cells, especially for mRNAs that are translated into neurotransmission proteins.

ALTHOUGH THE spliceosome is as important to protein production as other macromolecular machines such as RNA polymerase II, which produces intron- and exon-containing mRNA from DNA, and the ribosome, which builds proteins from mRNA, "our understanding of the spliceosome's inner workings and its detailed structure is still in its infancy," Moore says.

What researchers do know is that the human spliceosome is 3 megadaltons in size and involves five RNAs (U1, U2, U4, U5, and U6) and more than 150 proteins. Because many of these proteins are extremely large or have extensive disordered sections, they are difficult to crystallize for X-ray diffraction structural studies.

One reason that spliceosome research trails that of ribosomes and RNA polymerase II is that cells don't produce a large amount of the biomachine, so it's hard to isolate enough of it to work with, says Melissa S. Jurica, a cell biologist at the University of California, Santa Cruz. The spliceosome may be essential, but it is present in scarce quantities: less than 1% of the dry weight of a cell. Compare this with the similarly sized ribosome, which is so abundant that it is about 25% of a cell's dry weight, Moore adds. To make things trickier, bacteria don't possess a spliceosome, so researchers must always work with eukaryotic cells if they want to extract samples.

These are minor headaches compared with the spliceosome's most frustrating feature: The megamachine doesn't really have a core structure, instead opting to undergo "dramatic structural rearrangements" during the course of catalysis, Lührmann says.

"Let's compare again to the ribosome," Lührmann instructs. "The ribosome is preformed from two RNAs and a handful of proteins. Although the ribosome adopts several conformations during the catalytic process, its overall structure doesn't change much." By contrast, the spliceosome involves incessantly changing rearrangements of its five RNAs and 150 proteins.

During splicing, different teams of RNA and protein form complexes on the introns, do their specific job, and fall off. Other pieces of the spliceosome then join together to carry out other tasks, and so on. "There is so much coming and going and rearranging of parts that it is hard to get a snapshot of what is happening," Lührmann explains.

Catalytic Helper
Credit: Courtesy of Andrew MacMillan
A domain of the protein PRP8, which is present in the catalytic module of the spliceosome.

Yet "this most complicated of machines only does pretty simple chemistry," Moore says. In effect, splicing catalysis involves two straightforward transesterification reactions to cut each end of the intron, formation of one phosphodiester bond between the adjoining exons, and that's it, she adds.

"It might seem strange that multicellular organisms have spliceosomes that are so complex," Jurica says. "The explanation for this complexity is regulation." The spliceosome must be able to rapidly respond to the changing needs of a cell by shifting the way it configures mRNA prior to translation.

The "elaborate, byzantine machinery exists because the spliceosome has to do its job exactly right at exactly the right time," Moore says. For example, the decision to splice out one intron or leave in another one is likely made by the team of proteins that assembles on each intron during catalysis. Certain combinations of specific proteins are required to orchestrate a specific splice, Moore explains. "It's like splicing by committee," she adds.

AN EXAMPLE of the need for such flexibility is the so-called consensus sequence, a series of seven nucleotides that signals the beginning of an intron splice site. It turns out that human consensus sequences are highly variable. Only two of the seven nucleotides are strictly conserved, and the sequences' base pairing with RNA components of the spliceosome can be pretty weak. So the spliceosome relies on both RNA and a team of helper proteins to recognize an intron's splice site, Jurica explains.

Specifically, the spliceosome's U1 RNA, in collaboration with up to 10 proteins, recognizes the splice site. This RNA-protein team converges on an intron to recruit the spliceosome's remaining catalytic machinery to the site. The initial catalytic machinery is made of four other RNA components—U2, U4, U5, and U6—plus 75 proteins.

Earlier this year, Kiyoshi Nagai and colleagues at the MRC Laboratory of Molecular Biology, in Cambridge, England, solved the structure of the spliceosome's U1 recognition complex to 5.5-Å resolution (Nature 2009, 458, 475). It was the first X-ray crystal structure of any spliceosome complex to be elucidated, a task that took seven years to accomplish, Nagai tells C&EN.

"Getting the U1 complex was a huge achievement because it is responsible for identifying the splice site," Moore says. But she points out that it's just a first step on a difficult road toward understanding how catalysis works at the molecular level. "The solved U1 complex has one RNA and seven proteins, while the catalytic unit has 75 proteins plus four RNAs associated with it. Getting the U1 complex was really difficult, and getting the catalytic complex will be that much harder," she adds.

Although the picture of the catalytic process is "still rather murky," extensive biochemical and genetic experiments have narrowed things down, Jurica says.

At the moment, a prevailing view in the field is that the spliceosome's catalytic machinery pulls together the two ends of an intron early in the splicing process, making a big bow out of the intron, Jurica says. "This is quite amazing," she adds, because some introns are extremely long—up to tens of thousands of base pairs.

Catalysis begins when the 2′-hydroxyl group of an adenosine nucleotide within the intron, known as the branch point, nucleophilically attacks the phosphate at the 5′ end of the same intron in an SN2 transesterification reaction.

Next, the hydroxyl at the newly freed 3′ end of the upstream exon performs a similar nucleophilic attack on the intron's 3′ splice site, cutting out the intron. This same reaction joins the two exons surrounding the intron through a phosphodiester bond, producing a spliced mRNA that is transported from the nucleus to the cytosol, where it is translated into a protein by the ribosome.
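
The two steps above can be mimicked, very loosely, as string rearrangements. This Python sketch is a toy model under stated assumptions: the function name `two_step_splice`, the example sequences, and the choice of branch-point position are all hypothetical, and the code tracks only which pieces end up joined, not any chemistry or the proteins and RNAs that guide it.

```python
def two_step_splice(exon1, intron, exon2, branch_index):
    """Model the two transesterification steps as bookkeeping on strings.

    `branch_index` is the 0-based position, within the intron, of the
    adenosine that acts as the nucleophile in the first step.
    """
    if intron[branch_index] != "A":
        raise ValueError("the branch point must be an adenosine (A)")

    # Step 1: the branch-point A attacks the 5' splice site.
    # Exon 1 is cut free, and the intron's 5' end is now bonded to the
    # branch A, closing part of the intron into a loop (a lariat).
    free_exon1 = exon1
    loop = intron[: branch_index + 1]   # 5' end up to and including the branch A
    tail = intron[branch_index + 1:]    # remainder, down to the 3' splice site

    # Step 2: the free 3' end of exon 1 attacks the 3' splice site.
    # The exons are joined and the intron is released.
    mrna = free_exon1 + exon2
    released_intron = {"loop": loop, "tail": tail}
    return mrna, released_intron


if __name__ == "__main__":
    # Invented sequences; position 10 of this intron is an A.
    mrna, intron = two_step_splice("AUGGCU", "GUAAGUACUAACCAG", "CCAGAU", 10)
    print(mrna)
    print(intron)
```

The point of the sketch is only that two cuts and one join suffice to turn exon-intron-exon into exon-exon plus a released intron.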

SEVERAL TEAMS of electron microscopists have caught the spliceosome in action, although their images are far from having atomic resolution. Others are developing multicolor fluorescent tagging techniques to observe the hustle and bustle of spliceosome components in a live cell. Many researchers are also trying to crystallize the spliceosome machinery in its catalytic conformation. But all these structural studies are impeded by the fact that only a few small molecules "can throw a wrench in the machine" to catch the spliceosome in a catalytic conformation that can be further studied, Jurica says.

Since 2007, a handful of small molecules that do seize the spliceosome have been reported, including several anticancer drug leads—spliceostatin A, isoginkgetin, and members of the pladienolide family. Suberoylanilide hydroxamic acid and several other histone deacetylase inhibitors also block the spliceosome. "I'm sure there are many more small-molecule spliceosome inhibitors in chemical libraries around the world," Jurica says. The libraries just haven't been screened for spliceosome activity, she adds. To help hasten the process, Jurica and others are developing high-throughput assays that would screen existing libraries for spliceosome blockers.

Structural biologists have also taken a stab at crystallizing the 150 proteins known to be involved in the machinery. One protein in particular, called PRP8, has long been a major target because experiments have consistently placed it at the heart of the catalytic complex. To boot, the gene that codes for this protein is highly conserved across the organisms that use the spliceosome.

The whopping 2,335-amino acid protein was long known in the field as a "career killer," Jurica says, because so many students and postdocs failed to crystallize even a small corner domain of it. A breakthrough came last year when three groups reported the X-ray crystal structure of a small domain at the protein's C-terminus. The domain is only about 200 amino acids long—just 10% of the protein—but its structure surprised the field.

Up until the PRP8 domain structure, most people assumed that the spliceosome's RNAs were its catalytic components and that the proteins served only to help with splice-site recognition and regulation. The logic was straightforward: Spliceosome chemistry is so easy that some mRNAs can catalyze their own reconfiguration. In fact, bacteria rely entirely on introns that double as catalytic RNAs or ribozymes to splice introns out of bacterial mRNAs. Also, evidence exists that one of the spliceosome's RNA components is involved in splicing catalysis.

But the structure of the PRP8 domain looks incredibly similar to that of a protein enzyme known to chop up RNA, explains Andrew M. MacMillan, a biochemist at the University of Alberta, in Edmonton. Why would a protein with such a structure not take part in catalysis in some way? Is the spliceosome a ribozyme, a protein enzyme, or a hybrid thereof? The subject is a matter of intense debate in the field.

"There is no reason to doubt the primary role of the spliceosomal RNAs in catalysis," MacMillan says. However, "there have been hints that this picture is incomplete and that proteins may be much more intimately associated with splicing catalysis than has been imagined."

Others remain convinced that spliceosomes are ribozymes, pure and simple. "Our work has shown that the spliceosomal RNAs by themselves can perform the chemistry," notes Saba Valadkhan, an RNA biochemist at Case Western Reserve University. Valadkhan is currently working to develop a minimal RNA-based model of the spliceosome's catalytic site.

In a recent commentary discussing the possible mechanisms by which the spliceosome performs catalysis, John Abelson, a biochemist at UC San Francisco, wrote, "It has been assumed that the spliceosome is a ribozyme," but the proposed effects of PRP8 on catalysis "bring this into question" (Nat. Struct. Mol. Biol. 2008, 15, 1235).

"But we really won't understand anything until we have a structure of the [catalytically] active spliceosome," Abelson adds. "I hope that happens in my lifetime." It's a goal many researchers are trying to reach and even more are hoping for.
