Watching fruit flies buzz around the ripe bananas in your kitchen, you might think it’s a tad ludicrous, mortifying even, that humans have a similar number of genes—about 23,000—as the lowly insects. We are certainly more complex than Drosophila melanogaster, so what gives?
The answer lies in the spliceosome, a cellular machine that, at first glance, seems to do some pretty straightforward pruning of messenger RNA (mRNA). As the cell transcribes your DNA’s nucleic acid sequence into RNA, the spliceosome lands on the newly forming mRNA strand, where it chops out unnecessary pieces, called introns, and joins together the leftover, essential sequences, called exons. The edited mRNA is then exported to the cell’s cytoplasm, where it gets translated into protein.
Most strands of unspliced mRNA, otherwise known as pre-mRNA, have about a dozen introns that can be removed. Yet the spliceosome doesn’t always link together the remaining exons in a straightforward manner. Sometimes the spliceosome intentionally skips an exon, or it reorders the exons, or it unexpectedly leaves an intron in the mix. On average, this variable editing process produces about 10 different proteins for every gene that we have. “Alternative splicing allows us to make the most out of every gene,” says Joan Steitz at Yale University School of Medicine. “Splicing is the reason we can have the same number of genes as the fruit fly Drosophila and yet be more complicated.”
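As a rough illustration of the editing described above, the toy Python sketch below treats a pre-mRNA as an ordered list of labeled exon and intron segments. The segment names, the sequences, and the functions `splice` and `isoforms_by_exon_skipping` are all invented for illustration. Real splice-site choice depends on sequence signals and regulatory proteins that this sketch ignores.

```python
from itertools import combinations

# Hypothetical pre-mRNA: (kind, name, sequence) segments in transcript order.
PRE_MRNA = [
    ("exon", "E1", "AUGGCU"),
    ("intron", "I1", "GUAAGUUUCAG"),
    ("exon", "E2", "GAUCCA"),
    ("intron", "I2", "GUGAGUACUAG"),
    ("exon", "E3", "UUCUAA"),
]

def splice(pre_mrna, skip=(), retain=()):
    """Join exons in order and drop introns.

    skip:   names of exons to leave out (exon skipping).
    retain: names of introns to keep (intron retention).
    """
    kept = []
    for kind, name, seq in pre_mrna:
        if kind == "exon" and name not in skip:
            kept.append(seq)
        elif kind == "intron" and name in retain:
            kept.append(seq)
    return "".join(kept)

def isoforms_by_exon_skipping(pre_mrna):
    """Enumerate the mRNAs made by skipping any subset of internal exons."""
    exons = [name for kind, name, _ in pre_mrna if kind == "exon"]
    internal = exons[1:-1]  # assumption: first and last exons are always kept
    results = {}
    for r in range(len(internal) + 1):
        for skipped in combinations(internal, r):
            results[skipped] = splice(pre_mrna, skip=skipped)
    return results

constitutive = splice(PRE_MRNA)                 # all exons, no introns
skipped_e2 = splice(PRE_MRNA, skip=("E2",))     # exon E2 left out
retained_i1 = splice(PRE_MRNA, retain=("I1",))  # intron I1 left in
```

In this simplified model, a transcript with n internal exons can yield up to 2**n different mRNAs from exon skipping alone, which hints at how a modest number of genes can give rise to a much larger set of proteins.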
This splice ‘n’ dice machine gives us our complexity, but it’s also exceedingly complex itself. So complex, in fact, that it’s taken decades and many twists and turns for scientists to figure out how it works. Because the spliceosome is so sophisticated, small hiccups in its operation can lead to biological malfunction and, ultimately, to disease. Discovery of the tiny machine has been a boon to drug developers, who are already working on molecules that target the spliceosome. They hope such molecules will treat the myriad diseases linked to splicing malfunction, including many cancers, some forms of blindness, and 10% of genetic diseases, such as spinal muscular atrophy and certain types of dwarfism.
What researchers now know is that cells assemble the spliceosome from an enormous cast of protein and RNA characters. These players work in unison, carrying out a gymnastics routine worthy of the Super Bowl halftime show. Five protein-RNA complexes, called ribonucleoproteins, and some 200 proteins come and go during different stages of human splicing. This machinery forms temporary assemblies that prep and then edit pre-mRNA, converting it into mRNA that can be read by the ribosome, another enormous ribonucleoprotein engine responsible for turning mRNA into protein.
Although it’s only about half the size of the ribosome, the spliceosome—with its ever-changing parts and rearrangements—is a much more dynamic machine, says Reinhard Lührmann of the Max Planck Institute for Biophysical Chemistry, in Göttingen, Germany. This has made the spliceosome one of structural biology’s most desirable targets and one of its most challenging foes: Many in the field say that the ribosome was a comparatively easy structure to solve, and even that was a feat so grand it earned the structural biologists who accomplished it a Nobel Prize.
So it was that scientists gasped in collective shock when a team of researchers—newcomers to the spliceosome field—published the first near-atomic-resolution structure of the splicing machinery in August. The scientists, led by Yigong Shi of Tsinghua University, in China, also published an accompanying paper on spliceosome function for good measure (Science 2015, DOI: 10.1126/science.aac7629 and 10.1126/science.aac8159). “It was a total bombshell,” Yale’s Steitz says. “I never thought we’d see a complete structure this soon.”
The structure, obtained with cryoelectron microscopy, was a milestone in the field. In addition to confirming nearly four decades of biochemical and genetic research on the spliceosome, the snapshot aids those seeking to understand the intricacies of an essential process at the root of complex, multicellular life. Two months after the structures were published, “I’m still carrying the papers around in my briefcase,” says Massachusetts Institute of Technology’s Phillip A. Sharp, who in 1993 won the Nobel Prize in Physiology or Medicine with Richard J. Roberts for discovering splicing in 1977.
The 1970s were “a very weird time in molecular biology,” Steitz says. After the structure of DNA was solved in 1953, researchers spent two decades using bacterial cells to work out biology’s central dogma—that DNA is transcribed into RNA and that RNA is translated into proteins. But bacteria do not splice. “Most people thought everything would be the same for higher organisms—perhaps there would be more bells and whistles, but nothing fundamentally different,” she says. “How wrong we were.”
Even so, at the time, scientists were noticing some peculiar things about higher organisms’ cells. For one thing, “there was a puzzlingly large amount of DNA in mammalian and amphibian cells,” Sharp says. “If all that DNA were encoding genes, there would be hundreds of thousands, maybe a million, genes in humans. In some amphibians there was 10 times as much DNA as in humans,” he says. “There was no reason to anticipate that we had so many genes, so we wondered, ‘What is all that DNA about?’ ” Researchers hadn’t yet realized that the vast majority of multicellular organisms’ DNA does not code for protein and that much of it consists of introns.
There were other peculiarities, too, Steitz says. “We knew that a lot of RNA was made in the cell’s nucleus, but only about 10% got exported out to the cytoplasm in the form of mRNA.” The rest of it—a whopping 90%—was degraded. Why that happened was a mystery that many in the field began racing to solve, she says.
Sharp and Roberts discovered splicing independently, by using electron microscopy to peer at DNA that had been matched up with, or hybridized to, its corresponding mRNA. In the microscopy images they obtained, it was clear that DNA is much longer than its corresponding mRNA. Furthermore, although the DNA was hybridized to the mRNA, segments of DNA would loop out and away from the shorter mRNA before reconnecting with it. The only explanation for the repeated loops of extra DNA was that they represented sequences that had been chopped out of the mRNA.
Sharp, who was just starting his career at MIT, went to a meeting at Cold Spring Harbor Laboratory in late spring of 1977 to present these data. At the conference, Roberts, who was working there, presented electron microscopy data showing similar DNA loops. “That’s when we realized that we were on to the same thing,” Sharp says.
The meeting changed everything. Even though Sharp and Roberts’s theory was radical—violating biology’s then-central dogma—scientists in the community bought into it wholeheartedly. Their enthusiasm was a mixed blessing, Sharp says, because as a young investigator he suddenly found himself competing with huge, established labs keen on exploring this new splicing system.
“It was a beautiful case of ‘seeing is believing,’ ” Steitz says. “The minute everybody saw those DNA loops they said, ‘Oh my God, that’s what’s going on.’ It explained all these other lingering things that had made no sense—until then.”
Within a few months, various research groups had confirmed that splicing was happening in a number of multicellular organisms. During the 1980s, model human and baker’s yeast cells made studying the spliceosome more streamlined, and the field began identifying component parts of the splicing machinery, says Kiyoshi Nagai of the MRC Laboratory of Molecular Biology (LMB). “First we thought there were 25 proteins involved in human splicing,” Lührmann says. “Then that number grew to 200.”
By the 1990s, a picture of the spliceosome started to take shape. The first low-resolution structures—30 Å or so—of component proteins and ribonucleoproteins were reported, Lührmann says, and “we began to realize all the interesting conformations of the spliceosome.” In particular, Lührmann’s team published a landmark 2002 Science paper revealing its extreme structural dynamics, with “dozens of proteins flying off the spliceosome” and others being recruited during different stages of splicing (DOI: 10.1126/science.1077783).
Meanwhile, the human genome was published, and the final tally of genes it contained was shockingly lower than expected: 23,000 compared with a predicted 100,000. The discovery thrust splicing back into the limelight. Although nobody doubted that multicellular organisms were capable of splicing, until the human genome was unveiled, nobody knew just how fundamentally important the process was, Steitz says.
“It was fabulous validation—not that my ego needed it,” quips Sharp, who had already secured the Nobel Prize years earlier. Thereafter, “-omics” efforts—quests to sequence a cell’s total complement of proteins and mRNA—helped researchers tabulate just how much splicing was taking place in cells.
After nearly four decades of detailed research, scientists have a basic sketch of how splicing works. Before RNA polymerase has finished transcribing DNA into mRNA, splicing is already under way: First, two ribonucleoproteins called U1 and U2 land on the pre-mRNA, hybridizing with complementary nucleic acid sequences that signal the start and the end of an intron, LMB’s Nagai says.
Over seven steps, teams of ribonucleoproteins join and exit the spliceosome, along with a vast array of protein partners. In particular, a troupe of protein helicases wind and unwind the mRNA to rearrange base-pairing of ribonucleoproteins and massively change their conformation. These helicases “are the driving force for the various remodeling steps of the spliceosome,” Lührmann says.
All these rearrangements pull the mRNA’s 5' splice site close to an important adenosine base, called the branch point site, that is located in an intron. A hydroxyl group on that adenosine launches an attack on a phosphate group at the first intron-exon border, severing intron from exon through a transesterification reaction. After another large spliceosome rearrangement, a second transesterification reaction occurs: The freed end of the upstream exon attacks the phosphate at the second intron-exon border, so that the intron gets completely excised and the two exons are stitched together.
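The two reactions can be sketched as string operations in Python. This is a toy model under stated assumptions: the sequences are made up, the function names are hypothetical, and the branched "lariat" intermediate that forms in the cell is represented here as a plain string. The sketch only tracks which pieces are cut and joined at each step, not the chemistry.

```python
def first_transesterification(pre_mrna, five_ss, branch_point):
    """Step 1: the branch-point adenosine attacks the 5' splice site.

    five_ss:      index of the first intron nucleotide.
    branch_point: index of the branch-point A inside the intron.
    Returns the freed upstream exon and the intermediate made of the
    intron plus the downstream exon.
    """
    if pre_mrna[branch_point] != "A":
        raise ValueError("branch point must be an adenosine")
    if not five_ss < branch_point:
        raise ValueError("branch point must lie inside the intron")
    return pre_mrna[:five_ss], pre_mrna[five_ss:]

def second_transesterification(upstream_exon, intermediate, intron_length):
    """Step 2: the freed exon end attacks the 3' splice site.

    Returns the joined exons (the mRNA) and the excised intron.
    """
    intron = intermediate[:intron_length]
    downstream_exon = intermediate[intron_length:]
    return upstream_exon + downstream_exon, intron

# Made-up sequences for illustration only.
EXON1 = "AUGGCU"
INTRON = "GUAAGUACUAACUUUCAG"
EXON2 = "GAUUAA"
pre_mrna = EXON1 + INTRON + EXON2

# Assumption: the branch point is the last A of the motif UACUAAC.
branch_point = len(EXON1) + INTRON.index("UACUAAC") + 5

upstream, intermediate = first_transesterification(
    pre_mrna, five_ss=len(EXON1), branch_point=branch_point
)
mrna, excised = second_transesterification(upstream, intermediate, len(INTRON))
```

After both steps, `mrna` holds the two exons joined end to end and `excised` holds the intron. Each pass through this pair of reactions removes a single intron.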
Shi and colleagues’ 3.6-Å-resolution structure captured the spliceosome in between the first and second of these transesterifications. The work confirmed that the actual catalysis is mediated by RNA and two magnesium ions, all while being bolstered by a huge family of proteins, Steitz says.
“The cell brings together this huge machinery to facilitate a chemical reaction that is actually exceedingly simple—it’s just two phosphodiester transfer reactions,” says Markus Wahl, at Free University of Berlin. “You do this in your first organic chemistry class.” What’s more amazing—or confounding—is that after this complicated rigmarole, just one intron has been excised. On average, human genes have about 10 splice sites.
Wahl points out that some RNA alone can catalyze self-splicing, without the need for complex spliceosome machinery. It makes you wonder why such an arduous system for splicing exists. The answer, he says, is regulation. “Some introns are hundreds of thousands of nucleotides long, yet they must be cut out with precision,” Wahl says. “Just one nucleotide mistake, and that’s a huge error. In the ocean of nucleotides, the spliceosome knows how to recognize exactly where the authentic splice site is,” thanks to myriad regulatory proteins associated with the machinery.
“If you want precision and flexibility, then you need a highly regulated machine,” Wahl says. That’s because a single gene can produce hundreds or thousands of products: For example, in humans, alternative splicing of the pre-mRNA transcribed from muscle genes produces the many types of muscle we possess in different parts of our body.
The choice of one splicing pattern over another is key to the development of an organism from fertilized egg to reproductive adult because different protein products generated downstream of the spliceosome are needed at different stages of life, or in different organs. Different splicing decisions for the same gene, for instance, can generate either a membrane-bound protein or one that’s soluble inside the cell. They can also determine whether a cell is sent on a pathway to death or allowed to stay the course and keep dividing.
The fact that splicing can determine the ultimate fate of a cell should make it no surprise that aberrant splicing is involved in many cancers—a suite of vastly different diseases united in the fact that rogue cells refuse to die.
For cancer cells to achieve their aggressive, expansionist agenda, many of them do a lot more splicing than healthy cells—a characteristic some drug developers want to exploit for cancer treatment.
Because many cancers involve increased splicing, interfering with the spliceosome’s ability to find splice sites could be “a backdoor way to attack that cancer where it is most sensitive,” says Thomas R. Webb, a medicinal chemist at SRI International, a nonprofit R&D institute in Menlo Park, Calif.
Webb has synthesized a family of molecules that target a spliceosome protein called SF3B1, which helps guide the machine to an mRNA splice site. When Webb and his collaborators threw a chemical wrench—a molecule called sudemycin D6—at mice with breast cancer, the result was lethal for the cancer cells, yet normal cells were not significantly affected (Nature 2015, DOI: 10.1038/nature14985). Webb is not the only researcher eyeing SF3B1 as a cancer drug target: Several biotech firms are also developing molecules that modulate the spliceosome protein, with clinical trials on the horizon.
For some cancers, including several blood and lymph varieties, “the dependency on splicing becomes so critical that it’s at a tipping point. The slightest additional change in splicing fidelity puts the cancer cell on a path to death,” Webb says.
The trick is to find the right therapeutic window for any molecule that tweaks the spliceosome, Webb continues. Scientists must find a therapeutic dose that kills cancer cells but does a minimal amount of collateral damage to healthy cells. “With tens of thousands of substrates and hundreds of thousands of products, blocking splicing altogether would just be lethal,” he says. “If you stop splicing altogether, all cells would die, whether they are tumor or normal cells.”
For this reason, many drug developers long avoided targeting the spliceosome, and it was often referred to as “undruggable.” “When someone says something is undruggable, it’s always an opinion and never a fact,” Webb says.
For example, with kinases, researchers worried that the enzymes’ active site was too omnipresent in cells, making selectivity for an inhibitor to that site impossible. After much effort, though, viable inhibitors for kinases were discovered. “It’s a chicken-and-egg thing. If you call it undruggable, then you don’t even try,” Webb says. Then that statement becomes true, “but only because no one is trying,” he adds.
For the spliceosome, opinion is changing: More medicinal chemists in academia, biotech, and big pharma are working on molecules that target it, with big names such as Novartis, Pfizer, and Roche getting into the game. These scientists are looking at the spliceosome not just as a starting point for cancer treatments but also as a therapeutic target for other diseases, including spinal muscular atrophy; a dwarfism called Taybi-Linder syndrome; and retinitis pigmentosa, a common form of blindness. The field has seen a dozen or so clinical trials, although no drugs have yet been approved. The drug development strategy goes beyond simply interfering with spliceosome function. Many researchers are attempting to enhance splicing with oligonucleotides or small molecules.
For instance, researchers at Novartis are evaluating a class of oral pyridazine-based splicing enhancers as possible therapies for spinal muscular atrophy, a common, fatal genetic disease in children. Inefficient splicing of a motor neuron’s mRNA produces a truncated, unstable protein that leads to the paralysis seen in patients. Novartis scientists recently reported that their clinical trial candidate, called NVS-SM1, enhances binding of the spliceosome’s U1 ribonucleoprotein to an intron splice site to help restore proper splicing in mouse models (Nat. Chem. Biol. 2015, DOI: 10.1038/nchembio.1837). That report came on the heels of one from researchers at Roche, who wrote last year in Science that their orally available splicing modulator, SMN-C3, “improved motor function and longevity in mice with spinal muscular atrophy” (DOI: 10.1126/science.1250127).
With so many players and such a complicated sequence of steps required to properly splice mRNA, it may seem incredible that the spliceosome ever works perfectly at all. “The spliceosome makes errors, and it’s very complicated,” Sharp says. “But there is beauty in the fact that splicing is so arduous,” he adds.
Sharp suggests evolution may have found some advantage in going with a system so complex that it is doomed to fail with some regularity. “The ability to excise introns in different patterns and to shuffle exons is integral to the evolution of all complex, multicellular organisms,” he says. In other words, by having an enormous, error-prone strategy for creating complexity, evolution has a platform to experiment.
Indeed, the University of Toronto’s Benjamin Blencowe recently showed that when you tweak just a single component of splicing in embryonic chickens—namely mutating a splicing regulator so that an mRNA loses its ninth exon—the chicken’s brain develops into something more mammalian than birdlike (Science 2015, DOI: 10.1126/science.aaa8381).
Nearly 40 years after splicing was discovered, there’s no doubt about its fundamental importance to biology—why humans and fruit flies and chickens are different and how evolution made it so. Yet there is still much to learn. Many of the 200 proteins involved in the spliceosome regulate the process in unknown ways. Some of these regulators act as communication vectors between the spliceosome and other complex cellular processes: For example, researchers have found that when the spliceosome jams in yeast cells, protein messengers signal for the cell to switch the kinds of genes made available to RNA polymerase for transcription. Thinking about all these connections “can make your eyes cross,” Sharp says.
Even the first near-atomic-resolution structure of the spliceosome is just a snapshot of one of the machine’s many conformations. “Ideally, we’d like a whole movie of all the conformations,” Tsinghua University’s Shi says.
“We’re still just scratching the surface on so much right now,” Webb says. But therein lies an opportunity, he adds. With so many unknowns, there are many ways for scientists, especially chemists, to contribute—from structural biology of the spliceosome to the discovery of small molecules that can modulate it. Spliceosome research, he says, is a fertile place for discovery.