When Bradford W. Gibson started using chemical cross-linking mass spectrometry more than a decade ago to map the structure of proteins, he knew there would be challenges in both the chemistry and the data analysis.
Starting in the late 1990s, Gibson, then at the University of California, San Francisco, was one of the first mass spectrometrists to use chemical cross-linking to provide distance constraints for structural biology. The cross-linking community has grown since then, but the method still isn’t easy.
“All the problems that we encountered and identified back then still exist at some level,” says Gibson, now director of the chemistry and mass spectrometry core at the Buck Institute for Research on Aging, in Novato, Calif. “How do you get enough cross-links to get the information you need? How do you filter the data and identify true cross-links? How do you get around all the side reactions?”
Scientists still wrestle with these questions, but they are also beginning to move past them. They are expanding the palette of available cross-linking reagents. They are developing new data analysis algorithms that fish out the tiny number of cross-linked peptides from the vast sea of non-cross-linked ones. And they are harnessing those improvements to move beyond studying the structure of individual purified proteins and the makeup of multiprotein complexes. Scientists are now trying to analyze such proteins and complexes in organelles and even whole cells.
In cross-linking MS, a bifunctional reagent reacts with amino acids that are close enough to one another for the reagent to bridge the distance between them. The amino acids can be on the same protein or on separate proteins or peptides in a complex. After cross-linking, the proteins are digested into peptides and analyzed by MS to determine which amino acids are connected and thus which domains or proteins are near each other. Cross-linking MS can complement other structural techniques, but it can also tackle systems that other methods can’t. For example, cross-linking MS is particularly useful in analyzing the structures of proteins that don’t crystallize.
From the earliest days of cross-linking MS, scientists have relied on amine-reactive reagents. The reagents—which typically feature terminal activated N-hydroxysuccinimide esters—react with primary amines in proteins or peptides, namely lysine residues and the amino terminus. The primary amines displace the succinimides, forming a covalent link anchored by a pair of amide bonds.
Such reactions “work great,” says Juri Rappsilber, a mass spectrometrist who splits his time between the University of Edinburgh, in Scotland, and the Technical University of Berlin, Germany, but they are limited to primary amines. That limits the number of distance constraints the reagents can reveal, he says.
Another problem with amine-reactive cross-linkers is that the distance information is “soft,” says Andrea Sinz, a professor of pharmaceutical chemistry at Martin Luther University of Halle-Wittenberg, Germany, who uses cross-linking for structural biology studies. “You have to account for the high flexibility of the lysine side chains.” A cross-linker with an 8-Å spacer between the reactive groups can bridge lysines with distances between α-carbons of as much as 25 or 30 Å, she says.
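In practice, a constraint like this is read as an upper bound on the distance between two α-carbons in a candidate structure. The following is a minimal sketch of that check, assuming a 30-Å cutoff (the upper figure Sinz cites) and made-up coordinates; the function names are illustrative and do not come from any particular software package.

```python
import math

# Assumed upper bound in angstroms, taken from the 25-30 Å range quoted above.
MAX_CA_DISTANCE = 30.0

def ca_distance(a, b):
    """Euclidean distance in Å between two alpha-carbon coordinates (x, y, z)."""
    return math.dist(a, b)

def satisfies_crosslink(a, b, max_distance=MAX_CA_DISTANCE):
    """True if the two residues are close enough for the observed cross-link."""
    return ca_distance(a, b) <= max_distance

# Hypothetical coordinates for three lysine alpha-carbons.
lys_a = (0.0, 0.0, 0.0)
lys_b = (12.0, 16.0, 0.0)  # 20 Å from lys_a
lys_c = (24.0, 32.0, 0.0)  # 40 Å from lys_a

print(satisfies_crosslink(lys_a, lys_b))
print(satisfies_crosslink(lys_a, lys_c))
```

Because the cutoff is so much longer than the 8-Å spacer, many different structural models can pass the same check, which is what makes the information "soft."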
To get around these problems, many groups are devising new kinds of cross-linkers that work with a wider variety of amino acids and provide more precise distance information.
One candidate is a type of photoaffinity label being developed by Sinz’s collaborator Olaf Jahn of the Max Planck Institute for Experimental Medicine, in Göttingen, Germany. In this approach, Jahn incorporates a photoreactive amino acid derivative into a binding partner, typically a peptide, of a target protein. His preferred amino acid derivative is p-benzoylphenylalanine, which has a benzophenone group that forms a diradical upon illumination with ultraviolet light. The diradical can insert into any nearby C–H bond to form a C–C bond. If no binding partner is nearby, the radical relaxes and can be reactivated.
“From a photoaffinity-labeling experiment, we usually get a low number of cross-links, often just one or two,” Jahn says, “but these constraints are real contact points.” Each cross-link represents a distance less than 8 Å.
In principle, the amino acid derivative could be incorporated in any protein or peptide, but in practice Jahn has been limited to peptides shorter than about 50 amino acids because he synthesizes the probe by solid-phase peptide synthesis.
Jahn says he has the most success substituting the photoactive amino acids for other bulky or hydrophobic amino acids in the natural sequence, such as tryptophan or phenylalanine.
Sinz is working on ways to incorporate photoactive amino acids in full-length proteins. To do this, she includes photoreactive amino acids like photoleucine and photomethionine, both of which contain diazirine rings, in growth media used during cell culture. The cell’s natural protein synthesis machinery readily takes up these simple amino acid analogs.
“You want every methionine in your protein to be replaced by photomethionine,” she says. “You just irradiate the cells, and the proteins will cross-link with their interaction partners directly in the cells.” She is also working with methods such as those developed by Peter Schultz at Scripps Research Institute and others for incorporating nonnatural amino acids into proteins.
Meanwhile, Christoph H. Borchers and his coworkers at the University of Victoria-Genome British Columbia Proteomics Centre have developed an array of new chemical cross-linkers, many of which are commercially available from Creative Molecules, a company Borchers started.
“We have so far made 33 cross-linkers,” Borchers says, all of which are amine-reactive. Some of the cross-linkers have affinity groups like biotin that can be used to pull the cross-linked peptides out of a complex mixture. Some are chemically cleavable, and others can be cleaved in the mass spectrometer. And some of the cross-linkers have all these features. Within these categories, the cross-linkers come in different lengths. No single cross-linker works in all cases. “The variety is what really counts,” Borchers says.
All of Borchers’s cross-linkers are isotopically coded as light and heavy versions. The isotope coding means that scientists can easily identify cross-linked peptides in the mass spectrum as doublets, pairs of peaks separated by the known mass difference between the two versions.
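Spotting such doublets amounts to looking for pairs of peaks whose spacing matches the light-heavy mass difference divided by the charge. The sketch below illustrates the idea under stated assumptions: the 4.0251-Da shift (roughly four deuteriums) and the tolerance are example values, not figures from Borchers's reagents, and `find_doublets` is a hypothetical helper.

```python
def find_doublets(mz_values, mass_shift, charge=1, tol=0.01):
    """Return (light, heavy) m/z pairs spaced by mass_shift / charge.

    mass_shift is the assumed mass difference between the heavy and
    light cross-linker; tol is the allowed error in m/z units.
    """
    spacing = mass_shift / charge
    peaks = sorted(mz_values)
    pairs = []
    for i, light in enumerate(peaks):
        for heavy in peaks[i + 1:]:
            gap = heavy - light
            if gap > spacing + tol:
                break  # peaks are sorted, so later gaps only grow
            if abs(gap - spacing) <= tol:
                pairs.append((light, heavy))
    return pairs

# Hypothetical peak list and an assumed 4.0251-Da light/heavy shift.
example_peaks = [500.0, 502.0126, 600.0, 604.0251, 700.0]
print(find_doublets(example_peaks, 4.0251, charge=1))
print(find_doublets(example_peaks, 4.0251, charge=2))
```

Peaks that lack a partner at the expected spacing can be set aside as unlikely to carry the cross-linker.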
Other groups are finding new cross-linkers that provide other ways to pull the connected peptides out of the mixture. Earlier this year, Jesse L. (Jack) Beauchamp of California Institute of Technology and coworkers reported new amine-reactive cross-linkers with a terminal alkyne that can be used to attach an affinity tag via copper-catalyzed azide-alkyne cycloaddition, so-called click chemistry (Anal. Chem., DOI: 10.1021/ac202637n). Their work builds on earlier work by Joshua Adkins and coworkers at Pacific Northwest National Laboratory (Anal. Chem., DOI: 10.1021/ac900853k).
Beauchamp’s reagent is small, water-soluble, cationic, and cell-permeable. Its modular construction allows the cross-links’ chain lengths to vary.
“The advantage of having the click chemistry is that you don’t start off with an affinity tag already chemically attached to your reagent,” Beauchamp says. “We use the click chemistry to add whatever is appropriate for the experiment.”
For example, they can use the click chemistry reaction to add a biotin affinity tag. To identify cross-linked peptides in the mass spectrometer, a nucleophilic displacement reaction cleaves off the biotin label, which serves as a reporter ion in the mass spectrum. Any tandem mass spectrum containing that ion would have come from a cross-linked peptide.
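Screening for a reporter ion of this kind reduces to a simple filter over tandem mass spectra. The following sketch assumes a placeholder reporter m/z of 300.0 and a 0.02 tolerance; neither value comes from Beauchamp's paper, and the scan identifiers and function names are hypothetical.

```python
def has_reporter_ion(fragment_mz, reporter_mz, tol=0.02):
    """True if any fragment lies within tol of the reporter m/z."""
    return any(abs(mz - reporter_mz) <= tol for mz in fragment_mz)

def select_crosslink_candidates(spectra, reporter_mz, tol=0.02):
    """Return IDs of spectra that contain the reporter ion.

    spectra maps a scan ID to a list of fragment m/z values.
    """
    return [
        scan_id
        for scan_id, fragments in spectra.items()
        if has_reporter_ion(fragments, reporter_mz, tol)
    ]

# Hypothetical spectra; 300.0 is a placeholder reporter m/z, not a real value.
example_spectra = {
    "scan_1": [175.12, 300.01, 512.30],
    "scan_2": [147.11, 262.14, 640.35],
    "scan_3": [299.99, 401.20],
}
print(select_crosslink_candidates(example_spectra, reporter_mz=300.0))
```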
Effective cross-linkers with good affinity tags are hard to design, says Ruedi Aebersold, a professor at the Institute of Molecular Systems Biology at the Swiss Federal Institute of Technology, Zurich. “We’re asking a lot of such a molecule,” he says. It has to be soluble, stable, and an effective cross-linker. It has to remain stable during the isolation but not so stable that it drastically changes the fragmentation patterns of the cross-linked peptides. “We have not come up with a molecule that fulfills all these criteria.”
Although the need for a wider variety of cross-linkers remains, everybody agrees that data analysis and interpretation are the real bottlenecks.
One aspect of the data analysis challenge is that the more complicated the mixture, the smaller the fraction of cross-linked peptides. A typical protein yields 50 to 200 peptides after digestion. As the number of peptides increases linearly, the number of possible cross-links increases quadratically. At the simplest level, going from one to two proteins approximately doubles the number of peptides but quadruples the number of possible cross-links, Rappsilber explains.
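The arithmetic behind that scaling is easy to check. The sketch below counts distinct peptide pairs for a hypothetical protein that yields 100 peptides and for two such proteins. It is a simplification that ignores links within a single peptide and peptides with more than one reactive site.

```python
from math import comb

def possible_pairs(n_peptides: int) -> int:
    """Number of distinct peptide pairs that could, in principle, be linked."""
    return comb(n_peptides, 2)

# Hypothetical counts: one protein yielding 100 peptides, then two such proteins.
one_protein = possible_pairs(100)   # 4950
two_proteins = possible_pairs(200)  # 19900
ratio = two_proteins / one_protein  # close to 4

print(one_protein, two_proteins, round(ratio, 2))
```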
“The cross-linked peptides aren’t in the databases of sequences,” says David Goodlett, a medicinal chemistry professor at the University of Washington, Seattle. “That creates a problem right away for standard database searches.”
The trick is to treat the cross-link like a more conventional posttranslational modification such as phosphorylation, Goodlett says. “We take a targeted de novo approach,” he says. “This approach assumes that each cross-linked peptide is simply a peptide that’s been posttranslationally modified with some other peptide, the mass of which we don’t know.”
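One way to picture the idea: the mass of the intact cross-linked species is the sum of the two peptide masses and the linker, so once one peptide is identified, the "modification" it carries is the partner peptide plus the linker. The sketch below works through that bookkeeping. It is not Goodlett's software; the peptide masses are invented, and the 138.068-Da linker mass is an assumed example value.

```python
def partner_mass(precursor_mass, peptide_mass, linker_mass):
    """Mass of the unknown partner, treating it as a modification on the peptide."""
    return precursor_mass - peptide_mass - linker_mass

def find_partners(precursor_mass, peptide_mass, linker_mass, candidates, tol=0.02):
    """Return names of candidate peptides whose mass matches the unknown partner.

    candidates maps a peptide name to its neutral mass in Da.
    """
    target = partner_mass(precursor_mass, peptide_mass, linker_mass)
    return [name for name, mass in candidates.items() if abs(mass - target) <= tol]

# All values below are hypothetical illustrations.
LINKER_MASS = 138.068          # assumed mass added by the cross-linker
peptide_a_mass = 1000.50       # the peptide already identified
example_candidates = {"peptide_B": 1500.75, "peptide_C": 1800.90}
example_precursor = 1000.50 + 1500.75 + LINKER_MASS

print(find_partners(example_precursor, peptide_a_mass, LINKER_MASS, example_candidates))
```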
Another trick is not to acquire data for charge states lower than 4+. “Cross-linked peptides have two C-termini and two N-termini. They inherently have a higher number of protons,” Goodlett explains. “By starting the data acquisition at 4+ or higher, we filter out a lot of data to begin with.”
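Goodlett describes applying this filter during acquisition, on the instrument. The same rule could in principle be applied to a list of precursors after the fact, as in this minimal sketch; the precursor records and the `keep_high_charge` helper are hypothetical.

```python
def keep_high_charge(precursors, min_charge=4):
    """Keep only precursors at or above min_charge.

    precursors is a list of (m/z, charge) tuples.
    """
    return [(mz, z) for mz, z in precursors if z >= min_charge]

# Hypothetical precursor list as (m/z, charge) pairs.
example_precursors = [(450.2, 2), (612.8, 3), (701.4, 4), (530.9, 5)]
print(keep_high_charge(example_precursors))
```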
Part of the problem of data analysis is that the mass spectrum of the cross-linked pair is a mixture of the mass spectra of the individual peptides. “A normal search engine chokes because it doesn’t anticipate that,” Aebersold says. “We had to develop a search engine that deals with these mixed spectra.”
That search engine, called xQuest, was published in 2008 (Nat. Methods, DOI: 10.1038/nmeth.1192). It simplifies the sequence space that the software needs to search. But identification of the peptides is only part of the problem. A key question is whether that identification is actually right. To answer that question, Aebersold and his coworkers developed another algorithm, published in September, called xProphet (Nat. Methods, DOI: 10.1038/nmeth.2103).
“For every fragment ion, you have a match to a database, but you don’t know whether this match is correct,” Aebersold says. “xProphet assigns a probability to distinguish true from false matches.”
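A widely used way in proteomics to gauge how many matches are false is to search against decoy sequences alongside the real ones and count how often decoys score well. The sketch below shows that generic target-decoy estimate. It is offered only as an illustration of the question being asked and is not xProphet's published algorithm; the scores are invented.

```python
def fdr_at_threshold(matches, threshold):
    """Estimate the false discovery rate among matches scoring >= threshold.

    matches is a list of (score, is_decoy) tuples. The estimate is the
    number of decoy hits divided by the number of target hits.
    """
    targets = sum(1 for score, is_decoy in matches if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in matches if score >= threshold and is_decoy)
    if targets == 0:
        return 0.0
    return decoys / targets

# Hypothetical scored matches: (score, is_decoy).
example_matches = [
    (30, False), (28, False), (25, True), (22, False),
    (20, False), (15, True), (12, False),
]
print(fdr_at_threshold(example_matches, 20))
print(fdr_at_threshold(example_matches, 26))
```

Raising the score threshold lowers the estimated error rate at the cost of discarding some correct matches.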
As tough as data analysis is for circumscribed systems of only a few proteins, the problem becomes far harder for in vivo systems. “When you’re dealing with just a few proteins, you’ve got many amino acids that could be combined,” Goodlett says. “When you’re dealing with the whole cell, you’ve got exponentially more possibilities for cross-links. As you have more possible correct answers, you also have many more possible incorrect answers.”
Despite the challenges, a few people are venturing into in vivo applications of cross-linking. For example, Beauchamp and his collaborators have used their clickable cross-linkers to cross-link proteins in cultured kidney cells.
Others, such as Rappsilber, also want to do in vivo cross-linking MS experiments. Rappsilber has even coined the term “3-D proteomics” for such approaches. But so far he has focused on purified systems. For example, he cross-linked mitotic chromosomes, which have more than 4,000 proteins, but focused on just two of the protein complexes, condensin and cohesin.
Jahn has done cross-linking experiments in mitochondria. “The cross-linking itself happened in the organelle, so the proteins were in their natural environment,” he says. He then used affinity tags to pull the cross-linked proteins out of the mixture for MS analysis. “This is an important move. It’s often not enough anymore to show an interaction or get structural insights in a test tube with purified components. You have to show that this plays a role in vivo.”
The only lab really doing whole-cell cross-linking MS experiments is that of James E. Bruce, a professor of genome sciences at the University of Washington, Seattle.
Bruce and his colleagues have designed cross-linkers specifically for working in cells. “There’s a whole untapped universe of complexes that will never be purified by conventional means,” he says. “The minute you lyse the cell, the minute you add detergent, the minute you do something, you’re going to disrupt some interactions.” By performing the cross-linking reaction in the cell, they avoid many such disruptions.
Their cross-linkers contain affinity tags to fish them out of the mixture, but more importantly they contain cleavable bonds that allow the isolated pair to fragment in predictable ways. Bruce measures the mass of the cross-linked peptides and then releases the peptides by using UV light or other means to cleave the bonds. “Now it’s just back to an ordinary proteome problem,” he says. “I can take that peptide mass and fragmentation pattern and search the database. We can use standard search tools to identify the cross-linked peptides.”
Bruce has used the photocleavable cross-linkers in bacterial cells (J. Proteome Res., DOI: 10.1021/pr200775j). His team identified more than 1,600 labeled peptides and manually validated 53 cross-linked peptide pairs. Using another not-yet-published analysis algorithm called real-time analysis for cross-link technology, or REACT, they have now validated approximately 500 peptide pairs and have identified one of the peptides in another 500 pairs.
In the meantime, the chemical cross-linking community is optimistic about the future. “Right now it’s being used by the leaders in the field,” Borchers says. “The next step is getting people feeling encouraged about their success. We see that community is growing—maybe not as rapidly as we’d like to see, but it’s constantly growing.”