


Designing Repeat Proteins

Computational design allows researchers to build structures that nature hasn’t yet tried

by Celia Henry Arnaud
January 11, 2016 | A version of this story appeared in Volume 94, Issue 2

Ribbon structures of repeat proteins that form rodlike structures.
Credit: Nature
Repeat proteins made with similar helix-loop-helix-loop motifs form different structures depending on the length and sequence. In these images, the crystal structure (yellow) is superimposed on the model structure. The insets show the overall shape of the repeat protein.

Researchers want to design proteins to make new materials or catalysts or to better understand how natural proteins evolved. Repeat proteins take some of the work out of the process. In such proteins, as the name suggests, a particular amino acid sequence is repeated multiple times.

“Repeat proteins can form very large binding surfaces, but when you’re designing them, you’re really only building a small piece,” says Philip Bradley, a protein designer at Fred Hutchinson Cancer Research Center. “You can make a 300-amino acid protein but only have to design 30 amino acids at a time.”

Some naturally occurring repeat proteins exist. But computational design can yield a much larger set of diverse structures, two overlapping research teams reported last month. Bradley led one team. David Baker of the University of Washington led the other. Both researchers are members of the University of Washington’s Institute for Protein Design.

The two teams started with a simple helix-loop-helix-loop repeating motif and varied the sequences and lengths of the α-helices and loops. From those similar building blocks, they made very different structures.

Baker’s team explored the range of rodlike structures they could design from proteins containing four repeats. In each repeat, the helices varied from 10 to 28 residues and the loops from one to four residues, for a total of 5,776 possible combinations (Nature 2015, DOI: 10.1038/nature16162). The researchers used protein design software to calculate which combinations should be stable. They picked 83 structures to experimentally characterize, including 15 for which they solved X-ray crystal structures. Those 15 proteins had structures that span a broad range of curvatures.
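The 5,776 figure can be reproduced with a short count. The sketch below is a reader's check, not the team's design software. It assumes each repeat contains two helices and two loops (as the helix-loop-helix-loop motif suggests), that each length is chosen independently, and that both ranges are inclusive. The variable names are illustrative only.

```python
from itertools import product

# Assumed inclusive ranges, taken from the figures quoted in the article.
helix_lengths = range(10, 29)  # 10 to 28 residues: 19 choices
loop_lengths = range(1, 5)     # 1 to 4 residues: 4 choices

# Assumption: one repeat = helix, loop, helix, loop, each length chosen independently.
repeat_layouts = list(
    product(helix_lengths, loop_lengths, helix_lengths, loop_lengths)
)

n_combinations = len(repeat_layouts)
print(n_combinations)  # 19 * 4 * 19 * 4

# Residues per repeat for the shortest and longest layouts.
shortest_repeat = min(sum(layout) for layout in repeat_layouts)
longest_repeat = max(sum(layout) for layout in repeat_layouts)
print(shortest_repeat, longest_repeat)
```

Under these assumptions the count is 19 × 4 × 19 × 4 = 5,776, which matches the number reported in the article.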

“Some of them go straight, some of them twist, and some of them bend back on themselves,” Baker says. And each structure is just one member of a whole family of possible proteins. “They can be indefinitely extended or contracted,” Baker says.

Bradley’s team put an extra constraint on their proteins. They designed proteins so that the first and last repeats interact, forcing the proteins to fold into closed architectures (Nature 2015, DOI: 10.1038/nature16191). As the protein length increases, the size of the hole in the middle likewise increases.

Top and side views of ribbon structures of five designed closed-architecture repeat proteins, with each repeat in a different color from blue to red as the chain proceeds from the N to C terminus.
Credit: Nature
In these three closed-architecture repeat proteins (top and side views shown), the size and shape depend on the number of repeats and the sequence. Each repeat is a different color.

“What we like about the closed architecture is there’s a clear geometric constraint you have to satisfy to close up nicely,” Bradley says. His team is also designing proteins with geometric constraints to make the structures capable of binding particular DNA sequences.

Bradley and coworkers designed a diverse array of doughnut-shaped, or toroidal, structures. They focused primarily on structures with left-handed curvature because those aren’t represented in databases of natural repeat proteins. They solved X-ray crystal structures of four representative left-handed proteins.

Even though no left-handed closed repeating proteins have been observed in nature, Bradley thinks some are out there waiting to be discovered. “Based on our results, there’s not any strong protein biochemistry reason you can’t have left-handed helical bundles that close up,” he says.

Bradley and Baker both hope to use their repeat proteins to design new materials.

“Imagine materials that are built from rods and nodes from which those rods protrude,” Baker says. His team’s designed proteins would be the rods. “We now have a variety of ways of sticking these rods together into nodes. We’re starting to build all sorts of things from them,” he says.

Bradley wants to use his toroidal proteins to make artificial channels in membranes and as scaffolds for presenting other proteins or peptides. The designed proteins are stable enough that they should tolerate the insertion of other peptide sequences—ones researchers want displayed—into the loops without compromising the overall architecture, Bradley says.

Andreas Plückthun, a protein researcher at the University of Zurich, points out that neither of the teams has yet made proteins that can bind other biomolecules. Such proteins have many biochemical applications, including use as probes. So until the teams present such a design, “the general significance of the work will remain somewhat limited for those outside the protein design community,” he says.

But, says Ingemar André, who designs proteins at Lund University, “both studies demonstrate that it is possible to design proteins with conformations not found in nature. The full repertoire of repeat proteins has not been fully explored in evolution, and novel types of assemblies can be engineered.”

Baker agrees: The papers “really demonstrate that what you see in nature is a very small fraction of what’s possible. You can sort of see it from the numbers game. For a 200-amino acid protein, there are 20²⁰⁰ different sequences. You can argue that 10¹⁰ or 10¹² have been sampled during evolution. We know there’s a huge amount of sequence space that evolution never touched.”
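Baker's numbers game can be checked with a few lines of arithmetic. The sketch below reads the figures in the quote as 20 to the power 200 (20 amino acids at each of 200 positions) and, for the sampled sequences, 10 to the power 10 or 12. It takes the larger estimate as an upper bound. The variable names are illustrative, and the sampled-sequence figure is Baker's rough estimate, not a measured value.

```python
import math

AMINO_ACIDS = 20       # standard amino acid alphabet
PROTEIN_LENGTH = 200   # residues, as in the quote

# Python integers are arbitrary precision, so this is exact.
total_sequences = AMINO_ACIDS ** PROTEIN_LENGTH
log10_total = PROTEIN_LENGTH * math.log10(AMINO_ACIDS)

# Upper estimate of sequences sampled during evolution, per the quote.
sampled_upper = 10 ** 12
log10_fraction = math.log10(sampled_upper) - log10_total

print(f"possible sequences: about 10^{log10_total:.1f}")
print(f"fraction sampled:   about 10^{log10_fraction:.1f}")
```

On this reading there are roughly 10^260 possible sequences, so even the larger estimate covers only about one part in 10^248 of sequence space.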


