Protein Folding

Protein design with AI quicker than ever before

New algorithms take a desired protein shape and predict the amino acid sequence needed to make it

by Laura Howes
September 15, 2022 | A version of this story appeared in Volume 100, Issue 33


An artistic rendering of the structure of a protein designed using new algorithms
Credit: Ian C. Haydon/UW Institute for Protein Design

In the past few years, huge advances have been made in predicting protein structures using artificial intelligence. But the reverse problem—taking a protein shape and then predicting how to build it from a sequence of amino acids—has proved trickier. A series of three papers by biologists at the University of Washington School of Medicine now shows that a new machine learning algorithm can design protein molecules faster and more accurately than before. Designed proteins could help build new vaccines, drugs, and sustainable biomaterials.

At their simplest, proteins are chains of amino acids strung together with peptide bonds. The interplay of the various side chains along the length of the polymer, with each other and with the surrounding environment, causes the floppy chains to twist and curl into different 3D shapes. But the forms found in nature are just a fraction of what UW’s David Baker thinks is possible. Baker has founded several companies to take designed proteins in different useful directions; the firms include Monod Bio, a protein-based diagnostics spin-off, and Vilya, for designing therapeutics.

The new algorithms created by Baker’s lab offer what postdoc Basile I. M. Wicky calls a “one-pot approach,” which can help researchers design a useful protein shape and then predict the amino acid sequence that will make it.

The first paper, published in July, describes a new tool that can produce protein designs in one of two ways. In the first, the AI can create a design by iteratively improving on simple prompts, such as needing a particular type of fold or binding motif. The AI is “trying to dream up a structure,” as postdoc Jue Wang puts it. The alternative approach involves taking parts of an existing structure and then asking the AI to fill in the gaps (Science 2022, DOI: 10.1126/science.abn2100).

In two follow-up papers just published, researchers from the same lab demonstrate how another algorithm, called ProteinMPNN, can start from designed 3D shapes and assemblies of multiple protein subunits and determine, in about 1 second, the protein sequences needed to make them efficiently. The team tested and refined the predicted sequences by running them through protein structure prediction algorithms and by synthesizing the proteins in the lab. The researchers then verified the protein structures using X-ray crystallography and used cryo-electron microscopy to measure the shapes of the assemblies the proteins formed (Science 2022, DOI: 10.1126/science.add2187; 10.1126/science.add1964).

“The protein design field is undergoing a tremendous revolution,” says Christian Dallago, a computational biologist at NVIDIA who was not involved in the work. “Ultimately, through these new tools, we can become less reliant on time-consuming, classic approaches.” He says the hope is that such tools will reduce the distance between generating a hypothesis and providing solutions.

Researchers in Baker’s lab say that ProteinMPNN has become their go-to algorithm. They are now working to improve the tools and create proteins for a variety of uses.

