Although the myriad proteins found in all life are largely built from a set of 20 amino acids, many other amino acids exist in nature, and it remains unclear why some were ultimately incorporated into proteins and others were left out. In a new study investigating what could have driven this selectivity, researchers made peptides out of several sets of amino acids and compared their solubility and their foldability, meaning how readily they arranged themselves into protein-like structures (J. Am. Chem. Soc. 2023, DOI: 10.1021/jacs.2c12987).
The 20 canonical amino acids can be divided into “early” and “late” groups based on when they were added to the amino acid alphabet. Starting from a core of seven early amino acids, Klára Hlouchová of Charles University, Stephen D. Fried of Johns Hopkins University, and coworkers added sets of other amino acids, made peptide libraries from each set, and compared them with peptides made from 19 of the canonical amino acids. (The researchers left out cysteine because they wanted to avoid the complication of needing to keep it in a reduced form.)
Some of the peptide libraries were made of smaller sets of the canonical amino acids. Two of the libraries included noncanonical—but prebiotically available—amino acids. One included amino acids with linear side chains such as norvaline and norleucine. The other included the amino acid 2,4-diaminobutyric acid, which is basic. Both libraries were significantly more soluble than the early amino acid library. In fact, the peptides made from canonical amino acids were the least soluble of the ones the researchers tested.
Using circular dichroism spectroscopy, they also showed that the canonical library formed more secondary structures like α helices and β sheets. Peptides made using only early amino acids were less likely to form secondary structures than the full canonical library but more likely to do so than the libraries that contained one or more noncanonical amino acids.
The researchers hypothesize that acidic and basic amino acids need to have side chains of different lengths. “We found that the combination of short negative and short positive amino acids doesn’t work very well,” Hlouchová says. Nature uses short negative and long positive side chains, but maybe the reverse setup could produce another alphabet of folding amino acids, she adds. The researchers are exploring this possibility.
Stephen Freeland, a computational biologist at the University of Maryland Baltimore County who identifies design principles for synthetic amino acid alphabets, says in an email that this relationship between side chain length and charge is “exciting” and that he wants to see more experiments like this. “The trick is always to find these biophysical generalizations without merely abstracting or inventing rules of no real significance which just happen to match the one alphabet we have.”