A new machine-learning approach assesses the accuracy of structural models of RNA molecules. From a training set of known RNA structures, the algorithm learns the characteristic features of RNA, such as base pairs, hydrogen bonds, helices, and hairpins, and uses what it learns to choose the most accurate of a group of candidate structures. Being able to identify accurate RNA structures can help in understanding RNA function or designing synthetic RNAs.
RNA structures are more difficult to predict computationally than protein structures, in part because far fewer RNA structures than protein structures have been determined experimentally, especially at high resolution, says Ron O. Dror of Stanford University, who led the work in collaboration with Rhiju Das, also of Stanford. But at the same time, “It is most valuable to predict the structures of types of molecules that are very hard to solve experimentally,” he says.
The researchers used experimentally determined structures of 18 small RNA molecules to teach their algorithm the key characteristics of RNA. They gave the program the 3D coordinates of each atom in the structures.
With that limited amount of information, the program, called Atomic Rotationally Equivariant Scorer (ARES), learned enough about RNA structures to select the best match for a given RNA molecule from a large pool of candidate structures. The candidates were generated using a publicly available structure-prediction tool in the Rosetta software suite (Science 2021, DOI: 10.1126/science.abe5650). ARES “simply looks at the coordinates. And then using what it has learned about RNA structure in general, it evaluates the accuracy of each prediction and picks the one that is most accurate,” Dror says.
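The selection step Dror describes can be illustrated with a minimal Python sketch. It is not the ARES model: `toy_score` is a hypothetical stand-in for the learned scorer, and the target distance of 5.0 is an arbitrary assumption chosen for illustration. The sketch shows only the general pattern of scoring each candidate from its atomic coordinates alone and keeping the best-scoring one. The toy score is built from pairwise distances, so rotating a structure does not change it.

```python
import math
from itertools import combinations


def toy_score(coords):
    """Hypothetical stand-in for a learned scorer; lower means 'more accurate'.

    It uses only distances between atoms, so the result is unchanged when the
    whole structure is rotated or translated. The target of 5.0 is arbitrary.
    """
    pair_dists = [math.dist(a, b) for a, b in combinations(coords, 2)]
    mean_dist = sum(pair_dists) / len(pair_dists)
    return abs(mean_dist - 5.0)


def select_best(candidates, score_fn=toy_score):
    """Score every candidate and return (index of the best one, all scores)."""
    scores = [score_fn(c) for c in candidates]
    best_index = min(range(len(scores)), key=scores.__getitem__)
    return best_index, scores


def rotate_z(coords, angle):
    """Rotate a list of (x, y, z) points about the z axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in coords]


# Three made-up candidate "structures", each a list of 3D atom coordinates.
candidates = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (0.0, 5.0, 0.0)],
    [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 10.0, 0.0)],
]

best_index, scores = select_best(candidates)
print(best_index, scores)
```

In this toy setup the ranking depends entirely on the scoring function, which is why a real scorer has to be learned from experimentally determined structures rather than written by hand.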
Although the structures used to train ARES were much smaller than those used to test it, the program was still able to select the best structures from the candidates. “We were surprised that our method, when trained on this data set that consisted entirely of small structures, could make good predictions for large structures as well as for small structures,” Dror says.
“The authors’ clever idea makes the [program] much more easily trainable and accurate than I would have predicted, but at the same time, it depends on how good the initial sampling of possible structures is,” says Adrian E. Roitberg, a computational chemist at the University of Florida. “As always, methods such as this one need to be tested by other groups, but the fact that the authors made everything available, including a web server, means people will be able to push this and help improve it.”