Volume 96 Issue 8 | p. 8 | News of The Week
Issue Date: February 19, 2018 | Web Date: February 15, 2018

Machine learning predicts organic reaction performance

Using data from thousands of reactions, algorithm points chemists to the best reagents to use in an amination reaction
Department: Science & Technology
Keywords: Informatics, machine learning, synthetic methods, Buchwald-Hartwig amination

When chemists develop new types of reactions, they generate a lot of data on what works and how well, along with what doesn’t work at all. Much of the data are never used, says Abigail G. Doyle, a chemistry professor at Princeton University. “We publish only a small fraction and usually only the best results,” she says. Doyle thinks that by using machine learning—in which computer algorithms find patterns in data—it might be possible to use all the data chemists generate to predict the best conditions for a reaction even when the substrate has never been used in that transformation before.

Doyle and Princeton’s Derek T. Ahneman and Jesús G. Estrada, along with Merck & Co.’s Spencer D. Dreher and Shishi Lin, take a step in this direction by using machine learning to predict the yield of a Buchwald-Hartwig amination. Their algorithm allowed for variation in the aryl halide substrate, palladium catalyst ligand, base, and the addition of an isoxazole (Science 2018, DOI: 10.1126/science.aar5169). The chemists added isoxazole to the mix because this motif is popular in druglike molecules but sometimes poisons these reactions. The team hoped to get a better idea of what conditions and specific isoxazole structures were problematic.

Using Merck’s ultra-high-throughput reaction technology, the chemists performed 4,608 reactions and used the data from a portion of those to train a model that would predict the yields of the remaining reactions. After trying several algorithms, the chemists found that the so-called random forest model performed the best.
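To make the train-then-predict idea concrete, here is a minimal sketch of a random forest (bootstrap-resampled regression trees whose predictions are averaged) written with only the Python standard library. It is not the team's code or data: the component labels, the made-up yield rule, the 70/30 split, and all function names are assumptions chosen for illustration. In practice one would more likely reach for a library implementation such as scikit-learn's RandomForestRegressor.

```python
import random
from itertools import product
from statistics import mean

# Hypothetical reaction components; labels and the yield rule are invented
# for illustration and are NOT data from the study.
HALIDES = ["ArCl", "ArBr", "ArI"]
LIGANDS = ["L1", "L2", "L3", "L4"]
BASES = ["B1", "B2", "B3"]
ADDITIVES = ["A1", "A2", "A3", "A4", "A5", "A6"]


def synthetic_yield(halide, ligand, base, additive, rng):
    """Made-up yield (%) so the model has a pattern to learn."""
    y = {"ArCl": 20, "ArBr": 50, "ArI": 70}[halide]
    y += {"L1": 0, "L2": 10, "L3": 20, "L4": 5}[ligand]
    y += {"B1": 0, "B2": 5, "B3": -5}[base]
    if additive in ("A5", "A6"):  # pretend these additives poison the catalyst
        y -= 30
    return max(0.0, min(100.0, y + rng.gauss(0, 3)))


def sse(ys):
    """Sum of squared errors around the mean; lower means a purer group."""
    m = mean(ys)
    return sum((y - m) ** 2 for y in ys)


def build_tree(rows, depth, n_try, rng):
    """Grow one regression tree on rows of (features, yield)."""
    ys = [y for _, y in rows]
    if depth == 0 or len(rows) < 4:
        return mean(ys)
    best = None
    # Only a random subset of features is considered at each split.
    for f in rng.sample(range(len(rows[0][0])), n_try):
        for v in sorted({x[f] for x, _ in rows}):
            left = [y for x, y in rows if x[f] == v]
            right = [y for x, y in rows if x[f] != v]
            if left and right:
                score = sse(left) + sse(right)
                if best is None or score < best[0]:
                    best = (score, f, v)
    if best is None:
        return mean(ys)
    _, f, v = best
    yes = [r for r in rows if r[0][f] == v]
    no = [r for r in rows if r[0][f] != v]
    return {"f": f, "v": v,
            "yes": build_tree(yes, depth - 1, n_try, rng),
            "no": build_tree(no, depth - 1, n_try, rng)}


def predict_tree(tree, x):
    """Walk from the root to a leaf and return the leaf's mean yield."""
    while isinstance(tree, dict):
        tree = tree["yes"] if x[tree["f"]] == tree["v"] else tree["no"]
    return tree


def fit_forest(rows, n_trees=30, depth=8, n_try=2, seed=0):
    """Train each tree on a bootstrap resample of the training rows."""
    rng = random.Random(seed)
    return [build_tree(rng.choices(rows, k=len(rows)), depth, n_try, rng)
            for _ in range(n_trees)]


def predict_forest(forest, x):
    """Average the predictions of all trees."""
    return mean(predict_tree(t, x) for t in forest)


rng = random.Random(42)
data = [(combo, synthetic_yield(*combo, rng))
        for combo in product(HALIDES, LIGANDS, BASES, ADDITIVES)]
rng.shuffle(data)
cut = int(0.7 * len(data))  # arbitrary 70/30 split for this sketch
train, test = data[:cut], data[cut:]

forest = fit_forest(train)
rmse = mean((predict_forest(forest, x) - y) ** 2 for x, y in test) ** 0.5
train_mean = mean(y for _, y in train)
baseline = mean((train_mean - y) ** 2 for _, y in test) ** 0.5
print(f"forest RMSE: {rmse:.1f}  |  predict-the-mean RMSE: {baseline:.1f}")
```

Comparing the forest's error on the held-out reactions with a naive predict-the-mean baseline is one simple way to check that the model has learned something from the training portion.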

This algorithm accurately predicted which isoxazole additives would poison the reaction, even those that weren’t included in the data used to build the model. The results could help chemists pick which ligand and base combination to use to maximize yields for the C–N coupling when a given isoxazole motif is part of their substrate.
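Testing a model on additives it has never seen calls for a different kind of split from a random one: every reaction involving a held-out additive has to be kept out of the training data. The following is a minimal sketch of such a split, assuming hypothetical labels and a hypothetical helper named split_by_additive; it does not reproduce how the team partitioned its data.

```python
import random
from itertools import product

# Hypothetical labels, not the compounds used in the study.
LIGANDS = ["L1", "L2"]
BASES = ["B1", "B2"]
ADDITIVES = ["A1", "A2", "A3", "A4", "A5"]

reactions = [{"ligand": l, "base": b, "additive": a}
             for l, b, a in product(LIGANDS, BASES, ADDITIVES)]


def split_by_additive(reactions, n_held_out, seed=0):
    """Hold out every reaction that uses one of n_held_out additives."""
    rng = random.Random(seed)
    additives = sorted({r["additive"] for r in reactions})
    held_out = set(rng.sample(additives, n_held_out))
    train = [r for r in reactions if r["additive"] not in held_out]
    test = [r for r in reactions if r["additive"] in held_out]
    return train, test, held_out


train, test, held_out = split_by_additive(reactions, n_held_out=2)
print(f"held-out additives: {sorted(held_out)}")
print(f"{len(train)} training reactions, {len(test)} test reactions")
```

A model can only generalize across a split like this if its inputs describe properties of each additive rather than just its label, since the label of an unseen additive carries no information from training.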

“We’re most excited by the idea that you can apply this method to any sort of new problem that you identify in reactivity,” Dreher says, although both he and Doyle say that kind of predictive power is still a long way off.

The team’s use of machine learning “is marvelous and long overdue for the field of homogeneous catalysis and chemical synthesis in general,” says Richmond Sarpong, an expert in organic synthesis at the University of California, Berkeley.

Chemical & Engineering News
ISSN 0009-2347
Copyright © American Chemical Society
Philip Raby (Wed Feb 21 06:33:37 EST 2018)
A small indication of the future of R&D for synthesis chemists. The improvements in discovery efficiency will create "virtual labs" in the future. AI is a game changer in every industry it touches.
