Chemists test computer-planned syntheses for the first time

Can Chematica plan better synthetic routes than people can? Some chemists are skeptical

by Sam Lemonick
March 1, 2018 | APPEARED IN VOLUME 96, ISSUE 10

Chematica's synthetic route (right) to an ATR kinase inhibitor took fewer steps but had a similar yield compared with a published route (left).

Planning efficient synthetic routes can seem like a dark art or feel like a Herculean labor of literature review. Chemists, for the first time, have tested a computer program’s ability to plan complete syntheses without human help, following the proposed routes in the lab (Chem 2018, DOI: 10.1016/j.chempr.2018.02.002).

The idea of a computer planning chemical syntheses isn’t new. Elias J. Corey of Harvard University developed the first version of such a program, called Logic and Heuristics Applied to Synthetic Analysis, in the 1970s, but it never lived up to its promise. Chematica is one of several new contenders that have popped up in the last couple of years. Bartosz Grzybowski at Ulsan National Institute of Science & Technology and the Polish Academy of Sciences worked on the program for 15 years before selling it to MilliporeSigma in May 2017. Another contender, ChemPlanner from John Wiley & Sons, will be integrated into SciFinder-n, a product from CAS, a division of the American Chemical Society.

Grzybowski and his colleagues have programmed Chematica to follow about 50,000 rules of synthesis. On the basis of reactions published in the chemical literature and insights from organic chemists on the team, each rule tells the program what transformations are possible from any given molecule. Chematica’s algorithms navigate this network of options to generate synthetic routes to identified targets, looking for novel, efficient, and selective paths.
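The search described above can be pictured with a toy example. The sketch below is not Chematica's algorithm, which the article does not detail; it assumes a handful of made-up rules and a made-up list of purchasable starting materials (every name in it is hypothetical) and runs a plain breadth-first search backward from the target, so the first route it finds uses the fewest steps.

```python
from collections import deque

# Hypothetical toy rules: each target maps to the precursor sets that could make it.
# A real rule base would encode reaction chemistry, not simple string labels.
RULES = {
    "amide": [("acid", "amine"), ("acyl_chloride", "amine")],
    "acid": [("ester",)],
}
# Hypothetical set of molecules assumed to be available for purchase.
PURCHASABLE = {"amine", "ester", "acyl_chloride"}


def plan_route(target, rules, purchasable):
    """Search backward from the target; return the shortest list of steps, or None."""
    queue = deque([((target,), [])])  # (molecules still to make, steps so far)
    seen = set()
    while queue:
        open_mols, steps = queue.popleft()
        # Keep only the molecules that still have to be synthesized.
        needed = tuple(sorted(m for m in set(open_mols) if m not in purchasable))
        if not needed:
            return steps  # everything left can be bought
        if needed in seen:
            continue
        seen.add(needed)
        mol, rest = needed[0], needed[1:]
        # Apply every rule that can make this molecule.
        for precursors in rules.get(mol, []):
            queue.append((rest + precursors, steps + [(mol, precursors)]))
    return None


route = plan_route("amide", RULES, PURCHASABLE)
print(route)
```

Steps are listed in retrosynthetic order, target first. A practical planner would also need to score routes on yield, cost, and selectivity, which this sketch leaves out.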

To demonstrate Chematica’s skill at synthesis planning, Grzybowski and his collaborators plugged eight targets into the program. MilliporeSigma had chosen six of the targets, all commercially viable molecules with pharmaceutical potential. One of the six had no previously published synthetic route. Grzybowski’s group had picked the seventh, a molecule with several patented syntheses, and Milan Mrksich of Northwestern University, a coauthor of the new paper, had chosen the eighth, a natural product without a published synthesis.

Chematica took about 15 to 20 minutes to plan each synthetic route. It suggested reaction conditions, which the chemists were allowed to adjust to optimize the syntheses. MilliporeSigma chemists carried out four of the syntheses. Graduate students and postdocs in Grzybowski’s and Mrksich’s labs performed the other four as part of a U.S. Department of Defense grant through the Defense Advanced Research Projects Agency to explore whether non-expert chemists could use programs like Chematica to synthesize chemicals.

The chemists successfully followed the program’s planned route to all eight targets. For most of the targets, the Chematica route improved the yield or reduced the number of steps, total time, or cost compared with published routes. For two targets, the chemists performed the first published synthesis of the molecule.

“These encouraging results should serve as a spark for another advancement in organic synthesis,” says K. C. Nicolaou, a synthetic chemist at Rice University, adding that Chematica could increase speed and productivity in chemistry labs, especially if paired with automated synthesis machines.

But other chemists question how much benefit Chematica could provide researchers. Several of the routes identified in the new paper represented only modest improvements in yield, or none at all. For example, the group reports that the Chematica route to an ATR kinase inhibitor had a 22% yield, while the original paper reported a 24% yield. The program did, however, shorten the synthesis from seven steps to four and save almost 20 hours compared with the published route.
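One way to read those figures is to convert each overall yield into an average per-step yield. The short calculation below is only an illustration: it assumes both routes are linear sequences whose overall yield is the product of the step yields, which the article does not state, and the function name is made up for this example.

```python
def mean_step_yield(overall_yield, n_steps):
    """Geometric-mean yield per step, assuming overall yield is the product of step yields."""
    return overall_yield ** (1.0 / n_steps)


published = mean_step_yield(0.24, 7)  # published route: 24% over seven steps
chematica = mean_step_yield(0.22, 4)  # Chematica route: 22% over four steps

print(f"published route: {published:.1%} per step")
print(f"Chematica route: {chematica:.1%} per step")
```

Under that assumption, the published route averages roughly 82% per step and the Chematica route roughly 68%, so the shorter route reaches a similar overall yield with lower-yielding individual steps.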

John Maxwell, vice president of chemistry at Tango Therapeutics, says the paper’s comparisons don’t prove that Chematica plans better routes than human chemists. The chemists whose routes act as benchmarks, he points out, weren’t necessarily optimizing their syntheses for yield or length.

Some chemists wonder how Chematica and MilliporeSigma will handle intellectual property. Richmond Sarpong, a synthetic chemist at the University of California, Berkeley, says researchers may hesitate to use Chematica unless MilliporeSigma is clear about what access the company will have to the molecules that users input or who will own the intellectual property of the routes Chematica generates. Sarah Trice, head of commercial development for cheminformatics technologies at MilliporeSigma, says that molecule searches can be seen only by the user who performs them and that MilliporeSigma will not control the intellectual property on the synthetic paths Chematica proposes.

While Chematica may help chemists save time and money in synthesizing targets, Grzybowski says it won’t replace human ingenuity. “This doesn’t in any way take away from the discovery of new reactions,” he says.

MilliporeSigma has not announced when or how Chematica will become publicly available, but the company has been recruiting top synthetic chemists to test the program.

CORRECTIONS: This story was updated on March 7, 2018, to add Polish Academy of Sciences to Bartosz Grzybowski’s affiliations, to indicate that CAS will include a competitor to Chematica in SciFinder-n, to clarify how Chematica’s 50,000 rules were developed, and to correct the number of tested targets that had previously published synthetic routes.

This story was updated March 2, 2018, to correct the comment from K. C. Nicolaou regarding how Chematica could improve lab productivity.


Dr. Niteen A. Vaidya (March 7, 2018 3:59 PM)
Would like to test these two software products, because here at Viridischem we are looking into synthesis and retrosynthesis analysis using green metrics (toxicological profiling of all reagents/solvents).
nobuyuki ishibe (March 7, 2018 8:03 PM)
It is very interesting to learn that machine learning is being applied to find various routes for synthesizing a target compound. The machine learning approach is worth exploring further for useful organic compounds, since machine learning advances every day. The article did not mention which machine learning process was used, such as Bayesian, evolutionary, or another approach; that information would be interesting.
Thomas Bark (March 8, 2018 4:13 PM)
To Nobuyuki Ishibe: Grzybowski made clear that this is exactly not about machine learning but about human beings using their intellect and knowledge to actually teach the machine the aforementioned 50,000 chemistry rules. It is obvious that, in particular, reasons for failure of a reaction (chemical incompatibility, steric constraints, etc.) cannot be extracted by "machine learning" from a heap of published reactions, as such negative results are not published. Machine learning is probably very inefficient (or not effective at all) at getting computers to do genuinely creative stuff. I guess Deep Blue did not get its chess capabilities from analysing published chess matches.
Tom Swann (March 11, 2018 1:22 PM)
Looks more like a modern example of a rules based Expert System.
Robert Buntrock (March 19, 2018 4:12 PM)
Interesting program. The authors cite additional synthesis planners other than the Corey-Wipke program, which was called LHASA (sp?). To say it never lived up to its promise does not tell the whole story. The program was developed with federal funding (NIH?) and could not be commercialized. After Todd Wipke left Harvard, he was one of the cofounders of Molecular Design (MDL), and the commercialized program was called SECS. This was purchased, maintained, and widely used in the chemical and pharmaceutical industries. These and other programs involved calculated reverse synthetic schemes, where the target molecule was entered and the programs calculated possible synthetic paths based on the existing literature. The possible starting materials and intermediates were called synthons, and the reverse paths were shown with reverse arrows with double stems. Subsequent examples can be found throughout the literature. The success of these programs was limited by the computing power available, which of course has grown immensely along with the synthesis literature.
