

Computational Chemistry

Hey, chatbot, can you synthesize this molecule?

Language models behind ChatGPT could lower research barriers and take us a step closer to automated labs

by Prachi Patel
February 26, 2024 | A version of this story appeared in Volume 102, Issue 6


Credit: Carnegie Mellon University
Using natural language prompts, a new artificial intelligence system called Coscientist designs, plans, and runs experiments on automated equipment such as this liquid handler.

A little over a year ago, the experimental chatbot ChatGPT unleashed a frenzy with its prowess for humanlike conversations and creativity. It quickly became a go-to for helping with homework and writing speeches. Scientists have used it to cowrite scientific papers, grant proposals, and computer code.

OpenAI created ChatGPT using a generative pretrained transformer (GPT), a type of computer algorithm called a large language model (LLM). The LLM that OpenAI based ChatGPT on has been evolving to become even more humanlike. GPT-4, the iteration OpenAI released last March, made a giant leap over GPT-3. And scientists are starting to put the technology’s abilities to use for chemistry and materials research.

In December, Carnegie Mellon University (CMU) chemical scientist and engineer Gabe Gomes and his colleagues reported an artificial intelligence agent dubbed Coscientist that showcases the power of LLMs for chemistry research (Nature 2023, DOI: 10.1038/s41586-023-06792-0). Users can give Coscientist a simple instruction, or ask a question in English, and within minutes the system will, for example, learn a needed reaction, predict necessary procedures, and write code that a laboratory with robotic instruments can use to execute the required experiments.

The CMU team had published a preprint of its study just a month after GPT-4’s release. Then, 1 h after the CMU group posted its preprint, chemical engineer Andrew D. White of the University of Rochester and his colleagues unveiled a similar LLM-based engine called ChemCrow (arXiv 2023, arXiv: 2304.05376). White and Gomes had known of each other’s work and aligned their preprints to post within the same hour.

By transforming how researchers interact with robotic systems using natural language, LLMs could “transform the future of chemistry,” says White, who is now head of science at FutureHouse, a San Francisco–based nonprofit that is building an AI scientist. “These models can be a really powerful way to jump-start self-driving laboratories.”

Self-driving, or automated, laboratories have long been a dream for scientists working in drug and materials discovery. Much of this research is conducted through a painstaking iterative process of designing, executing, and refining experiments. AI-driven robotic labs can carry out these complex tasks without human intervention, speeding up scientific discovery and freeing time for humans to pursue creative, intellectual endeavors.

Communication, however, has been a hurdle. “The current problem with self-driving laboratories is you incorporate all this equipment, all these tools, robotics, databases, and predictive computer models,” White says, “and it becomes an enormous nightmare to get all this stuff to work together.” LLMs could work as both a translator and a conductor, orchestrating all the instruments in an automated laboratory to create harmony.

ChemCrow and Coscientist are based on GPT-4. Like all machine learning models, LLMs are trained on immense datasets to recognize patterns and make predictions. Experts agree that a model’s results are only as good as the datasets it is trained on.

Because GPT-4 is trained on volumes of text and not chemical data, White says, he quickly found that it lacks the ability to answer even basic chemistry questions, such as the number of alcohol groups or aromatic rings in a compound. So the team combined it with 18 types of chemistry tools: general tools for searching the web, the literature, and chemical vendor databases; molecular tools that compare molecules and represent chemical structures as text; and several tools developed by IBM Research that predict chemical reactions and do retrosynthesis, working backward from a target molecule to propose reactions for its creation.
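The tool-augmented pattern described above can be sketched in a few lines: the agent routes a chemistry question to a dedicated tool rather than letting the language model answer on its own. Everything here, including the tool names, the keyword-based routing (which stands in for the LLM's tool-selection step), and the toy molecular check, is an illustrative assumption, not ChemCrow's actual interface.

```python
# Minimal sketch of a tool-augmented agent. All names and logic are
# hypothetical stand-ins for ChemCrow's real LLM-driven tool selection.
from dataclasses import dataclass
from typing import Callable


@dataclass
class Tool:
    name: str
    description: str
    run: Callable[[str], str]


def count_alcohols(smiles: str) -> str:
    # Toy molecular tool: count oxygen atoms in a simple SMILES string
    # as a crude proxy for -OH groups (illustration only).
    return f"{smiles.count('O')} candidate -OH group(s)"


def web_search(query: str) -> str:
    # Stand-in for a real web-search tool; returns a canned snippet.
    return f"search results for: {query}"


TOOLS = [
    Tool("molecule_inspector", "questions about functional groups", count_alcohols),
    Tool("web_search", "general background questions", web_search),
]


def pick_tool(question: str) -> Tool:
    # In ChemCrow, the LLM chooses a tool from its description;
    # a keyword check stands in for that step here.
    if "group" in question.lower() or "ring" in question.lower():
        return TOOLS[0]
    return TOOLS[1]


def answer(question: str, payload: str) -> str:
    tool = pick_tool(question)
    return f"[{tool.name}] {tool.run(payload)}"


print(answer("How many alcohol groups?", "CCO"))
```

The design point is that the model never computes chemistry itself; it only decides which specialist tool to call and with what input.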

Speak easy
Researchers can communicate with laboratory robots in natural language using a new artificial intelligence agent called Coscientist, which harnesses the computer models behind ChatGPT. Coscientist designs, plans, and executes chemistry experiments, improving iteratively until it reaches the desired objective.
A schematic shows a person asking “Can you synthesize molecule A” and Coscientist iteratively going through the steps of literature search, protocol selection, translation into code, and experiment.
Credit: Adapted from Nature

Given the prompt “Plan and execute the synthesis of an insect repellent,” ChemCrow searched the web to learn what an insect repellent is, conducted a literature review to find examples, and converted compound names to structures. It used a retrosynthesis predictor to design a synthesis process, and finally, it sent instructions over the cloud to instruments at IBM’s automated laboratory to make a sample of a known repellent. ChemCrow also synthesized three organocatalysts and, when given data on wavelengths of light absorbed by chromophores, proposed a novel compound with a specific absorption wavelength.

Coscientist, meanwhile, consists of multiple modules. A planning module controls other modules, each of which manages distinct jobs, such as scouring the internet and academic papers, reading operation manuals for robotic equipment, and writing code in the Python programming language to control those robots.
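The planner-plus-modules layout can be sketched as a planning function that sequences specialist modules and assembles an executable step. The module bodies below are invented placeholders that return strings, not the paper's code; a real system would back each one with an LLM call or an instrument API.

```python
# Hypothetical sketch of a planning module delegating to specialist
# modules, in the spirit of Coscientist's architecture described above.
def search_module(goal: str) -> str:
    # Stand-in for scouring the internet and academic papers.
    return f"protocol notes for {goal}"


def docs_module(instrument: str) -> str:
    # Stand-in for reading an instrument's operation manual.
    return f"API summary for {instrument}"


def code_module(protocol: str, api: str) -> str:
    # Coscientist emits Python to drive the robot; here we just
    # assemble a command string representing that final step.
    return f"run({protocol!r}, using={api!r})"


def planner(goal: str, instrument: str = "liquid handler") -> str:
    # The planner controls the other modules and sequences their outputs.
    protocol = search_module(goal)
    api = docs_module(instrument)
    return code_module(protocol, api)


print(planner("Suzuki coupling"))
```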

In testing, Coscientist accurately planned the synthesis procedures for seven molecules, including aspirin and ibuprofen. It designed and executed two types of reactions, Sonogashira and Suzuki-Miyaura cross-coupling reactions, which are often used in drug discovery work to form carbon-carbon bonds. To do that, Coscientist learned the subtle differences between the two types of cross couplings and chose the reagents needed for these reactions on the basis of reactivity rules. It also optimized conditions for other chemical reactions to increase yield.

In August, computer scientists at the University of Toronto released CLAIRify, an interface that translates natural language instructions into a task plan for robots to execute chemistry experiments. And a team from the University of California, Berkeley, trained ChatGPT to scour research papers and summarize synthesis information for making metal-organic frameworks.

The idea behind all these systems is “to accelerate discovery,” says Philippe Schwaller, a professor of chemical sciences and engineering at the Swiss Federal Institute of Technology, Lausanne (EPFL), who worked with White on ChemCrow. “I’m really excited about this human–AI collaboration, where we can have knowledge from the human expert and from the LLM system combined to work together towards a common goal,” Schwaller says.

It is important to remember that ChemCrow and Coscientist serve as assistants, their developers say. These platforms can reduce tedious work by automating chemical synthesis or predicting molecules that fit certain parameters. They are not inventing reactions or designing new compounds. And they are not meant to replace humans.

“People ask, ‘What has it discovered?’ ” Gomes says. “The answer is, ‘Nothing.’ It hasn’t discovered anything yet. In fact, it probably won’t, because it’s a tool. Human scientists controlling it are the ones who will enable those discoveries. This is just a tool, but it’s an awesome tool.”

The LLM-based AI agents are expected to make research more accessible. For example, they could enable specialists across disciplines and in laboratories across the world to work together. Chemists and biologists would not have to learn programming languages to write the code for controlling robotic instruments or pore through instruction manuals for the latest laboratory equipment, White says. Instead, researchers could give a page of documentation or source code to a language model, which would learn how to use that tool and create a natural language interface for the researcher. “Now you can use a hundred tools, and you can still communicate your intent in natural language,” he says.
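The documentation-to-interface idea White describes can be sketched as mapping a plain-English request onto commands pulled from an instrument manual. The manual entries and the verb-matching rule below are invented for illustration; a real system would hand the documentation page to the language model and let it choose the call.

```python
# Hypothetical sketch: a tiny "manual" of documented instrument calls,
# plus a matcher that stands in for the LLM reading that documentation.
MANUAL = {
    "heat": "heater.set_temperature(celsius)",
    "stir": "stirrer.set_speed(rpm)",
    "dispense": "pump.dispense(volume_ul)",
}


def natural_language_to_call(request: str) -> str:
    # Stand-in for the LLM step: match a documented verb in the request
    # to the corresponding call from the manual.
    for verb, call in MANUAL.items():
        if verb in request.lower():
            return call
    return "no documented command found"


print(natural_language_to_call("Please stir the mixture gently"))
```

The researcher never writes robot code; the natural-language request is the interface, and the documentation supplies the vocabulary of legal calls.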

Tiago Rodrigues, a medicinal chemist at the University of Lisbon, agrees that LLM-powered AI labs would level the playing field. “People don’t need to be the top experts in one field. I think it’s going to democratize science in a way.”

Although Coscientist and ChemCrow are huge steps forward, Rodrigues cautions that the platforms are preliminary proofs of concept. LLMs are an emerging technology, he says, and it is too early to know where they fit in the research landscape. “Most research questions are very complex, and they might involve knowledge from disciplines other than chemistry,” he says.

As with other AI advances, this technology brings risks. AI-based chemistry agents such as Coscientist and ChemCrow could, for instance, be used to produce chemical weapons or illicit drugs. Gomes says it is important for technology companies, the physical sciences community, and policymakers to work together to develop guardrails.

The accuracy of these engines is limited by the information available to them. The researchers admit that both systems sometimes generate incorrect and strange responses. The teams are working on training these AI engines with more chemistry tools to improve their accuracy. But they will always need human intervention for ethical and safety reasons, Gomes says.

A big challenge for the community right now is managing expectations about AI-driven labs in general, Rodrigues adds. “There is this hype curve, and we are still going up. Maybe in 1 or 2 years we will go down very sharply, and only then we will understand how we can best use these kinds of tools.”

