

Metal-Organic Frameworks

ChatGPT lab assistant accelerates MOF synthesis

An algorithm analyzes synthesis datasets to provide precise predictions and instructions to lab chemists

by Fernando Gomollón-Bel, special to C&EN
August 11, 2023

Researchers have trained ChatGPT to create a chemistry lab assistant that summarizes information about synthesis from papers with high accuracy (J. Am. Chem. Soc. 2023, DOI: 10.1021/jacs.3c05819). In particular, this program extracts over 26,000 parameters from peer-reviewed articles and supporting information about metal-organic frameworks (MOFs). Once trained, the interactive chatbot is able to answer questions about the preparation of MOFs quickly and accurately.

The image shows a screenshot of a chatbot that explains how to make MOFs through a detailed, step-by-step synthesis.
Credit: J. Am. Chem. Soc.
The chemistry chatbot based on ChatGPT creates an interface for assisted literature searches.

“We’ve always been interested in simplifying and speeding up chemical synthesis,” says Omar Yaghi from the University of California, Berkeley, lead author of the study. The ChatGPT models mined the supporting information of hundreds of MOF papers, where information on synthesis is unstructured and sparse, often extended over hundreds of pages. Thus, “we developed a filtering strategy that excludes the least relevant sections—like references, crystal coordinates, acknowledgments—increasing the efficiency,” adds Yaghi.
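To make the filtering idea concrete, here is a minimal sketch that drops low-relevance sections before any text reaches a language model. It assumes the document has already been split into (heading, body) pairs and uses a simple keyword match on headings. The keyword list and the function name `filter_sections` are illustrative assumptions, not the method described in the paper.

```python
# Hypothetical heading keywords treated as low relevance (an assumption,
# loosely based on the examples Yaghi mentions in the article).
EXCLUDED_HEADINGS = (
    "references",
    "acknowledgments",
    "acknowledgements",
    "crystal coordinates",
)


def filter_sections(sections):
    """Return only the (heading, body) pairs whose heading contains
    none of the excluded keywords (case-insensitive match)."""
    kept = []
    for heading, body in sections:
        normalized = heading.strip().lower()
        if any(keyword in normalized for keyword in EXCLUDED_HEADINGS):
            continue  # skip sections unlikely to hold synthesis details
        kept.append((heading, body))
    return kept


if __name__ == "__main__":
    # Placeholder document; the bodies are dummy text, not real data.
    document = [
        ("Synthesis procedure", "placeholder synthesis text"),
        ("Crystal coordinates", "placeholder coordinate table"),
        ("References", "placeholder reference list"),
    ]
    for heading, _ in filter_sections(document):
        print(heading)
```

A keyword filter like this is deliberately crude. Its only purpose is to show how trimming irrelevant sections shrinks the amount of text that has to be processed per paper.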

Large language models like ChatGPT can be prone to so-called hallucinations: responses that seem correct but aren’t. The team minimized these misleading statements with careful prompt engineering. “It’s a means of training ChatGPT,” says Yaghi. “We ensure the prompt contains information . . . to help improve the response.” This approach instructs the model to acknowledge uncertainty rather than fabricate answers. For example, when asked about a MOF not present in the training database, the program will simply say: “I do not know.” This iterative approach also helps researchers get structured responses, such as bulleted lists and step-by-step syntheses of many MOFs, that reference the correct sources.
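As a rough illustration of this kind of prompt engineering, the sketch below assembles a prompt that supplies source passages, asks for a structured answer, and tells the model to admit when it lacks the information. The wording of the instructions and the helper name `build_prompt` are assumptions made for illustration; the team's actual prompts are not reproduced in this article.

```python
def build_prompt(question, context_passages):
    """Assemble a prompt that grounds the answer in numbered passages
    and asks the model to admit uncertainty instead of guessing."""
    if context_passages:
        # Number each passage so the answer can cite its source.
        context = "\n\n".join(
            f"[{i}] {passage}"
            for i, passage in enumerate(context_passages, start=1)
        )
    else:
        context = "(no relevant passages found)"

    return (
        "Answer the question using only the context below.\n"
        "If the context does not contain the answer, reply exactly: "
        '"I do not know."\n'
        "Format the answer as a bulleted list and cite passages "
        "by their [number].\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
    )


if __name__ == "__main__":
    # Placeholder passage text, not taken from any real paper.
    print(build_prompt(
        "How is this MOF prepared?",
        ["Placeholder passage describing a synthesis."],
    ))
```

The returned string would then be sent to a chat model through whatever client the reader already uses. That step is omitted here so the example does not depend on any particular API.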

This process of curation could have taken a chemist months, but ChatGPT scans and registers synthetic procedures in a fraction of the time, says Yaghi. “It only takes one minute per paper.”

The team envisions researchers applying their publicly available ChatGPT model to other fields of chemistry after training it with the relevant papers and datasets. Eventually, the chatbot could predict the outcome of chemical reactions or propose potential synthetic routes by leveraging its knowledge and understanding of chemical space.


