When chemists want to model the structural and electronic properties of atoms or molecules, they often turn to a computational technique called density functional theory (DFT). When DFT fails, chemists use approaches like coupled cluster (CC) theory or second-order Møller–Plesset perturbation theory (MP2). These generate more reliable values than DFT does, but they require thousands of times as much computational power, even for small molecules.
Thomas F. Miller and colleagues at the California Institute of Technology now demonstrate that machine learning might offer the best of both worlds: results as accurate as those from CC or MP2 at a cost no higher than that of DFT (J. Chem. Theory Comput. 2018, DOI: 10.1021/acs.jctc.8b00636).
The researchers wanted to predict electronic structure correlation energies—a measure of interactions between electrons that helps chemists model how a molecule behaves. Their machine-learning approach predicts these values based on a set of known data.
Miller’s group trained its algorithm on localized molecular orbitals of a set of small molecules. Miller says that because molecular orbitals are agnostic to the underlying bonds and atoms, the new algorithm could predict properties for many different molecules from a small starting set of data.
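To make the idea of learning from orbitals rather than from atoms or bonds concrete, here is a minimal toy sketch. It is not the researchers' method: the article does not specify their regression model or their orbital descriptors, so a simple Gaussian-kernel weighted average stands in for the learner, and every feature vector and energy value is a made-up placeholder. The function names `gaussian_kernel`, `predict_contribution`, and `predict_correlation_energy` are hypothetical.

```python
import math

def gaussian_kernel(x, y, length_scale=1.0):
    """Similarity between two orbital feature vectors (1.0 when identical)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * length_scale ** 2))

def predict_contribution(features, train_features, train_energies, length_scale=1.0):
    """Kernel-weighted average of the training contributions.

    Stand-in learner (an assumption): the source does not say which
    regression model the researchers used.
    """
    weights = [gaussian_kernel(features, t, length_scale) for t in train_features]
    total = sum(weights)
    if total == 0.0:
        # Query is far from all training data: fall back to the plain mean.
        return sum(train_energies) / len(train_energies)
    return sum(w * e for w, e in zip(weights, train_energies)) / total

def predict_correlation_energy(orbital_features, train_features, train_energies,
                               length_scale=1.0):
    """Sum the predicted per-orbital contributions for one molecule."""
    return sum(
        predict_contribution(f, train_features, train_energies, length_scale)
        for f in orbital_features
    )

# Placeholder training data: two "orbitals" with invented features and energies.
train_features = [(0.0, 1.0), (1.0, 0.0)]
train_energies = [-0.02, -0.04]

# A hypothetical new molecule, described only by its orbitals' features.
new_molecule = [(0.0, 1.0), (1.0, 0.0), (0.5, 0.5)]
estimate = predict_correlation_energy(
    new_molecule, train_features, train_energies, length_scale=0.1
)
print(estimate)
```

Nothing in the sketch refers to element types or bonding patterns. The model sees only orbital-level features, which is why, in principle, training data from one molecule can inform a prediction for another.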
In one example, the researchers trained their algorithm on the molecular orbitals of water, then predicted the correlation energies of ammonia, methane, and hydrogen fluoride. For methane, the algorithm’s value was just 0.24% off from the one generated by CC, and that was the least accurate of the three results. The calculation for a cluster of six water molecules took two minutes with machine learning, compared with 28 hours with CC.
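The timing and accuracy figures above can be put in perspective with a little arithmetic. The short sketch below uses hypothetical helper names (`speedup` and `relative_error_percent`). The two run times come from the article, while the energies passed to the error function are arbitrary placeholders chosen only to show how a 0.24% deviation is computed; they are not values from the study.

```python
def speedup(slow_seconds: float, fast_seconds: float) -> float:
    """How many times faster the fast calculation is than the slow one."""
    return slow_seconds / fast_seconds

def relative_error_percent(predicted: float, reference: float) -> float:
    """Absolute deviation from the reference, as a percentage of the reference."""
    return abs(predicted - reference) / abs(reference) * 100.0

# Run times reported in the article for the six-water cluster.
cc_seconds = 28 * 3600   # 28 hours with coupled cluster
ml_seconds = 2 * 60      # 2 minutes with machine learning
print(speedup(cc_seconds, ml_seconds))  # 840.0

# Placeholder energies in arbitrary units, NOT values from the study.
placeholder_reference = -1.0
placeholder_prediction = -0.9976
print(round(relative_error_percent(placeholder_prediction, placeholder_reference), 2))
```

On these numbers, the machine-learning calculation is 840 times as fast as the CC one.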
The team’s system did poorly in predicting values for butane and isobutane after training on methane and ethane. Including propane in the training set led to more accurate results.
Miller emphasizes that these early results are still a long way from a system that anyone could use. Others agree. “I think it’s an excellent idea, and looks promising, but it can be harder than it looks to make it a general-purpose tool,” Kieron Burke, a computational chemist at the University of California, Irvine, says.