Tuesday, April 24, 2012

Fast and Accurate Modeling of Molecular Atomization Energies with Machine Learning

M. Rupp, A. Tkatchenko, K.-R. Müller, and O. A. von Lilienfeld, Phys. Rev. Lett. 108, 058301 (2012)

In this study, the authors use regression algorithms from machine learning to build a model that predicts the atomization energies of a large class of organic molecules. Trained on hybrid DFT data, the model reaches a prediction error of ~10 kcal/mol.

The power of this technique lies not in improving upon existing DFT approximations, but rather in enabling extremely fast predictions for a particular large class of molecules at near-DFT accuracy. Applications include chemical design, where one attempts to solve the inverse electronic structure problem (i.e., which molecule gives the desired property?), as well as molecular dynamics and the study of chemical reactions.

We thought that representing a molecule by its Coulomb matrix was an elegant and natural way to satisfy the requirements of invariance under symmetry operations. It would be interesting to know how sensitive the results are to the authors' choice of the diagonal of the Coulomb matrix, as well as to the type of kernel used in the kernel ridge regression.
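
To make these two choices concrete, here is a minimal sketch in Python with NumPy. It is not the authors' code. It assumes the Coulomb matrix with diagonal 0.5 Z_i^2.4 and off-diagonal Z_i Z_j / |R_i - R_j| (atomic units), a descriptor made of the eigenvalues ordered by decreasing absolute value and padded with zeros, and a Gaussian kernel. The function names, the parameters diag_prefactor, diag_exponent, sigma, and lam, and the water-like geometry are our own illustrative choices.

```python
import numpy as np

def coulomb_matrix(Z, R, diag_prefactor=0.5, diag_exponent=2.4):
    """M_ii = diag_prefactor * Z_i**diag_exponent, M_ij = Z_i * Z_j / |R_i - R_j|.

    Z: nuclear charges, R: Cartesian coordinates (assumed to be in bohr).
    The diagonal parameters are exposed so their influence can be tested.
    """
    Z = np.asarray(Z, dtype=float)
    R = np.asarray(R, dtype=float)
    dist = np.linalg.norm(R[:, None, :] - R[None, :, :], axis=-1)
    np.fill_diagonal(dist, 1.0)  # placeholder, avoids division by zero
    M = np.outer(Z, Z) / dist
    np.fill_diagonal(M, diag_prefactor * Z ** diag_exponent)
    return M

def descriptor(Z, R, size, **diag_kwargs):
    """Eigenvalues ordered by decreasing absolute value, zero-padded to `size`."""
    eig = np.linalg.eigvalsh(coulomb_matrix(Z, R, **diag_kwargs))
    eig = eig[np.argsort(-np.abs(eig))]
    out = np.zeros(size)
    out[:len(eig)] = eig
    return out

def gaussian_kernel(A, B, sigma):
    """k(a, b) = exp(-|a - b|^2 / (2 sigma^2)) for all rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def krr_fit(X, y, sigma, lam):
    """Kernel ridge regression weights: solve (K + lam * I) alpha = y."""
    K = gaussian_kernel(X, X, sigma)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)

def krr_predict(X_train, alpha, X_new, sigma):
    """Prediction is a weighted sum of kernels centered on the training points."""
    return gaussian_kernel(X_new, X_train, sigma) @ alpha

# Illustrative water-like geometry (made-up coordinates, in bohr).
Z_demo = [8, 1, 1]
R_demo = np.array([[0.0, 0.0, 0.0], [1.8, 0.0, 0.0], [-0.45, 1.74, 0.0]])
x_default = descriptor(Z_demo, R_demo, size=5)
x_altered = descriptor(Z_demo, R_demo, size=5, diag_exponent=2.0)
```

Comparing x_default with x_altered shows how a different diagonal shifts the descriptor. Swapping gaussian_kernel for another kernel, for instance a Laplacian one, and repeating the cross-validation would be one way to probe the sensitivity we were wondering about.
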
(Summary prepared by John Snyder).