Sunday, September 29, 2019

Boltzmann generators: Sampling equilibrium states of many-body systems with deep learning

Frank Noé, Simon Olsson, Jonas Köhler, Hao Wu (2019)
Highlighted by Jan Jensen

Figure 1A from the paper (Copyright © 2019 The Authors, some rights reserved)

The paper presents a novel method to predict free energy differences much more efficiently.  Currently, this is usually done by MD simulations and observing how often each state is visited along the trajectory. However, transitions between each state are rare, which means very long and costly trajectories, even when using various tricks to force the transitions.

The solution Noé et al. present is to "train a deep invertible neural network to learn a coordinate transformation from x to a so-called “latent” representation z, in which the low-energy configurations of different states are close to each other and can be easily sampled."

For example, the user supplies a few examples of each state [px(x)] and trains the NN to find a new set of variables (z) with a much simpler probability distribution [pz(z)], by minimising the difference using the Kullback-Leibler divergence as a loss function.

Since the NN is invertible, one can now sample repeatedly from pz(z) to get a more accurate px(x), which can then be reweighed to give the Boltzmann distribution. Since pz(z) is a simple Gaussian most of the sampled structure will have a high Boltzmann probability, so you don't have to sample that many structures.

The technical main advance is the use of an invertible NN, that allow you to go both from x to z and z to x. This is done by using a NN architecture where only simple mathematically operations (addition and multiplication) that can be reversed (subtraction and division) are allowed.

It would be very interesting to see if a similar approach can be used for inverse design.

This work is licensed under a Creative Commons Attribution 4.0 International License.

Wednesday, September 25, 2019

Deflate to Understand Complex Molecular Kinetics

Contributed by Jesper Madsen

Dimensionality reduction is at the core of understanding and making intuitive sense of complex dynamic phenomena in chemistry.  It is usually assumed that the slowest mode is the one of primary interest; however, it is critical to realize that this is not always so! A conceptual example hereof is a protein folding simulation (Lindorff-Larsen et al. Science 334, 517-520, 2011) where the slowest dynamical mode is not the folding itself (see Figure). What is the influence, then, of “non-slowest” modes in this process and how can it most appropriately be elucidated?

FIG: Figure 2 from the preprint: "(A) Sampled villin structures from the MD trajectory analyzed. Helical secondary structure is colored and coils are white. Each image represents five structures sampled from similar locations in TIC space as determined by a 250-center k-means model built upon the first three original TICs. The purple structure represents the folded state, and the blue structure represents the denatured state. The green structure is a rare helical misfolded state that we assert is an artifact. (B) Two-dimensional histograms for TICA transformations constructed from villin contact distances. Dashed lines indicate the regions corresponding to the sampled structures of the same color. The first TIC tracks the conversion to and from the rare artifact only. The second TIC tracks the majority of the folding process and correlates well with RMSD to the folded structure."

This work by Husic and Noé show how deflation can provide an answer to these questions. Technically speaking deflation is a collection of methods for how to modify a matrix after the largest eigenvalue is known in order to find the rest. In their provided example of the folding simulation, the dominant Time-lagged Independent Component (TIC) encapsulates the "artifact" variation that we are not really interested in. Thus, a constructed kinetic (Markov-state) model will be contaminated in several undesirable ways as discussed by the authors in great detail.  

In principle, this should be a very common problem since chemical systems have complex Hamiltonians. Perhaps the reason why we don’t see it discussed more is that ultra-rare events – real or artifact – may not usually be sampled during conventional simulations. So, with the increasing computational power available to us, and simulations approaching ever-longer timescales, this is likely something that we need to be able to handle. This preprint describes well how one can think about attacking these potential difficulties.   

Saturday, August 31, 2019

Physical machine learning outperforms "human learning" in Quantum Chemistry

Highlighted by Jan Jensen

Figure 1 from the paper

The paper presents a method to estimate DFT or CCSD(T) energies (computed using large basis sets) based only on HF/cc-pVDZ densities and energies. In order to avoid overfitting, the method must also estimate the corresponding DFT or CCSD densities. The method is trained and validated on a subset of the QM9 data set (i.e. on relative small molecules). It is first trained on DFT data (using 89K molecules) and then retrained on CC data for a smaller subset (3.6K molecules), both being subsets of the QM9 data set (i.e. relatively small molecules). The input density is evaluated in a 3D grid that is big enough to accommodate the largest molecule in the data set, so a new model would have to be trained for significantly larger molecules.

This “physical machine learning in Quantum Chemistry” (PML-QC) model reaches a mean absolute error in energies of molecules with up to eight non-hydrogen atoms as low as 0.9 kcal/mol relative to CCSD(T) values, which is quite impressive. In fact the authors speculate that 
With ML, it may become not required that an accurate quantum chemical method works fast enough for every new molecule that an end user may be interested in. Instead, the focus shifts to generating highly accurate results only for a finite dataset to be used for training, while the efficiency in practical applications is to be achieved via improvements in DNNs to make them faster and more accurate.
Much will depend how much the number of outliers can be reduced. For example, for PML-QCDFT ca 5% of molecules have errors greater than 2.6 kcal/mol and this effect can be magnified for relative energies if the individual errors have opposite sign.

Wednesday, July 31, 2019

Popular Integration Grids Can Result in Large Errors in DFT-Computed Free Energies

Highlighted by Jan Jensen

 Figure 1B from the paper (CC BY-NC-ND 4.0)

 This paper has already been highlighted here and here, so I'll just briefly summarise.

The grid used for the numerical integration in DFT calculations is defined relative to the Cartesian axes, so rotating the molecule will change the integration grid and, hence, the energy. This has been known for some time and, f.eks. Gaussian09 uses a default grid size (Fine, 75,302) where the effect on the electronic energy variation is usually negligible.

Bootsma and Wheeler show that the vibrational entropy and, hence, the free energy is significantly more sensitive to grid size than the electronic energy. Using the Fine grid, the differences in relative free energy changes can be as large as 4 kcal/mol, which could significantly change conclusion regarding mechanisms, etc. The effect comes from the variation in low frequency vibrational modes and the effect can be reduced a little by scaling these frequencies. 

However, the errors really only become acceptable when using the UltraFine grid size, which is the default in Gaussian16, especially combined with frequency scaling (which one should do anyway to get consistent results). If you are using Gaussian09 or some other quantum program to compute relative free energies it is definitely a good idea to look at the default grid size and perform some tests.

Note that if you want to perform such tests yourself, you need to re-optimise the molecule after you rotate it because the gradient is also affected by the rotation.

Thursday, July 4, 2019

Combining the Power of J Coupling and DP4 Analysis on Stereochemical Assignments: The J-DP4 Methods

Grimblat, N.; Gavín, J. A.; Hernández Daranas, A.; Sarotti, A. M., Org. Letters 2019, 21, 4003-4007
Contributed by Steven Bachrach
Reposted from Computational Organic Chemistry with permission

I have written quite a number of posts on using quantum mechanics computations to predict NMR spectra that can aid in identifying chemical structure. Perhaps the most robust technique is Goodman’s DP4 method (post), which has seen some recent revisions (updated DP4DP4+). I have also posted on the use of computed coupling constants (posts).

Grimblat, Gavín, Daranas and Sarotti have now combined these two approaches, using computed 1H and 13C chemical shifts and 3JHH coupling constants with the DP4 framework to predict chemical structure.1

They describe two different approaches to incorporate coupling constants:
  • dJ-DP4 (direct method) incorporates the coupling constants into a new probability function, using the coupling constants in an analogous way as chemical shifts. This requires explicit computation of all chemical shifts and 3JHH coupling constants for all low-energy conformations.
  • iJ-DP4 (indirect method) uses the experimental coupling constants to set conformational constraints thereby reducing the number of total conformations that need be sampled. Thus, large values of the coupling constant (3JHH > 8 Hz) selects conformations with coplanar hydrogens, while small values (3JHH < 4 Hz) selects conformations with perpendicular hydrogens. Other values are ignored. Typically, only one or two coupling constants are used to select the viable conformations.

The authors test these two variants on 69 molecules. The original DP4 method predicted the correct stereoisomer for 75% of the examples, while dJ-DP4 correct identifies 96% of the cases. As a test of the indirect method, they examined marilzabicycloallenes A and B (1 and 2). DP4 predicts the correct stereoisomer with only 3.1% (1) or <0.1% (2) probability. dJ-DP4 predicts the correct isomer for 1 with 99.9% probability and 97.6% probability for 2. The advantage of iJ-DP4 is that using one coupling constant reduces the number of conformations that must be computed by 84%, yet maintains a probability of getting the correct assignment at 99.2% or better. Using two coupling constants to constrain conformations means that only 7% of all of the conformations need to be samples, and the predictive power is maintained.


Both of these new methods clearly deserve further application.


1. Grimblat, N.; Gavín, J. A.; Hernández Daranas, A.; Sarotti, A. M., “Combining the Power of J Coupling and DP4 Analysis on Stereochemical Assignments: The J-DP4 Methods.” Org. Letters 201921, 4003-4007, DOI: 10.1021/acs.orglett.9b01193.


1: InChI=1S/C15H21Br2ClO4/c1-8-15(20)14-6-10(17)12(19)7-11(18)13(22-14)5-9(21-8)3-2-4-16/h3-4,8-15,19-20H,5-7H2,1H3/t2-,8-,9+,10-,11+,12+,13+,14+,15-/m0/s1
2: InChI=1S/C15H21Br2ClO4/c1-8-15(20)14-6-10(17)12(19)7-11(18)13(22-14)5-9(21-8)3-2-4-16/h3-4,8-15,19-20H,5-7H2,1H3/t2-,8-,9-,10-,11+,12+,13+,14+,15-/m0/s1

This work is licensed under a Creative Commons Attribution-NoDerivs 3.0 Unported License.

Wednesday, June 26, 2019

The logic of translating chemical knowledge into machine-processable forms: A modern playground for physical-organic chemistry

Karol Molga, Ewa P. Gajewska, Sara Szymkuć, and Bartosz A. Grzybowski (2019)
Highlighted by Jan Jensen
Figure 11 from the paper (c) RSC

This paper offers a, to me, fascinating "look behind the scenes" of Chematica. At the core this program has 75,000 handcrafted reaction rules (SMARTS and Reaction SMARTS strings as shown in the above figure) extracted from the literature (which took over a decade). The authors estimate that there ca 3000-5000 new reaction classes/types appearing in the literature each years and "that there are on the order of 100,000 distinct reaction classes constituting the body of modern organic chemistry. So their work is almost done :).

The paper does a really excellent job of outlining the challenges involved in constructing these rules and present several cases where the rules must be augmented by ML, MM, and Hückel calculations in order to take non-local structural (e.g. strain and steric hindrance) and electronic effects (e.g. on regioselectivity) into account. Such calculations must be done on the millisecond time scale as many thousand intermediates must be inspected during a retrosynthetic search. At the same time they must be very accurate as inaccuracies accumulate with each step on the retrosynthetic path.

It will be very interesting to see if purely ML-based alternatives can beat this approach!

This work is licensed under a Creative Commons Attribution 4.0 International License.

Wednesday, June 12, 2019

Vibrational Signatures of Chirality Recognition Between α-Pinene and Alcohols for Theory Benchmarking

Medel, R.; Stelbrink, C.; Suhm, M. A., Angew. Chem. Int. Ed. 2019, 58, 8177
Contributed by Steven Bachrach
Reposted from Computational Organic Chemistry with permission

Can vibrational spectroscopy be used to identify stereoisomers? Medel, Stelbrink, and Suhm have examined the vibrational spectra of (+)- and (-)-α-pinene, (±)-1, in the presence of four different chiral terpenes 2-5.1 They recorded gas phase spectra by thermal expansion of a chiral α-pinene with each chiral terpene.

For the complex of 4 with (+)-1 or (-)-1 and 5 with (+)-1 or (-)-1, the OH vibrational frequency is identical for the two different stereoisomers. However, the OH vibrational frequencies differ by 2 cm-1 with 3, and the complex of 3/(+)-1 displays two different OH stretches that differ by 11 cm-1. And in the case of the complex of α-pinene with 2, the OH vibrational frequencies of the two different stereoisomers differ by 11 cm-1!

The B3LYP-D3(BJ)/def2-TZVP optimized geometry of the 2/(+)-1 and 2/(-)-1 complexes are shown in Figure 2, and some subtle differences in sterics and dispersion give rise to the different vibrational frequencies.


Figure 2. B3LYP-D3(BJ)/def2-TZVP optimized geometry of the 2/(+)-1 and 2/(-)-1

Of interest to readers of this blog will be the DFT study of these complexes. The authors used three different well-known methods – B3LYP-D3(BJ)/def2-TZVP, M06-2x/def2-TZVP, and ωB97X-D/def2-TZVP – to compute structures and (most importantly) predict the vibrational frequencies. Interestingly, M06-2x/def2-TZVP and ωB97X-D/ def2-TZVP both failed to predict the vibrational frequency difference between the complexes with the two stereoisomers of α-pinene. However, B3LYP-D3(BJ)/def2-TZVP performed extremely well, with a mean average error (MAE) of only 1.9 cm-1 for the four different terpenes. Using this functional and the larger may-cc-pvtz basis set reduced the MAE to 1.5 cm-1 with the largest error of only 2.5 cm-1.

As the authors note, these complexes provide some fertile ground for further experimental and computational study and benchmarking.


1. Medel, R.; Stelbrink, C.; Suhm, M. A., “Vibrational Signatures of Chirality Recognition Between α-Pinene and Alcohols for Theory Benchmarking.” Angew. Chem. Int. Ed. 201958, 8177-8181, DOI: 10.1002/anie.201901687.


(-)-1, (-)-α-pinene: InChI=1S/C10H16/c1-7-4-5-8-6-9(7)10(8,2)3/h4,8-9H,5-6H2,1-3H3/t8-,9-/m0/s1
(+)-1, (-)-α-pinene: InChI=1S/C10H16/c1-7-4-5-8-6-9(7)10(8,2)3/h4,8-9H,5-6H2,1-3H3/t8-,9-/m1/s1
2, (-)borneol: InChI=1S/C10H18O/c1-9(2)7-4-5-10(9,3)8(11)6-7/h7-8,11H,4-6H2,1-3H3/t7-,8+,10+/m0/s1
3, (+)-fenchol: InChI=1S/C10H18O/c1-9(2)7-4-5-10(3,6-7)8(9)11/h7-8,11H,4-6H2,1-3H3/t7-,8-,10+/m0/s1
4, (-1)-isopinocampheol: InChI=1S/C10H18O/c1-6-8-4-7(5-9(6)11)10(8,2)3/h6-9,11H,4-5H2,1-3H3/t6-,7+,8-,9-/m1/s1
5, (1S)-1-phenylethanol: InChI=1S/C8H10O/c1-7(9)8-5-3-2-4-6-8/h2-7,9H,1H3/t7-/m0/s1

This work is licensed under a Creative Commons Attribution-NoDerivs 3.0 Unported License.