Wednesday, July 31, 2019

Popular Integration Grids Can Result in Large Errors in DFT-Computed Free Energies

Highlighted by Jan Jensen

 Figure 1B from the paper (CC BY-NC-ND 4.0)

 This paper has already been highlighted here and here, so I'll just briefly summarise.

The grid used for the numerical integration in DFT calculations is defined relative to the Cartesian axes, so rotating the molecule will change the integration grid and, hence, the energy. This has been known for some time and, e.g., Gaussian09 uses a default grid size (Fine, 75,302) for which the variation in the electronic energy is usually negligible.

Bootsma and Wheeler show that the vibrational entropy and, hence, the free energy is significantly more sensitive to grid size than the electronic energy. Using the Fine grid, the differences in relative free energies can be as large as 4 kcal/mol, which could significantly change conclusions regarding mechanisms, etc. The effect comes from the variation in low-frequency vibrational modes and can be reduced a little by scaling these frequencies.
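
To get a feel for why low-frequency modes matter so much, here is a minimal sketch of the standard harmonic-oscillator (RRHO) entropy of a single mode. The function name and the 20 vs 30 cm-1 example are my own illustration, not numbers from the paper.

```python
import math

R_KCAL = 1.987204e-3   # gas constant in kcal/(mol K)
HC_OVER_K = 1.438777   # hc/k_B in cm K


def vib_entropy_term(wavenumber_cm, temperature=298.15):
    """Return -T*S_vib in kcal/mol for one harmonic mode."""
    x = HC_OVER_K * wavenumber_cm / temperature  # h*nu / (k_B*T)
    s_over_r = x / math.expm1(x) - math.log1p(-math.exp(-x))
    return -temperature * R_KCAL * s_over_r


# Hypothetical case: the same soft mode comes out at 20 or 30 cm-1
# depending on the orientation of the molecule in the grid.
shift = vib_entropy_term(30.0) - vib_entropy_term(20.0)
print(f"Change in -TS from one mode: {shift:.2f} kcal/mol")
```

A shift of only 10 cm-1 in one soft mode already changes the free energy by a few tenths of a kcal/mol, and a flexible molecule can have several such modes.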

However, the errors really only become acceptable when using the UltraFine grid size, which is the default in Gaussian16, especially combined with frequency scaling (which one should do anyway to get consistent results). If you are using Gaussian09 or some other quantum program to compute relative free energies it is definitely a good idea to look at the default grid size and perform some tests.

Note that if you want to perform such tests yourself, you need to re-optimise the molecule after you rotate it because the gradient is also affected by the rotation.
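
A rotated starting geometry for such a test can be generated with a few lines of Python. This is just a sketch using Rodrigues' rotation formula; the water coordinates, the axis and the angle are arbitrary illustrative choices, and the rotated structure still has to be re-optimised in your quantum chemistry program.

```python
import math


def rotate(coords, axis, angle_deg):
    """Rotate Cartesian coordinates about an axis through the origin."""
    norm = math.sqrt(sum(a * a for a in axis))
    kx, ky, kz = (a / norm for a in axis)
    theta = math.radians(angle_deg)
    c, s = math.cos(theta), math.sin(theta)
    rotated = []
    for x, y, z in coords:
        dot = kx * x + ky * y + kz * z
        cross = (ky * z - kz * y, kz * x - kx * z, kx * y - ky * x)
        # Rodrigues: v*cos + (k x v)*sin + k*(k.v)*(1 - cos)
        rotated.append(tuple(
            v * c + cr * s + k * dot * (1.0 - c)
            for v, cr, k in zip((x, y, z), cross, (kx, ky, kz))
        ))
    return rotated


# Approximate water geometry in Angstrom (illustrative values only)
symbols = ["O", "H", "H"]
coords = [(0.000, 0.000, 0.117), (0.000, 0.757, -0.469), (0.000, -0.757, -0.469)]

rotated = rotate(coords, axis=(1.0, 1.0, 0.0), angle_deg=37.0)
for sym, (x, y, z) in zip(symbols, rotated):
    print(f"{sym:2s} {x:12.6f} {y:12.6f} {z:12.6f}")
```

Running the optimisation and frequency calculation for a handful of such orientations and comparing the free energies gives a direct estimate of the grid error.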

Thursday, July 4, 2019

Combining the Power of J Coupling and DP4 Analysis on Stereochemical Assignments: The J-DP4 Methods

Grimblat, N.; Gavín, J. A.; Hernández Daranas, A.; Sarotti, A. M., Org. Letters 2019, 21, 4003-4007
Contributed by Steven Bachrach
Reposted from Computational Organic Chemistry with permission

I have written quite a number of posts on using quantum mechanics computations to predict NMR spectra that can aid in identifying chemical structure. Perhaps the most robust technique is Goodman’s DP4 method (post), which has seen some recent revisions (updated DP4, DP4+). I have also posted on the use of computed coupling constants (posts).

Grimblat, Gavín, Daranas and Sarotti have now combined these two approaches, using computed 1H and 13C chemical shifts and 3JHH coupling constants with the DP4 framework to predict chemical structure.1

They describe two different approaches to incorporate coupling constants:
  • dJ-DP4 (direct method) incorporates the coupling constants into a new probability function, using the coupling constants in an analogous way as chemical shifts. This requires explicit computation of all chemical shifts and 3JHH coupling constants for all low-energy conformations.
  • iJ-DP4 (indirect method) uses the experimental coupling constants to set conformational constraints, thereby reducing the total number of conformations that need be sampled. Thus, large values of the coupling constant (3JHH > 8 Hz) select conformations with coplanar hydrogens, while small values (3JHH < 4 Hz) select conformations with perpendicular hydrogens. Other values are ignored. Typically, only one or two coupling constants are used to select the viable conformations.
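
The indirect selection rule can be sketched as a simple filter on the H-C-C-H dihedral angle. The 8 Hz and 4 Hz cutoffs are the ones quoted above; the 30 degree tolerance, the function name and the conformer data are my own assumptions for illustration and are not taken from the paper.

```python
def allowed_by_coupling(dihedral_deg, j_exp_hz, tol_deg=30.0):
    """Check whether an H-C-C-H dihedral is compatible with an experimental 3JHH."""
    phi = abs(dihedral_deg) % 360.0
    phi = min(phi, 360.0 - phi)        # fold the angle into 0..180 degrees
    if j_exp_hz > 8.0:                 # large J: coplanar hydrogens
        return phi <= tol_deg or phi >= 180.0 - tol_deg
    if j_exp_hz < 4.0:                 # small J: perpendicular hydrogens
        return abs(phi - 90.0) <= tol_deg
    return True                        # intermediate J: no constraint applied


# Hypothetical conformers with their H-C-C-H dihedral angles in degrees
conformers = {"conf_a": 175.0, "conf_b": -62.0, "conf_c": 88.0}
kept = [name for name, phi in conformers.items()
        if allowed_by_coupling(phi, j_exp_hz=9.5)]
print(kept)
```

Only the conformers that survive the filter would then go on to the expensive NMR calculations.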

The authors test these two variants on 69 molecules. The original DP4 method predicted the correct stereoisomer for 75% of the examples, while dJ-DP4 correctly identifies 96% of the cases. As a test of the indirect method, they examined marilzabicycloallenes A and B (1 and 2). DP4 predicts the correct stereoisomer with only 3.1% (1) or <0.1% (2) probability. dJ-DP4 predicts the correct isomer for 1 with 99.9% probability and 97.6% probability for 2. The advantage of iJ-DP4 is that using one coupling constant reduces the number of conformations that must be computed by 84%, yet maintains a probability of getting the correct assignment at 99.2% or better. Using two coupling constants to constrain conformations means that only 7% of all of the conformations need to be sampled, and the predictive power is maintained.

1

2
Both of these new methods clearly deserve further application.


References

1. Grimblat, N.; Gavín, J. A.; Hernández Daranas, A.; Sarotti, A. M., “Combining the Power of J Coupling and DP4 Analysis on Stereochemical Assignments: The J-DP4 Methods.” Org. Letters 2019, 21, 4003-4007, DOI: 10.1021/acs.orglett.9b01193.


InChIs

1: InChI=1S/C15H21Br2ClO4/c1-8-15(20)14-6-10(17)12(19)7-11(18)13(22-14)5-9(21-8)3-2-4-16/h3-4,8-15,19-20H,5-7H2,1H3/t2-,8-,9+,10-,11+,12+,13+,14+,15-/m0/s1
InChIKey=APNVVMOUATXTFG-NTSAAJDMSA-N
2: InChI=1S/C15H21Br2ClO4/c1-8-15(20)14-6-10(17)12(19)7-11(18)13(22-14)5-9(21-8)3-2-4-16/h3-4,8-15,19-20H,5-7H2,1H3/t2-,8-,9-,10-,11+,12+,13+,14+,15-/m0/s1
InChIKey=APNVVMOUATXTFG-SSBNIETDSA-N



This work is licensed under a Creative Commons Attribution-NoDerivs 3.0 Unported License.

Wednesday, June 26, 2019

The logic of translating chemical knowledge into machine-processable forms: A modern playground for physical-organic chemistry

Karol Molga, Ewa P. Gajewska, Sara Szymkuć, and Bartosz A. Grzybowski (2019)
Highlighted by Jan Jensen
Figure 11 from the paper (c) RSC

This paper offers a, to me, fascinating "look behind the scenes" of Chematica. At its core this program has 75,000 handcrafted reaction rules (SMARTS and Reaction SMARTS strings as shown in the above figure) extracted from the literature (which took over a decade). The authors estimate that there are ca. 3000-5000 new reaction classes/types appearing in the literature each year and "that there are on the order of 100,000 distinct reaction classes constituting the body of modern organic chemistry." So their work is almost done :).

The paper does a really excellent job of outlining the challenges involved in constructing these rules and presents several cases where the rules must be augmented by ML, MM, and Hückel calculations in order to take non-local structural (e.g. strain and steric hindrance) and electronic effects (e.g. on regioselectivity) into account. Such calculations must be done on the millisecond time scale as many thousands of intermediates must be inspected during a retrosynthetic search. At the same time they must be very accurate as inaccuracies accumulate with each step on the retrosynthetic path.
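
A back-of-the-envelope way to see how quickly inaccuracies add up: if each step is judged correctly with some probability and the steps are treated as independent (my simplifying assumption, not a model from the paper), the reliability of the whole path is just the product. The 95% figure is a made-up example.

```python
def path_reliability(per_step_accuracy, n_steps):
    """Probability that every step on a path is judged correctly,
    assuming independent steps with the same accuracy."""
    return per_step_accuracy ** n_steps


for n in (1, 5, 10):
    print(n, round(path_reliability(0.95, n), 2))
```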

It will be very interesting to see if purely ML-based alternatives can beat this approach!


This work is licensed under a Creative Commons Attribution 4.0 International License.

Wednesday, June 12, 2019

Vibrational Signatures of Chirality Recognition Between α-Pinene and Alcohols for Theory Benchmarking

Medel, R.; Stelbrink, C.; Suhm, M. A., Angew. Chem. Int. Ed. 2019, 58, 8177
Contributed by Steven Bachrach
Reposted from Computational Organic Chemistry with permission

Can vibrational spectroscopy be used to identify stereoisomers? Medel, Stelbrink, and Suhm have examined the vibrational spectra of (+)- and (-)-α-pinene, (±)-1, in the presence of four different chiral terpenes 2-5.1 They recorded gas phase spectra by thermal expansion of a chiral α-pinene with each chiral terpene.


For the complex of 4 with (+)-1 or (-)-1 and 5 with (+)-1 or (-)-1, the OH vibrational frequency is identical for the two different stereoisomers. However, the OH vibrational frequencies differ by 2 cm-1 with 3, and the complex of 3/(+)-1 displays two different OH stretches that differ by 11 cm-1. And in the case of the complex of α-pinene with 2, the OH vibrational frequencies of the two different stereoisomers differ by 11 cm-1!

The B3LYP-D3(BJ)/def2-TZVP optimized geometries of the 2/(+)-1 and 2/(-)-1 complexes are shown in Figure 2, and some subtle differences in sterics and dispersion give rise to the different vibrational frequencies.

2/(+)-1

2/(-)-1
Figure 2. B3LYP-D3(BJ)/def2-TZVP optimized geometries of the 2/(+)-1 and 2/(-)-1 complexes.

Of interest to readers of this blog will be the DFT study of these complexes. The authors used three different well-known methods – B3LYP-D3(BJ)/def2-TZVP, M06-2X/def2-TZVP, and ωB97X-D/def2-TZVP – to compute structures and (most importantly) predict the vibrational frequencies. Interestingly, M06-2X/def2-TZVP and ωB97X-D/def2-TZVP both failed to predict the vibrational frequency difference between the complexes with the two stereoisomers of α-pinene. However, B3LYP-D3(BJ)/def2-TZVP performed extremely well, with a mean absolute error (MAE) of only 1.9 cm-1 for the four different terpenes. Using this functional and the larger may-cc-pVTZ basis set reduced the MAE to 1.5 cm-1 with the largest error of only 2.5 cm-1.

As the authors note, these complexes provide some fertile ground for further experimental and computational study and benchmarking.


Reference

1. Medel, R.; Stelbrink, C.; Suhm, M. A., “Vibrational Signatures of Chirality Recognition Between α-Pinene and Alcohols for Theory Benchmarking.” Angew. Chem. Int. Ed. 2019, 58, 8177-8181, DOI: 10.1002/anie.201901687.


InChIs

(-)-1, (-)-α-pinene: InChI=1S/C10H16/c1-7-4-5-8-6-9(7)10(8,2)3/h4,8-9H,5-6H2,1-3H3/t8-,9-/m0/s1
InChIKey=GRWFGVWFFZKLTI-IUCAKERBSA-N
(+)-1, (+)-α-pinene: InChI=1S/C10H16/c1-7-4-5-8-6-9(7)10(8,2)3/h4,8-9H,5-6H2,1-3H3/t8-,9-/m1/s1
InChIKey=GRWFGVWFFZKLTI-RKDXNWHRSA-N
2, (-)-borneol: InChI=1S/C10H18O/c1-9(2)7-4-5-10(9,3)8(11)6-7/h7-8,11H,4-6H2,1-3H3/t7-,8+,10+/m0/s1
InChIKey=DTGKSKDOIYIVQL-QXFUBDJGSA-N
3, (+)-fenchol: InChI=1S/C10H18O/c1-9(2)7-4-5-10(3,6-7)8(9)11/h7-8,11H,4-6H2,1-3H3/t7-,8-,10+/m0/s1
InChIKey=IAIHUHQCLTYTSF-OYNCUSHFSA-N
4, (-)-isopinocampheol: InChI=1S/C10H18O/c1-6-8-4-7(5-9(6)11)10(8,2)3/h6-9,11H,4-5H2,1-3H3/t6-,7+,8-,9-/m1/s1
InChIKey=REPVLJRCJUVQFA-BZNPZCIMSA-N
5, (1S)-1-phenylethanol: InChI=1S/C8H10O/c1-7(9)8-5-3-2-4-6-8/h2-7,9H,1H3/t7-/m0/s1
InChIKey=WAPNOHKVXSQRPX-ZETCQYMHSA-N



This work is licensed under a Creative Commons Attribution-NoDerivs 3.0 Unported License.

Wednesday, May 29, 2019

Activity-Based Screening of Homogeneous Catalysts through the Rapid Assessment of Theoretically Derived Turnover Frequencies

Matthew D. Wodrich, Boodsarin Sawatlon, Ephrath Solel, Sebastian Kozuch, and Clémence Corminboeuf (2019)
Highlighted by Jan Jensen

Figure 1. Adapted from images in the preprint posted under the CC-BY-NC-ND 4.0 license

Linear free energy scaling relationships (LFESRs) linearly relate reaction energies or barrier heights to a single reaction energy. In this work all the barriers and reaction energies in Figure 1a are computed via the free energy difference between 1 and 4 [ΔG(4)].

The volcano plot is then obtained by plotting the largest free energy difference in the cycle as a function of ΔG(4). In this particular case that is the barrier between 1 and 4 when ΔG(4) is small and the energy difference between 2 and 3 when ΔG(4) is large. The optimum catalyst is the one with a ΔG(4) for which these two lines meet, and one can screen for such catalysts by computing a single free energy difference.
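
As a toy illustration of how the top of the volcano is located, the sketch below takes two linear scaling relations and finds the descriptor value where they cross. The intercepts and slopes are invented numbers (in kcal/mol), not the LFESRs from the paper, and the function names are mine.

```python
def limiting_energy(dg4, line_a, line_b):
    """Largest of the two free energy differences at a given descriptor value."""
    return max(line_a[0] + line_a[1] * dg4, line_b[0] + line_b[1] * dg4)


def volcano_peak(line_a, line_b):
    """Descriptor value at which the two scaling lines intersect."""
    (a0, a1), (b0, b1) = line_a, line_b
    return (b0 - a0) / (a1 - b1)


# Hypothetical (intercept, slope) pairs; not taken from the paper
barrier_1_to_4 = (20.0, -0.5)   # dominates when dG(4) is small
step_2_to_3 = (5.0, 0.8)        # dominates when dG(4) is large

best = volcano_peak(barrier_1_to_4, step_2_to_3)
print(f"Optimal dG(4): {best:.1f} kcal/mol, "
      f"limiting energy: {limiting_energy(best, barrier_1_to_4, step_2_to_3):.1f} kcal/mol")
```

A candidate catalyst is then ranked simply by how close its computed ΔG(4) lies to this crossing point.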

One problem with this approach is that the largest free energy difference in the cycle is not always directly related to the turnover frequency (TOF), which is what is measured experimentally. In principle, the TOF should be determined by microkinetic modeling for each value of ΔG(4) to find the maximum TOF. But in this work TOFs are efficiently estimated by the energy span model, which basically considers all energy differences in the cycle (e.g. also between 1 and 3).
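
For readers who have not seen the energy span model before, here is a minimal sketch of the usual Kozuch-Shaik approximation as I understand it: the span is the largest TS-intermediate difference, with the reaction free energy added when the TS comes before the intermediate in the cycle, and the TOF follows from an Eyring-type expression. The profiles below are made-up numbers in kcal/mol, and the list layout (TS i follows intermediate i) is my own convention.

```python
import math

KB_OVER_H = 2.083662e10   # k_B / h in 1/(s K)
R_KCAL = 1.987204e-3      # gas constant in kcal/(mol K)


def energy_span(intermediates, transition_states, dg_reaction):
    """Energy span of a cycle; transition_states[i] follows intermediates[i]."""
    span = -math.inf
    for i, ts in enumerate(transition_states):
        for j, inter in enumerate(intermediates):
            # Add the reaction free energy when the TS precedes the intermediate
            correction = 0.0 if j <= i else dg_reaction
            span = max(span, ts - inter + correction)
    return span


def tof(span, temperature=298.15):
    """Turnover frequency in 1/s estimated from the energy span."""
    return KB_OVER_H * temperature * math.exp(-span / (R_KCAL * temperature))


# Hypothetical three-step cycle
span = energy_span([0.0, -10.0, -6.0], [5.0, 8.0, 4.0], dg_reaction=-15.0)
print(f"Energy span: {span:.1f} kcal/mol, TOF: {tof(span):.2e} 1/s")
```

Note that the span is determined by a TS and an intermediate that need not be adjacent, which is exactly why it can differ from the largest single step in the cycle.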

Using the TOF plot, different energy differences become important and the optimum ΔG(4) value decreases (Figure 1b). The points in Figure 1b show the corresponding TOFs computed without the LFESRs and demonstrate the accuracy of this approach.

Monday, April 29, 2019

Exploration of Chemical Compound, Conformer, and Reaction Space with Meta-Dynamics Simulations Based on Tight-Binding Quantum Chemical Calculations

Highlighted by Jan Jensen


The paper describes a new way to search for conformers and chemical reactions, and to estimate barriers, using meta-dynamics based on the semiempirical GFNn-xTB methods. A force term is included that scales exponentially with the Cartesian RMSD from previously found structures, thereby forcing the MD to explore new areas of phase space. For simulations with more than one molecule it is necessary to add a constraining potential so that the RMSD cannot be increased simply by increasing the distance between molecules. Each individual MD can be relatively short and most of the CPU time is actually spent on energy minimising the snapshots that are saved.
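
The idea behind the biasing term can be sketched as a sum of repulsive Gaussians in the RMSD to each stored structure. The Gaussian form, the parameters k and alpha, and the assumption that structures are already aligned are my simplifications for illustration; see the paper for the actual functional form and parameter values.

```python
import math


def rmsd(a, b):
    """Cartesian RMSD between two pre-aligned structures with the same atom order."""
    return math.sqrt(sum(math.dist(p, q) ** 2 for p, q in zip(a, b)) / len(a))


def bias_energy(current, visited, k=0.02, alpha=1.0):
    """Repulsive bias: one Gaussian in RMSD per previously visited structure.
    k (energy units) and alpha (1/length^2) are hypothetical parameters."""
    return sum(k * math.exp(-alpha * rmsd(current, ref) ** 2) for ref in visited)


# Toy two-atom "molecule": the bias is largest on top of a stored structure
stored = [[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]]
near = [(0.0, 0.0, 0.0), (1.1, 0.0, 0.0)]
far = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
print(bias_energy(stored[0], stored), bias_energy(near, stored), bias_energy(far, stored))
```

Because the bias decays quickly with RMSD, the trajectory is pushed away from regions it has already sampled but feels essentially the unbiased potential elsewhere.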

The results depend on a few hyperparameters, so several MD simulations with different values are run in parallel. Because of the extra force the temperature is also a hyperparameter, so the method doesn't necessarily tell you what reactions are most likely to occur at, say, 300 K.

The conformational search is tested on 22 (mostly) organic molecules, and the paper includes the GFN2-xTB energies of the lowest energy conformer for each molecule. This is a valuable benchmark set for other conformational search algorithms designed to find the global minimum.


This work is licensed under a Creative Commons Attribution 4.0 International License.

Wednesday, April 10, 2019

Ambimodal Trispericyclic Transition State and Dynamic Control of Periselectivity

Xue, X.-S.; Jamieson, C. S.; Garcia-Borràs, M.; Dong, X.; Yang, Z.; Houk, K. N., J. Am. Chem. Soc. 2019, 141, 1217
Contributed by Steven Bachrach
Reposted from Computational Organic Chemistry with permission

A major topic of this blog has been the growing body of studies that demonstrate that dynamic effects can control reaction products (see these posts). Often these examples crop up with valley ridge inflection points. Another cause can be bispericyclic transition states, first discovered by Caramella et al. for the dimerization of cyclopentadiene.1 The Houk group now reports on the first trispericyclic transition state.2

Using ωB97X-D/6-31G(d), they examined the reaction of the tropone derivative 1 with dimethylfulvene 2. Three possible products can arise from different pericyclic reactions: 3, the [4+6] product; 4, the [6+4] product; and 5, the [8+2] product. The thermodynamic product is predicted to be 5, but it is only 1.2 kcal mol-1 lower in energy than 4 and 6.2 kcal mol-1 lower than 3.


They identified one transition state originating from the reactants TS1. Hypothesizing that it would be trispericyclic, they performed a molecular dynamics study with trajectories starting from TS1. They ran a total of 142 trajectories, and 87% led to 3, 3% led to 4, and 3% led to 5. This demonstrates the unusual nature of TS1 and the dynamic effects on this reaction surface.


TS1

TS2

TS3
Figure 1. ωB97X-D/6-31G(d) optimized geometries of TS1-TS3.

Additionally, there are two different Cope rearrangements (through TS2 and TS3) that convert 3 into 4 and 5. Some trajectories can pass from TS1 and then directly through either TS2 or TS3 and these give rise to products 4 and 5. In other words, some trajectories will pass from a trispericyclic transition state and then through a bispericyclic transition state before ending in product.


References

1. Caramella, P.; Quadrelli, P.; Toma, L., “An Unexpected Bispericyclic Transition Structure Leading to 4+2 and 2+4 Cycloadducts in the Endo Dimerization of Cyclopentadiene.” J. Am. Chem. Soc. 2002, 124, 1130-1131, DOI: 10.1021/ja016622h.
2. Xue, X.-S.; Jamieson, C. S.; Garcia-Borràs, M.; Dong, X.; Yang, Z.; Houk, K. N., “Ambimodal Trispericyclic Transition State and Dynamic Control of Periselectivity.” J. Am. Chem. Soc. 2019, 141, 1217-1221, DOI: 10.1021/jacs.8b12674.


InChIs

1: InChI=1S/C10H6N2/c11-7-10(8-12)9-5-3-1-2-4-6-9/h1-6H
InChIKey=KAWLLELUFONBGI-UHFFFAOYSA-N
2: InChI=1S/C8H10/c1-7(2)8-5-3-4-6-8/h3-6H,1-2H3
InChIKey=WXACXMWYHXOSIX-UHFFFAOYSA-N
3: InChI=1S/C18H16N2/c1-11(2)17-15-7-8-16(17)14-6-4-3-5-13(15)18(14)12(9-19)10-20/h3-8,13-16H,1-2H3
InChIKey=DRPXVBLNTKGMTB-UHFFFAOYSA-N
4: InChI=1S/C18H16N2/c1-18(2)13-6-8-14(12(10-19)11-20)15(9-7-13)16-4-3-5-17(16)18/h3-9,13,15-16H,1-2H3
InChIKey=FSIPGNLAWKVXDD-UHFFFAOYSA-N
5: InChI=1S/C18H16N2/c1-12(2)13-8-9-16-17(13)14-6-4-3-5-7-15(14)18(16,10-19)11-20/h3-9,14,16-17H,1-2H3/t14?,16-,17-/m1/s1
InChIKey=SYLWEGLODFLARZ-VNCLPFQGSA-N



This work is licensed under a Creative Commons Attribution-NoDerivs 3.0 Unported License.