Monday, April 29, 2019

Exploration of Chemical Compound, Conformer, and Reaction Space with Meta-Dynamics Simulations Based on Tight-Binding Quantum Chemical Calculations

Highlighted by Jan Jensen

The paper describes a new way to search for conformers and chemical reactions, and to estimate barriers, by combining the semiempirical GFNn-xTB method with meta-dynamics. A force term is included that scales exponentially with the Cartesian RMSD from previously found structures, thereby forcing the MD to explore new areas of phase space. For simulations with more than one molecule it is necessary to add a constraining potential so that the RMSD cannot be increased simply by increasing the distance between molecules. Each individual MD can be relatively short and most of the CPU time is actually spent on energy minimising the snapshots that are saved.
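
To make the idea of an RMSD-based bias concrete, here is a minimal sketch of how such a term could be evaluated. It assumes a Gaussian form in the RMSD, k*exp(-alpha*RMSD^2), summed over previously visited structures. The function names, the values of k and alpha, and the omission of rotational alignment are my own illustrative choices, not details taken from the paper or from the xtb code.

```python
import math


def rmsd(a, b):
    """Cartesian RMSD between two structures given as lists of (x, y, z) tuples.

    Simplification (assumption): no rotational or translational alignment
    is performed before the comparison.
    """
    if len(a) != len(b):
        raise ValueError("structures must have the same number of atoms")
    squared = sum((p - q) ** 2 for pa, pb in zip(a, b) for p, q in zip(pa, pb))
    return math.sqrt(squared / len(a))


def bias_energy(current, visited, k=1.0, alpha=1.0):
    """Sum of repulsive Gaussians centred on previously visited structures.

    k (height) and alpha (width) are hypothetical hyperparameters.
    """
    return sum(k * math.exp(-alpha * rmsd(current, ref) ** 2) for ref in visited)


# One stored reference structure with two atoms (toy coordinates).
visited = [[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]]

# The bias is largest when the system sits on a visited structure ...
at_reference = bias_energy(visited[0], visited)

# ... and decays as the structure moves away from it.
displaced = [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0)]
away = bias_energy(displaced, visited)

print(at_reference, away)
```

Because the penalty is highest at structures that have already been seen, its negative gradient pushes the trajectory towards unexplored geometries, which is the behaviour described above.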

The results depend on a few hyperparameters, so several MD simulations with different values are run in parallel. Because of the extra force, the temperature is also a hyperparameter, so the method doesn't necessarily tell you which reactions are most likely to occur at, say, 300 K.

The conformational search is tested on 22 (mostly) organic molecules, and the paper includes the GFN2-xTB energies of the lowest-energy conformer for each molecule. This is a valuable benchmark set for other conformational search algorithms designed to find the global minimum.

This work is licensed under a Creative Commons Attribution 4.0 International License.

Wednesday, April 10, 2019

Ambimodal Trispericyclic Transition State and Dynamic Control of Periselectivity

Xue, X.-S.; Jamieson, C. S.; Garcia-Borràs, M.; Dong, X.; Yang, Z.; Houk, K. N., J. Am. Chem. Soc. 2019, 141, 1217
Contributed by Steven Bachrach
Reposted from Computational Organic Chemistry with permission

A major topic of this blog has been the growing body of studies that demonstrate that dynamic effects can control reaction products (see these posts). Often these examples crop up with valley ridge inflection points. Another cause can be bispericyclic transition states, first discovered by Caramella et al. for the dimerization of cyclopentadiene.1 The Houk group now reports on the first trispericyclic transition state.2

Using ωB97X-D/6-31G(d), they examined the reaction of the tropone derivative 1 with dimethylfulvene 2. Three possible products can arise from different pericyclic reactions: 3, the [4+6] product; 4, the [6+4] product; and 5, the [8+2] product. The thermodynamic product is predicted to be 5, but it is only 1.2 kcal mol⁻¹ lower in energy than 4 and 6.2 kcal mol⁻¹ lower than 3.

They identified a single transition state originating from the reactants, TS1. Hypothesizing that it would be trispericyclic, they performed a molecular dynamics study with trajectories starting from TS1. They ran a total of 142 trajectories: 87% led to 3, 3% led to 4, and 3% led to 5. This demonstrates the unusual nature of TS1 and the dynamic effects on this reaction surface.



Figure 1. ωB97X-D/6-31G(d) optimized geometries of TS1-TS3.

Additionally, there are two different Cope rearrangements (through TS2 and TS3) that convert 3 into 4 and 5. Some trajectories can pass from TS1 and then directly through either TS2 or TS3 and these give rise to products 4 and 5. In other words, some trajectories will pass from a trispericyclic transition state and then through a bispericyclic transition state before ending in product.


1. Caramella, P.; Quadrelli, P.; Toma, L., “An Unexpected Bispericyclic Transition Structure Leading to 4+2 and 2+4 Cycloadducts in the Endo Dimerization of Cyclopentadiene.” J. Am. Chem. Soc. 2002, 124, 1130-1131, DOI: 10.1021/ja016622h.
2. Xue, X.-S.; Jamieson, C. S.; Garcia-Borràs, M.; Dong, X.; Yang, Z.; Houk, K. N., “Ambimodal Trispericyclic Transition State and Dynamic Control of Periselectivity.” J. Am. Chem. Soc. 2019, 141, 1217-1221, DOI: 10.1021/jacs.8b12674.


1: InChI=1S/C10H6N2/c11-7-10(8-12)9-5-3-1-2-4-6-9/h1-6H
2: InChI=1S/C8H10/c1-7(2)8-5-3-4-6-8/h3-6H,1-2H3
3: InChI=1S/C18H16N2/c1-11(2)17-15-7-8-16(17)14-6-4-3-5-13(15)18(14)12(9-19)10-20/h3-8,13-16H,1-2H3
4: InChI=1S/C18H16N2/c1-18(2)13-6-8-14(12(10-19)11-20)15(9-7-13)16-4-3-5-17(16)18/h3-9,13,15-16H,1-2H3
5: InChI=1S/C18H16N2/c1-12(2)13-8-9-16-17(13)14-6-4-3-5-7-15(14)18(16,10-19)11-20/h3-9,14,16-17H,1-2H3/t14?,16-,17-/m1/s1

This work is licensed under a Creative Commons Attribution-NoDerivs 3.0 Unported License.

Wednesday, March 27, 2019

A Universal Density Matrix Functional from Molecular Orbital-Based Machine Learning: Transferability across Organic Molecules

Highlighted by Jan Jensen

Figure 3c from the paper, showing results for MP2 correlation energies

Some years ago I wrote about the ∆-ML approach where ML is used to estimate the energy difference between expensive and cheap methods based on the molecular structure. I remember wondering at the time whether additional information could be extracted from the cheap method and used as descriptors. 

This has now been tested for correlation energies and it does indeed lead to a significant improvement in accuracy. The method uses Fock, Coulomb, and exchange matrix elements in an LMO basis (which makes me wonder why it's called a density matrix functional) and Gaussian process regression (GPR) to machine learn the LMO contributions to MP2, CCSD, and CCSD(T) correlation energies.

Using just 140 molecules with 7 heavy atoms, the MOB-ML method can be trained to give reasonably accurate results for molecules with 13 heavy atoms (see figure above), and offers a significant improvement over the ∆-ML approach. An MAE of 0.25 mH/heavy atom translates into an MAE of roughly 2 kcal/mol for a molecule with 13 heavy atoms, which can translate into 4 kcal/mol ∆E-errors depending on the sign, so the method may not be quite accurate enough for many purposes yet. Unfortunately, it doesn't look like training on more molecules leads to additional improvements for transferability to larger molecules, but this is definitely a promising step in the right direction.
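
For readers who want to check the unit conversion behind those numbers, here is a small sketch. The only value not taken from the text above is the standard conversion factor of 627.509 kcal/mol per hartree; the variable names are just illustrative.

```python
HARTREE_TO_KCAL_PER_MOL = 627.509  # standard conversion factor (not from the paper)

mae_per_heavy_atom_mh = 0.25  # MAE in millihartree per heavy atom, as quoted above
n_heavy_atoms = 13            # size of the test molecules, as quoted above

# Total MAE for one molecule, first in millihartree, then in kcal/mol.
mae_total_mh = mae_per_heavy_atom_mh * n_heavy_atoms
mae_total_kcal = mae_total_mh * 1e-3 * HARTREE_TO_KCAL_PER_MOL

# For an energy difference between two such molecules, errors of opposite
# sign add up, giving roughly twice the single-molecule error.
opposite_sign_delta_e_error = 2 * mae_total_kcal

print(f"{mae_total_mh:.2f} mH = {mae_total_kcal:.2f} kcal/mol per molecule")
print(f"up to about {opposite_sign_delta_e_error:.1f} kcal/mol for a difference")
```

Errors of the same sign would instead partly cancel in the difference, which is why the 4 kcal/mol figure depends on the sign.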

Planar rings in nano-Saturns and related complexes

Bachrach, S. M., Chem. Commun. 2019, 55, 3650-3653
Contributed by Steven Bachrach
Reposted from Computational Organic Chemistry with permission

For the past twelve years, I have avoided posting on any of my own papers, but I will stoop to some shameless promotion to mention my latest paper,1 since it touches on some themes I have discussed in the past.

Back in 2011, Iwamoto et al. prepared the complex of C60 1 surrounded by [10]cycloparaphenylene 2 to make the Saturn-like system 3.2 Just last year, Yamamoto et al. prepared the Nano-Saturn 5a as the complex of 1 with the macrocycle 4a.3 The principal idea driving their synthesis was to utilize a ring that is flatter than 2. The structures of 3 and 5b (made with the parent macrocycle 4b) are shown in side view in Figure 1, and the flatter ring is clearly seen.



Figure 1. Computed structures of 3, 5, and 7.

However, the encompassing ring is not flat, with dihedral angles between the anthracenyl groups of 35°. This twisting is due to the steric interactions of the ortho-ortho’ hydrogens. A few years ago, my undergraduate student David Stück and I suggested that selective substitution of a nitrogen for one of the C-H groups would remove the steric interaction,4 leading to a planar poly-aryl system, such as making twisted biphenyl into the planar 2-(2-pyridyl)-pyridine (Scheme 1).

Scheme 1.

Following this idea leads to four symmetrical nitrogen-substituted analogues of 4b; and I’ll mention just one of them here, 6.

As expected, 6 is perfectly flat. The ring remains flat even when complexed with C60 1 (as per B3LYP-D3(BJ)/6-31G(d) computations); see the structure of 7 in Figure 1.

I also examined the complex of the flat macrocycle 6 (and its isomers) with a [5,5]-nanotube, 7. The tube bends over to create better dispersion interaction with the ring, which also becomes somewhat non-planar to accommodate the tube. Though not mentioned in the paper, I like to refer to 7 as Beyoncene, in tribute to All the Single Ladies.
Figure 2. Computed structure of 7.

My sister is a graphic designer and she made this terrific image for this work:


1. Bachrach, S. M., “Planar rings in nano-Saturns and related complexes.” Chem. Commun. 2019, 55, 3650-3653, DOI: 10.1039/C9CC01234F.
2. Iwamoto, T.; Watanabe, Y.; Sadahiro, T.; Haino, T.; Yamago, S., “Size-Selective Encapsulation of C60 by [10]Cycloparaphenylene: Formation of the Shortest Fullerene-Peapod.” Angew. Chem. Int. Ed. 2011, 50, 8342-8344, DOI: 10.1002/anie.201102302.
3. Yamamoto, Y.; Tsurumaki, E.; Wakamatsu, K.; Toyota, S., “Nano-Saturn: Experimental Evidence of Complex Formation of an Anthracene Cyclic Ring with C60.” Angew. Chem. Int. Ed. 2018, 57, 8199-8202, DOI: 10.1002/anie.201804430.
4. Bachrach, S. M.; Stück, D., “DFT Study of Cycloparaphenylenes and Heteroatom-Substituted Nanohoops.” J. Org. Chem. 2010, 75, 6595-6604, DOI: 10.1021/jo101371m.


4b: InChI=1S/C84H48/c1-13-61-25-62-15-3-51-33-75(62)43-73(61)31-49(1)50-2-14-63-26-64-16-4-52(34-76(64)44-74(63)32-50)54-6-18-66-28-68-20-8-56(38-80(68)46-78(66)36-54)58-10-22-70-30-72-24-12-60(42-84(72)48-82(70)40-58)59-11-23-71-29-69-21-9-57(39-81(69)47-83(71)41-59)55-7-19-67-27-65-17-5-53(51)35-77(65)45-79(67)37-55/h1-48H
6: InChI=1S/C72H36N12/c1-2-38-14-44-20-45-25-67(73-31-50(45)13-37(1)44)57-9-4-39-15-51-32-74-68(26-46(51)21-61(39)80-57)58-10-5-40-16-52-33-75-69(27-47(52)22-62(40)81-58)59-11-6-41-17-53-34-76-70(28-48(53)23-63(41)82-59)60-12-7-42-18-54-35-77-71(29-49(54)24-64(42)83-60)72-78-36-55-19-43-3-8-56(38)79-65(43)30-66(55)84-72/h1-36H

This work is licensed under a Creative Commons Attribution-NoDerivs 3.0 Unported License.

Thursday, March 21, 2019

More DFT benchmarking

Contributed by Steven Bachrach
Reposted from Computational Organic Chemistry with permission

Selecting the appropriate density functional for one’s molecular system at hand is often a very confounding problem, especially for non-expert or first-time users of computational chemistry. The DFT zoo is vast and confusing, and perhaps what makes the situation worse is that there is no lack of benchmarking studies. For example, I have made more than 30 posts on benchmark studies, and I made no attempt to be comprehensive over the past dozen years!

One such benchmark study that I missed was presented by Mardirossian and Head-Gordon in 2017.1 They evaluated 200 density functionals using the MGCDB84 database, a combination of data from a number of different groups. They make a series of recommendations for local GGA, local meta-GGA, hybrid GGA, and hybrid meta-GGA functionals. When pressed to choose just one functional overall, they opt for ωB97M-V, a range-separated hybrid meta-GGA with VV10 nonlocal correlation.

Goerigk and Mehta2 recently offered a review of the density functional zoo. Leaning heavily on benchmark studies using the GMTKN55 database,3 they report a number of observations. To no surprise of readers of this blog, their main conclusion is that accounting for London dispersion is essential, usually through some type of correction like those proposed by Grimme.

These authors also note the general disparity between the most accurate, best-performing functionals per the benchmark studies and the results of the DFT poll conducted for many years by Swart, Bickelhaupt and Duran. It is somewhat remarkable that PBE or PBE0 have topped the poll for many years, despite the fact that many newer functionals perform better. As always when choosing a functional: caveat emptor.


1. Mardirossian, N.; Head-Gordon, M., “Thirty years of density functional theory in computational chemistry: an overview and extensive assessment of 200 density functionals.” Mol. Phys. 2017, 115, 2315-2372, DOI: 10.1080/00268976.2017.1333644.
2. Goerigk, L.; Mehta, N., “A Trip to the Density Functional Theory Zoo: Warnings and Recommendations for the User.” Aust. J. Chem. 2019, ASAP, DOI: 10.1071/CH19023.
3. Goerigk, L.; Hansen, A.; Bauer, C.; Ehrlich, S.; Najibi, A.; Grimme, S., “A look at the density functional theory zoo with the advanced GMTKN55 database for general main group thermochemistry, kinetics and noncovalent interactions.” Phys. Chem. Chem. Phys. 2017, 19, 32184-32215, DOI: 10.1039/C7CP04913G.

This work is licensed under a Creative Commons Attribution-NoDerivs 3.0 Unported License.

Tuesday, March 19, 2019

Artificial Intelligence Assists Discovery of Reaction Coordinates and Mechanisms from Molecular Dynamics Simulations

Contributed by Jesper Madsen

Here, I highlight a recent preprint describing an application of Artificial Intelligence/Machine Learning (AI/ML) methods to problems in computational chemistry and physics. The group previously published the intrinsic map dynamics (iMapD) method, which I also highlighted here on Computational Chemistry Highlights. The basic idea in the previous study was to use an automated trajectory-based approach (as opposed to a collective variable-based approach) to explore the free-energy surface of a computationally expensive Hamiltonian that describes a complex biochemical system.

Fig 1: Schematic flow chart of the AI-assisted MD simulation algorithm.

The innovation in their current approach is the combination of the sampling scheme, statistical inference, and deep learning to construct a framework where sampling and mechanistic interpretation happen simultaneously, an important milestone towards completely “autonomous production and interpretation of MD simulations of rare events,” as the authors themselves remark.

It is reassuring to see that the method correctly identifies known results for benchmark cases (the alanine dipeptide and LiCl dissociation) and out-competes traditional approaches such as transition path sampling in terms of efficiency. In these simple model cases, however, complexity is relatively low and sampling is cheap. I look forward to seeing the method applied to a much more complex problem in the future, for example one where ergodicity is a major issue or where other challenges, such as hysteresis, play a significant role.

Another much-appreciated aspect of general interest that I want to emphasize is the paper's practical approach to interpreting the constructed neural networks. All in all, there are many useful comments and observations in this preprint and I would recommend reading it thoroughly for those who seek to use modern AI-based methods on molecular simulations.

Wednesday, February 27, 2019

Ultra-large library docking for discovering new chemotypes

Jiankun Lyu, Sheng Wang, Trent E. Balius, Isha Singh, Anat Levit, Yurii S. Moroz, Matthew J. O’Meara, Tao Che, Enkhjargal Algaa, Kateryna Tolmachova, Andrey A. Tolmachev, Brian K. Shoichet, Bryan L. Roth & John J. Irwin (2019)
Highlighted by Jan Jensen

Figure 3a from the paper. (c) Nature

This paper has already been thoroughly highlighted in several places, such as here and here, so I'll just summarise what the main take-home messages are for me.
  • The size of the libraries (99 and 138 million) that are screened is truly impressive, especially when you realise that they sampled 280 conformations for each molecule! This required 1.2 calendar days on 1,500 cores.
  • The libraries were made from 70,000 commercially available building blocks, which were combined using 130 known reactions. The molecules in the library should therefore be easy to synthesise.
  • Indeed, for one target they selected 589 molecules for synthesis and successfully made 549, for which they measured affinities.
  • The selected molecules spanned the whole range of docking score, which results in a thorough test of the accuracy. As shown in the figure above, the scores can only really be used to weed out the very weak binders.
  • As Derek Lowe notes, "That definitely argues for setting up these virtual libraries according to expected ease of synthesis, because otherwise you could spend a lot of time making tough compounds that don’t do anything. People have."
Very commendably, the authors have made the libraries available as a public database.
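
To put the docking cost in perspective, here is a rough back-of-the-envelope sketch. The post does not say which of the two libraries the 1.2-day figure refers to, so I assume here that it is the 99 million molecule library; the per-molecule numbers are therefore only indicative, and the variable names are mine.

```python
calendar_days = 1.2             # as quoted above
cores = 1500                    # as quoted above
molecules = 99_000_000          # ASSUMPTION: timing refers to the 99 million library
conformations_per_molecule = 280  # as quoted above

# Total compute spent on the screen.
core_hours = calendar_days * 24 * cores

# Average cost per molecule and per sampled conformation.
core_seconds_per_molecule = core_hours * 3600 / molecules
core_ms_per_conformation = (
    core_seconds_per_molecule / conformations_per_molecule * 1000
)

print(f"{core_hours:,.0f} core hours in total")
print(f"{core_seconds_per_molecule:.2f} core seconds per molecule")
print(f"{core_ms_per_conformation:.1f} core milliseconds per conformation")
```

Under that assumption, each molecule gets on the order of a second or two of CPU time, which gives a sense of how cheap the scoring function has to be for screens of this size.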

This work is licensed under a Creative Commons Attribution 4.0 International License.