Sunday, June 25, 2017

A Deep Neural Network with Minimal Chemistry Knowledge Matches the Performance of Expert-developed QSAR/QSPR Models

Highlighted by Jan Jensen

Figure 1: The key difference in using deep learning algorithms as a machine learning tool as opposed to a “machine intelligence” tool is the assistance, augmentation and possible replacement, for human-led tasks like feature engineering in computational chemistry.

A lot of machine learning research in chemistry is focussed on finding the best descriptors for the property of interest.  This paper shows that simply using 2D images of molecules leads to similarly accurate predictions of solvation free energies, in vitro HIV activity, and in vivo toxicity. 

This seems to me an appropriate "null-model" that all machine learning studies should include. Another option would be SMILES strings or some representation thereof. If your fancy descriptor doesn't lead to significantly better predictions then it's back to the drawing board.

The manuscript doesn't mention code availability, but one of the co-authors tells me that they plan to make the code available when it is ready.

Saturday, June 24, 2017

London Dispersion Enables the Shortest Intermolecular Hydrocarbon H···H Contact.

Rösel, S.; Quanz, H.; Logemann, C.; Becker, J.; Mossou, E.; Cañadillas-Delgado, L.; Caldeweyher, E.; Grimme, S.; Schreiner, P. R.,  J. Am. Chem. Soc. 2017, 139, 7428–7431
Contributed by Steven Bacharach
Reposted from Computational Organic Chemistry with permission

Following on previous work (see these posts on ladderane and hexaphenylethane), Schreiner, Grimme and co-workers have examined the structure of the all-meta tri(di-t-butylphenyl)methane dimer 1₂.¹ In the study of hexaphenylethane,² Schreiner and Grimme note that t-butyl groups stabilize highly congested structures through dispersion, identifying them as “dispersion energy donors”.³ The idea here is that the dimer of 1 will be stabilized by these many t-butyl groups. In fact, the neutron diffraction study of the crystal structure of 1₂ shows an extremely close approach of the two methane hydrogens of only 1.566 Å, the record holder for the closest approach of two formally non-bonding hydrogen atoms.


To understand the nature of this dimeric structure, they employed a variety of computational techniques. (Shown in Figure 1 is the B3LYP-D3ATM(BJ)/def2-TZVPP optimized geometry of 1₂.) The HSE-3c (a DFT composite method) optimized crystal structure predicts an H···H distance of 1.555 Å. The computed gas-phase structure lengthens the distance to 1.634 Å, indicating a small, but essential, role for packing forces. Energy decomposition analysis of 1₂ at B3LYP-D3ATM(BJ)/def2-TZVPP indicates a dominant role for dispersion in holding the dimer together. While 1₂ is bound by about 8 kcal mol⁻¹, the analogue of 1₂ lacking all of the t-butyl groups (the dimer of triphenylmethane, 2₂) is unbound by over 8 kcal mol⁻¹. Topological electron density analysis does show a bond critical point between the two formally unbound hydrogen atoms, and the noncovalent interaction plot shows an attractive region between these two atoms.

Figure 1. B3LYP-D3ATM(BJ)/def2-TZVPP optimized geometry of 1₂, with most of the hydrogens suppressed for clarity.
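The "bound by about 8 kcal mol⁻¹" statement above is a dimerization energy: the energy of the dimer relative to two separated monomers. A minimal sketch of that bookkeeping follows. The function name and the total energies are hypothetical and chosen only for illustration; they are not values from the paper, and the sketch ignores geometry relaxation and basis set superposition corrections.

```python
HARTREE_TO_KCAL = 627.509  # kcal/mol per hartree

def binding_energy_kcal(e_dimer, e_monomer):
    """Dimerization energy in kcal/mol; a negative value means the dimer is bound."""
    return (e_dimer - 2.0 * e_monomer) * HARTREE_TO_KCAL

# Hypothetical total energies in hartree, for illustration only
example = binding_energy_kcal(-200.0128, -100.0000)
print(f"{example:.1f} kcal/mol")
```

With this sign convention, a dimer like the triphenylmethane analogue that is "unbound by over 8 kcal mol⁻¹" would give a positive number.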


References

1) Rösel, S.; Quanz, H.; Logemann, C.; Becker, J.; Mossou, E.; Cañadillas-Delgado, L.; Caldeweyher, E.; Grimme, S.; Schreiner, P. R., "London Dispersion Enables the Shortest Intermolecular Hydrocarbon H···H Contact." J. Am. Chem. Soc. 2017, 139, 7428–7431, DOI: 10.1021/jacs.7b01879.
2) Grimme, S.; Schreiner, P. R., "Steric Crowding Can Stabilize a Labile Molecule: Solving the Hexaphenylethane Riddle." Angew. Chem. Int. Ed. 2011, 50 (52), 12639-12642, DOI: 10.1002/anie.201103615.
3) Grimme, S.; Huenerbein, R.; Ehrlich, S., "On the Importance of the Dispersion Energy for the Thermodynamic Stability of Molecules." ChemPhysChem 2011, 12 (7), 1258-1261, DOI: 10.1002/cphc.201100127.


InChIs

1: InChI=1S/C43H64/c1-38(2,3)31-19-28(20-32(25-31)39(4,5)6)37(29-21-33(40(7,8)9)26-34(22-29)41(10,11)12)30-23-35(42(13,14)15)27-36(24-30)43(16,17)18/h19-27,37H,1-18H3
InChIKey=VFNQDWKFTWSJAU-UHFFFAOYSA-N

This work is licensed under a Creative Commons Attribution-NoDerivs 3.0 Unported License.

Saturday, June 10, 2017

Dynamic Effects Responsible for High Selectivity in a [3,3] Sigmatropic Rearrangement Featuring a Bispericyclic Transition State

Villar López, R.; Faza, O. N.; Silva López, C., J. Org. Chem. 2017, 82 (9), 4758-4765
Contributed by Steven Bacharach
Reposted from Computational Organic Chemistry with permission

Bispericyclic reactions occur when two different pericyclic reactions merge to have a single transition state. An example of this is the joining of two [3,3]-sigmatropic rearrangements of 1 that merge to have a single transition state. López, Faza, and López have examined the dynamics of this reaction.¹


Because of the symmetry of the species along this reaction pathway, the products of the two different rearrangements are identical, and will be formed in equal amounts, though they are produced from a single transition state with the reaction pathway bifurcating due to a valley-ridge inflection post TS.

The interesting twist that is explored here is when 1 is substituted in order to break the symmetry. The authors have examined 3x with either fluorine, chlorine, or bromine. The critical points on the reaction surface were optimized at M06-2X/Def2TZVPP. In all three cases a single bispericyclic transition state 3TS1x is found, which leads to products 4a and 4b. A second transition state 4TSx corresponds to the [3,3]-rearrangement that interconverts the two products. The structures of 1TS, 3TS1F, and 3TS1Cl are shown in Figure 1.

1TS

3TS1F

3TS1Cl
Figure 1. M06-2X/Def2TZVPP optimized geometries of 1TS, 3TS1F, and 3TS1Cl.

The halogen substitution breaks the symmetry of the reaction path. This leads to a number of important changes. First, the C4-C5 and C7-C8 distances, which are identical in 1TS, are different in the halogen cases. Interestingly, the distortions are dependent on the halogen: in 3TS1F C4-C5 is 0.2 Å longer than C7-C8, but in 3TS1Cl C7-C8 is much longer (by 0.65 Å) than C4-C5. Second, the products are no longer equivalent with the halogen substitution. Again, this is halogen dependent: 4bF is 4.0 kcal mol⁻¹ lower in energy than 4aF, while 4aCl is 8.2 kcal mol⁻¹ lower than 4bCl.

These differences manifest in very different reaction dynamics. With trajectories initiated at the first (bispericyclic) transition state, 89% end at 4bF and 9% end at 4aF, a ratio far from the unity that might be expected when both products result from passage through the same TS. The situation is even more extreme for the chlorine case, where all 200 trajectories end in 4aCl. This is yet another example of the role that dynamics play in reaction outcomes (see these many previous posts).


References

1) Villar López, R.; Faza, O. N.; Silva López, C., "Dynamic Effects Responsible for High Selectivity in a [3,3] Sigmatropic Rearrangement Featuring a Bispericyclic Transition State." J. Org. Chem. 2017, 82 (9), 4758-4765, DOI: 10.1021/acs.joc.7b00425.


InChIs

1: InChI=1S/C9H12/c1-3-9-6-4-8(2)5-7-9/h1-2,4-7H2
InChIKey=RRXCPJIEZVQPSZ-UHFFFAOYSA-N
2: InChI=1S/C9H12/c1-7-4-5-8(2)9(3)6-7/h1-6H2
InChIKey=AMBNQWVPTPHADI-UHFFFAOYSA-N
3F: InChI=1S/C9H8F4/c1-3-7-5-4-6(2)8(10,11)9(7,12)13/h1-2,4-5H2
InChIKey=VZFAQFJKHDWJDN-UHFFFAOYSA-N
3Cl: InChI=1S/C9H8Cl4/c1-3-7-5-4-6(2)8(10,11)9(7,12)13/h1-2,4-5H2
InChIKey=AIVUHFMHIMNOJB-UHFFFAOYSA-N
4aF: InChI=1S/C9H8F4/c1-5-4-6(8(10)11)2-3-7(5)9(12)13/h1-4H2
InChIKey=NAUUHIHYMAOMIF-UHFFFAOYSA-N
4aCl: InChI=1S/C9H8Cl4/c1-5-4-6(8(10)11)2-3-7(5)9(12)13/h1-4H2
InChIKey=MMCKDJXQYSGQEH-UHFFFAOYSA-N
4bF: InChI=1S/C9H8F4/c1-5-4-6(2)8(10,11)9(12,13)7(5)3/h1-4H2
InChIKey=LMFNAIRCNARWSX-UHFFFAOYSA-N
4bCl: InChI=1S/C9H8Cl4/c1-5-4-6(2)8(10,11)9(12,13)7(5)3/h1-4H2
InChIKey=NOFFASDSCUGRTP-UHFFFAOYSA-N

This work is licensed under a Creative Commons Attribution-NoDerivs 3.0 Unported License.

Saturday, May 27, 2017

Synthesis of a carbon nanobelt

Povie, G.; Segawa, Y.; Nishihara, T.; Miyauchi, Y.; Itami, K. Science 2017, 356, 172-175
Contributed by Steven Bacharach
Reposted from Computational Organic Chemistry with permission

The synthesis of components of nanostructures (like fullerenes and nanotubes) has dramatically matured over the past few years. I have blogged about nanohoops before, and this post presents the recent work of the Itami group in preparing the nanobelt 1.¹

1

The synthesis is accomplished through a series of Wittig reactions with an aryl-aryl coupling to stitch together the final rings. The molecule is characterized by NMR and x-ray crystallography. The authors have also computed the structure of 1 at B3LYP/6-31G(d), shown in Figure 1. The computed C-C distances match up very well with the experimental distances. The strain energy of 1, presumably estimated by Reaction 1,² is computed to be about 119 kcal mol⁻¹.

1
Figure 1. B3LYP/6-31G(d) optimized structure of 1.
Rxn 1
NICS(0) values were obtained at B3LYP/6-311+G(2d,p)//B3LYP/6-31G(d); the rings along the middle of the belt have values of -7.44 ppm, indicative of normal aromatic six-membered rings, while the other rings have values of -2.00 ppm. This suggests the dominant resonance structure shown below:

References

1) Povie, G.; Segawa, Y.; Nishihara, T.; Miyauchi, Y.; Itami, K., "Synthesis of a carbon nanobelt." Science 2017, 356, 172-175, DOI: 10.1126/science.aam8158.
2) Segawa, Y.; Yagi, A.; Ito, H.; Itami, K., "A Theoretical Study on the Strain Energy of Carbon Nanobelts." Org. Letters 2016, 18, 1430-1433, DOI: 10.1021/acs.orglett.6b00365.

InChIs:

1: InChI=1S/C48H24/c1-2-26-14-40-28-5-6-31-20-44-32(19-42(31)40)9-10-34-24-48-36(23-46(34)44)12-11-35-21-45-33(22-47(35)48)8-7-30-17-41-29(18-43(30)45)4-3-27-15-37(39(26)16-28)25(1)13-38(27)41/h1-24H
InChIKey=KJWRWEMHJRCQKK-UHFFFAOYSA-N



This work is licensed under a Creative Commons Attribution-NoDerivs 3.0 Unported License.

Sunday, May 21, 2017

Solving the Density Functional Conundrum: Elimination of Systematic Errors To Derive Accurate Reaction Enthalpies of Complex Organic Reactions


Highlighted by Jan Jensen



Sengupta and Raghavachari present a quick and efficient way to increase the accuracy of computed reaction energies (ΔE).  For example, it is difficult to compute the reaction energy for Rxn1 because the bonding changes a lot: in effect, two double bonds are changed to 4 single bonds. By the same logic, it should be much easier to compute an accurate reaction energy for Rxn2.  

ΔE(Rxn1) = ΔE(Rxn2) - ΔE(Rxn3) 

So one should be able to get a good estimate of ΔE(Rxn1) by computing ΔE(Rxn2) and ΔE(Rxn3) at a relatively low and high level of theory, respectively. The accuracy can be further increased by larger fragments, either in Rxn3 or in an additional reaction.
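The combination above can also be read as a low-level ΔE(Rxn1) corrected by the low-level error for the small-fragment reaction. The sketch below shows that the two forms agree, assuming the relation ΔE(Rxn1) = ΔE(Rxn2) - ΔE(Rxn3) holds exactly at any single level of theory. The function names and all energies are hypothetical, chosen only to make the arithmetic easy to follow; they are not taken from the paper.

```python
def composite_reaction_energy(dE_rxn2_low, dE_rxn3_high):
    """Estimate dE(Rxn1) from a low-level Rxn2 and a high-level Rxn3."""
    return dE_rxn2_low - dE_rxn3_high

def as_low_level_correction(dE_rxn1_low, dE_rxn3_low, dE_rxn3_high):
    """Same estimate, written as low-level Rxn1 minus the low-level error on Rxn3."""
    return dE_rxn1_low - (dE_rxn3_high - dE_rxn3_low)

# Hypothetical reaction energies in kcal/mol, for illustration only
dE_rxn2_low = 6.0
dE_rxn3_low, dE_rxn3_high = -38.0, -42.0
dE_rxn1_low = dE_rxn2_low - dE_rxn3_low  # the cycle evaluated at the low level

estimate = composite_reaction_energy(dE_rxn2_low, dE_rxn3_high)
same_estimate = as_low_level_correction(dE_rxn1_low, dE_rxn3_low, dE_rxn3_high)
print(estimate, same_estimate)
```

Written the second way, it is clearer why the high-level fragment calculations only need to be done once: the correction term depends only on the fragments, not on the full reaction.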

Sengupta and Raghavachari test a four-reaction approach for 25 different reactions and a large variety of methods (DFT, HF, MP2, and CCSD(T)) and show that the mean absolute error relative to G4 can be reduced to ca. 2 kcal/mol or less using the 6-311++G(3df,2p) basis set. For M06-2X they also tested the effect of basis set and showed that the MAE only increases from 2.2 to 2.6 kcal/mol on going to the 6-31G(d) basis set.

Of course the high level calculations on the small fragments only have to be done once and a relatively small number of different fragments will be needed to cover most organic reactions.

Wednesday, May 10, 2017

Progress in DFT development and the density they predict

Medvedev, M. G.; Bushmarinov, I. S.; Sun, J.; Perdew, J. P.; Lyssenko, K. A., "Density functional theory is straying from the path toward the exact functional." Science 2017, 355, 49-52
Hammes-Schiffer, S., "A conundrum for density functional theory." Science 2017, 355, 28-29
Korth, M., "Density Functional Theory: Not Quite the Right Answer for the Right Reason Yet." Angew. Chem. Int. Ed. 2017, 56, 5396-5398
Contributed by Steven Bacharach
Reposted from Computational Organic Chemistry with permission

“Getting the right answer for the right reason” – how important is this principle when it comes to computational chemistry? Medvedev and co-workers argue that when it comes to DFT, trends in functional development have overlooked this maxim in favor of utility.¹ Specifically, they note that
“There exists an exact functional that yields the exact energy of a system from its exact density.”
Over the past two decades a great deal of effort has gone into functional development, mostly in an empirical way, usually to improve energy prediction. This approach has a problem:
“[It], however, overlooks the fact that the reproduction of exact energy is not a feature of the exact functional, unless the input electron density is exact as well.”
So, these authors have studied functional performance with regard to obtaining proper electron densities. Using CCSD/aug-cc-pwCV5Z as the benchmark, they computed the electron density for a number of neutral and cationic atoms having 2, 4, or 10 electrons. Then, they computed the densities with 128 different functionals from all of the rungs of Jacob’s ladder. They find that accuracy increased as new functionals were developed from the 1970s to the early 2000s. Since then, however, newer functionals have tended towards poorer electron densities, even though energy prediction has continued to improve. Medvedev et al. argue that the recent trend in DFT development has been towards functionals that are highly parameterized to fit energies, with no consideration given to other aspects including the density or constraints of the exact functional.

In the same issue of Science, Hammes-Schiffer comments about this paper.² She notes some technical issues, most importantly that the benchmark study is for atoms and that molecular densities might be a different issue. But more philosophically (and practically), she points out that for many chemical and biological systems, the energy and structure are of more interest than the density. Depending on where the errors in density occur, these errors may not be of particular relevance in understanding reactivity; i.e., if the errors are largely near the nuclei but the valence region is well described then reactions (transition states) might be treated reasonably well. She proposes that future development of functionals, likely still to be driven by empirical fitting, might include other data to fit to that may better reflect the density, such as dipole moments. This seems like a quite logical and rational step to take next.
A commentary by Korth³ summarizes a number of additional concerns regarding the Medvedev paper. The last concern is the one I find most striking:
“Even if there really are (new) problems, it is as unclear as before how they can be overcome…With this in mind, it does not seem unreasonable to compromise on the quality of the atomic densities to improve the description of more relevant properties, such as the energetics of molecules.”
Korth concludes with
“In the meantime, while theoreticians should not rest until they have the right answer for the right reason, computational chemists and experimentalists will most likely continue to be happy with helpful answers for good reasons.”
I do really think this is the correct take-away message: DFT does appear to provide good predictions of a variety of chemical and physical properties, and it will remain a widely utilized tool even if the density that underpins the theory is incorrect. Functional development must continue, and Medvedev et al. remind us of this need.


References

1) Medvedev, M. G.; Bushmarinov, I. S.; Sun, J.; Perdew, J. P.; Lyssenko, K. A., "Density functional theory is straying from the path toward the exact functional." Science 2017, 355, 49-52, DOI: 10.1126/science.aah5975.
2) Hammes-Schiffer, S., "A conundrum for density functional theory." Science 2017, 355, 28-29, DOI: 10.1126/science.aal3442.
3) Korth, M., "Density Functional Theory: Not Quite the Right Answer for the Right Reason Yet." Angew. Chem. Int. Ed. 2017, 56, 5396-5398, DOI: 10.1002/anie.201701894.


This work is licensed under a Creative Commons Attribution-NoDerivs 3.0 Unported License.

Tuesday, May 9, 2017

Puckering Energetics and Optical Activities of [7]Circulene Conformers

Hatanaka, M., J. Phys. Chem. A 2016, 120 (7), 1074-1083
Contributed by Steven Bacharach
Reposted from Computational Organic Chemistry with permission

I have discussed the circulenes in a few previous posts. Depending on their size, they can be bowls, flat disks, or saddles. A computational study of [7]circulene noted that the C2 structure is slightly higher in energy than the Cs form,¹ though the C2 form is found in the x-ray structure.²
Now, Miao and co-workers have synthesized the tetrabenzo[7]circulene 1 and also examined its structure using DFT.³


As with the parent compound, C2 and Cs forms were located at B3LYP/6-31G(d,p), and are shown in Figure 1. The C2 form is 7.6 kcal mol⁻¹ lower in energy than the Cs structure, and the two are separated by a transition state (also shown in Figure 1) with a barrier of 12.2 kcal mol⁻¹. The interconversion of these conformations takes place without going through a planar form. The x-ray structure contains only the C2 structure. It should be noted that the C2 structure is chiral, and racemization would take place by the path: 1-C2 ⇆ 1-Cs ⇆ 1-C2*, where 1-C2* is the enantiomer of 1-C2.

1-C2

1-TS

1-Cs
Figure 1. B3LYP/6-31G(d,p) optimized structures of 1.
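To get a feel for what a 7.6 kcal mol⁻¹ preference for the C2 form means, here is a rough two-state Boltzmann estimate of the Cs population. It is only a sketch under loud assumptions: the computed energy difference is treated as a free energy difference, the degeneracy of the chiral C2 form is ignored, and gas-phase values are used even though the observation is for a crystal. The function name is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1

def minor_fraction(delta_e_kcal, temperature=298.15):
    """Two-state Boltzmann fraction of the conformer lying delta_e_kcal higher in energy."""
    weight = math.exp(-delta_e_kcal / (R_KCAL * temperature))
    return weight / (1.0 + weight)

cs_fraction = minor_fraction(7.6)
print(f"Cs fraction at 298 K: {cs_fraction:.1e}")
```

Under these assumptions the Cs fraction at room temperature is on the order of 10⁻⁶, which is at least consistent with only the C2 form appearing in the x-ray structure.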


References

1) Hatanaka, M., "Puckering Energetics and Optical Activities of [7]Circulene Conformers." J. Phys. Chem. A 2016, 120 (7), 1074-1083, DOI: 10.1021/acs.jpca.5b10543.
2) Yamamoto, K.; Harada, T.; Okamoto, Y.; Chikamatsu, H.; Nakazaki, M.; Kai, Y.; Nakao, T.; Tanaka, M.; Harada, S.; Kasai, N., "Synthesis and molecular structure of [7]circulene." J. Am. Chem. Soc. 1988, 110 (11), 3578-3584, DOI: 10.1021/ja00219a036.
3) Gu, X.; Li, H.; Shan, B.; Liu, Z.; Miao, Q., "Synthesis, Structure, and Properties of Tetrabenzo[7]circulene." Org. Letters 2017, DOI: 10.1021/acs.orglett.7b00714.


InChIs

1: InChI=1S/C44H22/c1-5-13-28-24(9-1)32-19-17-23-18-20-33-25-10-2-6-14-29(25)38-31-16-8-4-12-27(31)35-22-21-34-26-11-3-7-15-30(26)37(28)43-39(32)36(23)40(33)44(38)42(35)41(34)43/h1-22H
InChIKey=KVMXYGAVHDZMNP-UHFFFAOYSA-N

Friday, May 5, 2017

Empirical D3 dispersion as a replacement for ab initio dispersion terms in density functional theory-based symmetry-adapted perturbation theory

Robert Sedlak and Jan Řezáč (2017)
Highlighted by Amelia Fitzsimmons


Sedlak and Řezáč present an approximation to DFT-SAPT that replaces the ab initio dispersion terms in the popular but expensive SAPT calculation with a potential based on Grimme’s D3 dispersion. The D3 dispersion correction has become a popular way to improve the accuracy of DFT geometry optimizations and thermochemistry calculations for systems involving noncovalent interactions; applied to DFT-SAPT, it improves the efficiency of the calculation. They demonstrate with the S66X8 and S66X6 test sets that this correction has root mean square errors of less than 1 kcal/mol for non-charge-transfer species. I think this could be useful to anyone who has previously used DFT-SAPT and is looking for a more efficient way to do energy decomposition analysis, or to anyone who is interested in noncovalent interactions and dispersion corrections to DFT.

Saturday, April 29, 2017

Cheap but accurate calculation of chemical reaction rate constants from ab initio data, via system-specific, black-box force fields

Julien Steffen and Bernd Hartke 2017
Highlighted by Jan Jensen

Figure 1 from the paper. A flowchart of the EVB-QMDFF program implemented in this work, for the case of a DG-EVB-QMDFF calculation.


A few years ago I highlighted Grimme's General Quantum Mechanically Derived Force Field (QMDFF) - a black box approach that gives you a system-specific force field from a single QM Hessian calculation.  I missed the fact that Hartke and Grimme extended this approach to TSs using EVB, a year later. This EVB-QMDFF approach constructs EVB potentials connecting each pair of minima described by QMDFF.  To get the EVB parameters you need to supply the TS and 10-100 energies (and possibly 5-10 Hessian calculations) along the reaction path, depending on how complex an EVB potential is needed to describe the reaction.
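For readers unfamiliar with EVB, the simplest two-state version takes the two diabatic energies (here they would be the QMDFF energies of the two minima) and a coupling term, and uses the lower eigenvalue of the resulting 2×2 matrix as the reactive potential. The sketch below uses a constant coupling and toy harmonic diabats, which is an assumption made for illustration; the DG-EVB variant named in the figure caption uses a more elaborate coupling, and none of the names or numbers here come from the paper.

```python
import math

def evb_ground_state(v1, v2, coupling):
    """Lower eigenvalue of the 2x2 matrix [[v1, coupling], [coupling, v2]]."""
    mean = 0.5 * (v1 + v2)
    half_gap = 0.5 * (v1 - v2)
    return mean - math.sqrt(half_gap ** 2 + coupling ** 2)

def toy_diabats(x, k=20.0):
    """Hypothetical harmonic 'force fields' with minima at x = -1 and x = +1."""
    return 0.5 * k * (x + 1.0) ** 2, 0.5 * k * (x - 1.0) ** 2

# At the crossing point x = 0 both diabats have the same energy,
# and the coupling lowers the barrier by exactly its own magnitude.
v1, v2 = toy_diabats(0.0)
barrier_top = evb_ground_state(v1, v2, coupling=3.0)
print(v1, barrier_top)
```

Far from the crossing the ground state follows whichever diabat is lower, so each minimum is still described by its own force field.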

What's the point of a system-specific reactive force field when you already have the TS and reaction path? Well, Steffen and Hartke show that EVB-QMDFF can be used to perform the additional calculations needed for, for example, variational TS theory or ring polymer MD calculations to get more accurate rate constants.

Furthermore, just like for QMDFF for minima you could do all this for one conformation of ligands and use EVB-QMDFF for a conformer search or use the gas phase parameterized model to study the effect of explicit solvation.  It might even be possible to parameterize EVB-QMDFF using small ligands and then model the effect of larger ligands using the QMDFF parameters obtained for the minima.  However, all these potential uses still need to be tested.

I thank Jean-Philip Piquemal for bringing this paper to my attention



This work is licensed under a Creative Commons Attribution 4.0 International License.

Friday, April 28, 2017

Local Fitting of the Kohn−Sham Density in a Gaussian and Plane Waves Scheme for Large-Scale Density Functional Theory Simulations

Dorothea Golze, Marcella Iannuzzi, and Jürg Hutter (2017)
Highlighted by Michael Banck

Reprinted (adapted) with permission from Dorothea Golze, Marcella Iannuzzi, and Jürg Hutter. Journal of Chemical Theory and Computation, 2017 ASAP,  Copyright 2017 American Chemical Society.

Hutter et al. have published their LRIGPW (local resolution-of-the-identity Gaussian and plane waves method) paper in JCTC. The image above, taken from that paper, highlights that much of the total runtime for conventional GPW (the main method implemented in the CP2K package) is spent on the description of the total charge density on real-space grids ("GPW grid", dark blue). Can you spot the orange bars for the same work done in the LRIGPW approach? This takes CP2K another big step forward.

Wednesday, April 19, 2017

New CCH contributors wanted

CCH has been around for about five years and it's time to shake things up a bit in terms of how to contribute.

Who can contribute?
Anyone at any level of their career.  If you write a good highlight (in my opinion) then I'll post it (see below).  You don't have to be an expert in the subject of the highlight.  For example, I frequently highlight papers that use machine learning because I find it fascinating and want to learn about it, but I have yet to publish in that area.

In general, I would like CCH to be as diverse as possible in terms of subjects, career-stage, gender, geography, etc.  For example, the current highlights are mostly on small molecule, electronic structure-related papers and I'd love to have some more highlights on solid state and dynamics.  But more small molecule, electronic structure-related highlights are also fine.

How can I contribute?
If you are interested in contributing highlights to CCH send a highlight in any format to compchemhighlights@gmail.com.  If I like it, I'll post it. If you continue to send me highlights on a fairly regular basis (e.g. one every 2-3 months), then I'll add you to the editorial board and give you access to the site so you can post yourself.  If you make it onto the editorial board but don't post a highlight in a 12-month period, I'll remove you from the editorial board again. This also goes for current editors, from today on.

Note that you don't commit yourself to be a regular contributor by sending a highlight. Contributing once is just fine.

You can also post a highlight on your own blog and send me a link for cross-posting.

What are the requirements for a highlight?
You can't highlight a paper you have co-authored.  The paper should be published within the current year, or last two years.  So in 2017, the paper must be published in 2015-2017.  You should highlight papers you (mostly) like and agree with - not papers of which you are highly critical.  You need at least a sentence or two on why you think the paper is interesting, and it should be of general interest to some fairly large subgroup of computational chemists. One thing I am not interested in is a steady stream of papers related to a very specific subject. It is fine to highlight a preprint that has been deposited but not accepted or published yet.

A bit about CCH
As mentioned above, CCH has been around for about five years.  CCH receives about 2200 page views per month, has about 450 Twitter followers, 1000 Facebook likes, and 330 followers on LinkedIn.


This work is licensed under a Creative Commons Attribution 4.0 International License.

Sunday, April 16, 2017

Quantifying Possible Routes for SpnF-Catalyzed Formal Diels–Alder Cycloaddition

Medvedev, M. G.; Zeifman, A. A.; Novikov, F. N.; Bushmarinov, I. S.; Stroganov, O. V.; Titov, I. Y.; Chilov, G. G.; Svitanko, I. V., J. Am. Chem. Soc. 2017, 139, 3942-3945
Contributed by Steven Bacharach
Reposted from Computational Organic Chemistry with permission

Medvedev, et al. have examined the cyclization step in the formation of Spinosyn A, which is catalyzed by the putative Diels-Alderase enzyme SpnF.¹ This work follows on the computational study done by Houk, Singleton and co-workers,² which I have discussed in this post (Dynamics in a reaction where a [6+4] and [4+2] cycloadditons compete). In fact, I recommend that you read the previous post before continuing on with this one. In summary, Houk, et al. found that a single transition state connects reactant 1 to both 2 and 3. The experimental product with the enzyme SpnF is 3. In the absence of enzyme, Houk, et al. suggest that reactions will cross the bispericyclic transition state TS-BPC (TS1 in the previous post) leading primarily to 2, which then undergoes a Cope rearrangement to get to product 3. Some molecules will follow pathways that go directly to 3.
The PCM(water)/M06-2X/6-31+G(d) study by Medvedev, et al. first identifies 560 conformations of 3. Next, they identify 384 TSs lying within 30 kcal mol⁻¹ of the lowest TS. These can be classified as either TS-DA (leading directly to 3) or TS-BPC (which may lead to 2 by steepest descent, but can bifurcate towards 3). They opt to use Atoms-in-Molecules theory to identify bond critical points to categorize these TSs, and find that 144 are TS-BPC and 240 are TS-DA. (The transition state found by Houk, et al. is the second lowest energy TS found in this study, 0.29 kcal mol⁻¹ higher in energy than the lowest TS and also of TS-BPC type.)

They also examined two alternative routes. First, they propose a path that first takes 1 to 4 via an alternative Diels-Alder reaction, and a second Cope rearrangement (TS-Cope2) takes this to 2, which can then convert to 3 via TS-Cope1. The other route involves a biradical pathway to either A or B. These alternatives prove to be non-competitive, with transition state energies significantly higher than either TS-DA or TS-BPC.

Returning to the set of TS-DA and TS-BPC transition states, while the former are more numerous, the latter are lower in energy. In summary, this study further complicates the complex situation presented by Houk, et al. In the absence of catalyst, 1 can undergo either a Diels-Alder reaction to 3, or pass through a bispericyclic transition state that can lead to 3, but principally leads to 2, which then undergoes a Cope rearrangement to get to 3. The question that ends my previous post on this subject, “just what role does the enzyme SpnF play?”, remains to be answered.


References

1) Medvedev, M. G.; Zeifman, A. A.; Novikov, F. N.; Bushmarinov, I. S.; Stroganov, O. V.; Titov, I. Y.; Chilov, G. G.; Svitanko, I. V., "Quantifying Possible Routes for SpnF-Catalyzed Formal Diels–Alder Cycloaddition." J. Am. Chem. Soc. 2017, 139, 3942-3945, DOI: 10.1021/jacs.6b13243.
2) Patel, A.; Chen, Z.; Yang, Z.; Gutiérrez, O.; Liu, H.-w.; Houk, K. N.; Singleton, D. A., "Dynamically Complex [6+4] and [4+2] Cycloadditions in the Biosynthesis of Spinosyn A." J. Am. Chem. Soc. 2016, 138, 3631-3634, DOI: 10.1021/jacs.6b00017.


InChIs

1: InChI=1S/C24H34O5/c1-3-21-15-12-17-23(27)19(2)22(26)16-10-7-9-14-20(25)13-8-5-4-6-11-18-24(28)29-21/h4-11,16,18-21,23,25,27H,3,12-15,17H2,1-2H3/b6-4+,8-5+,9-7+,16-10+,18-11+/t19-,20+,21-,23-/m0/s1
InChIKey=JEKALMRMHDPSQK-ZTRRSECRSA-N
2: InChI=1S/C24H34O5/c1-3-19-8-6-10-22(26)15(2)23(27)20-12-11-17-14-18(25)13-16(17)7-4-5-9-21(20)24(28)29-19/h4-5,7,9,11-12,15-22,25-26H,3,6,8,10,13-14H2,1-2H3/b7-4-,9-5+,12-11+/t15-,16-,17-,18-,19+,20+,21-,22+/m1/s1
InChIKey=AVLPWIGYFVTVTB-PTACFXJJSA-N
3: InChI=1S/C24H34O5/c1-3-19-5-4-6-22(26)15(2)23(27)11-10-20-16(9-12-24(28)29-19)7-8-17-13-18(25)14-21(17)20/h7-12,15-22,25-26H,3-6,13-14H2,1-2H3/b11-10+,12-9+/t15-,16+,17-,18-,19+,20-,21-,22+/m1/s1
InChIKey=BINMOURRBYQUKD-MBPIVLONSA-N



This work is licensed under a Creative Commons Attribution-NoDerivs 3.0 Unported License.

Sunday, March 26, 2017

The Elephant in the Room of Density Functional Theory Calculations

Stig Rune Jensen, Santanu Saha, José A. Flores-Livas, William Huhn, Volker Blum, Stefan Goedecker, and Luca Frediani (2017) (Updated paywalled version)
Contributed by Jan Jensen



While basis set convergence sounds straightforward (though time-consuming), it is hard to rule out that underlying assumptions in the design of the basis set influence the results.  However, converged basis set DFT results are needed to separate basis set errors from errors due to the functional. Multiwavelets, a systematic and adaptive multiresolution numerical solution of the one-electron problem, appear to be a way around this.

The paper presents PBE and PBE0 total energies, atomization energies, and dipole moments for 211 molecules that are converged with respect to basis set to μHartree accuracy, and benchmarks Gaussian-type orbitals (GTOs), all-electron numeric atom-centered orbitals (NAOs) and full-potential augmented plane wave (APW) calculations.

In the case of atomization energies, a quintuple-zeta GTO basis set (aug-cc-pV5Z) is needed to reach 1 kcal/mol accuracy in both MAE and RMSE. For aug-cc-pVQZ the MAE is below 1 kcal/mol, but the RMSE is about 1.5 kcal/mol.  Perhaps more importantly, the maxAE goes from ca. 10 to 2-5 kcal/mol on going from a quadruple- to a quintuple-zeta basis set.  So even aug-cc-pV5Z cannot consistently reach the basis set limit for atomization energies!  It would have been very interesting to see whether extrapolated CBS values are able to do this.
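The three error measures quoted above respond differently to outliers, which is why an MAE below 1 kcal/mol can coexist with a larger RMSE and a much larger maxAE. A small sketch with made-up numbers (not data from the paper) shows the effect; the function name and the energies are hypothetical.

```python
import math

def error_stats(computed, reference):
    """Return (MAE, RMSE, maxAE) for paired computed and reference values."""
    errors = [c - r for c, r in zip(computed, reference)]
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    max_ae = max(abs(e) for e in errors)
    return mae, rmse, max_ae

# Hypothetical atomization energies in kcal/mol: four small errors and one outlier
reference = [100.0, 200.0, 300.0, 400.0, 500.0]
computed = [100.2, 200.3, 300.1, 400.4, 503.5]

mae, rmse, max_ae = error_stats(computed, reference)
print(f"MAE {mae:.2f}  RMSE {rmse:.2f}  maxAE {max_ae:.2f}")
```

Here a single 3.5 kcal/mol outlier leaves the MAE under 1 kcal/mol while pushing the RMSE well above it, the same qualitative pattern reported for aug-cc-pVQZ.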

This dataset will be an important resource for developers of both DFT and basis sets.


This work is licensed under a Creative Commons Attribution 4.0 International License.

Simulation-Based Algorithm for Two-Dimensional Chemical Structure Diagram Generation of Complex Molecules and Ligand–Protein Interactions

Frączek, T. J. Chem. Inform. Model. 2016, 56, 2320-2335
Contributed by Steven Bacharach
Reposted from Computational Organic Chemistry with permission

Making a good drawing of a chemical structure can be a difficult task. One wants to prepare a drawing that provides a variety of different information in a clean and clear way. We tend to want equal bond lengths, angles that are representative of the atom’s hybridization, symmetrical rings, no bond crossings, and no overlapping groups. These ideals may be difficult to manage. Sometimes we might also want to represent something about the actual 3-dimensional shape. So for example, the drawing on the left of Figure 1 properly represents the atom connectivity with no bond crossing, but the figure on the right is probably the image all organic chemists would want to see for cubane.

Figure 1. Two drawings of cubane.

For another example, the drawing on the left of Figure 2 nicely captures the relative stereo relationships within D-glucose, but the drawing on the right adds in the fact that the cyclohexyl ring is in a chair conformation. Which drawing is better? Well, it likely is in the eye of the beholder, and the context of the chemistry at hand.
Figure 2. Two drawings of D-glucose.

Frączek has reported on an automated procedure for creating aesthetically pleasing 2-D drawings of chemical structures.¹ The method involves optimizing distances between atoms projected onto a 2-D plane, along with rules to try to keep bond lengths and angles similar, keep rings symmetrical, and minimize overlapping bonds. He shows a number of nice examples, especially of natural products, where his automated procedure PSM (physical simulation method) provides some very nice drawings, often noticeably superior to those generated by previously proposed schemes for preparing drawings.
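To illustrate the general idea of optimizing interatomic distances in a plane, here is a toy relaxation that moves 2-D atom positions until a set of target distances is met. This is not Frączek's PSM algorithm, just a minimal gradient-descent sketch; the function name, the step size, the starting coordinates and the cyclobutane-like target distances are all assumptions made for illustration.

```python
import math

def relax_layout(coords, targets, step=0.05, iterations=2000):
    """Gradient descent on sum((d_ij - t_ij)**2) over the listed atom pairs."""
    coords = [list(p) for p in coords]
    for _ in range(iterations):
        grads = [[0.0, 0.0] for _ in coords]
        for (i, j), target in targets.items():
            dx = coords[i][0] - coords[j][0]
            dy = coords[i][1] - coords[j][1]
            dist = math.hypot(dx, dy)
            # derivative of (dist - target)**2 with respect to atom i
            scale = 2.0 * (dist - target) / dist
            grads[i][0] += scale * dx
            grads[i][1] += scale * dy
            grads[j][0] -= scale * dx
            grads[j][1] -= scale * dy
        for p, g in zip(coords, grads):
            p[0] -= step * g[0]
            p[1] -= step * g[1]
    return coords

# Four-membered ring: unit bond lengths, diagonals of sqrt(2) to keep it square
targets = {
    (0, 1): 1.0, (1, 2): 1.0, (2, 3): 1.0, (3, 0): 1.0,
    (0, 2): math.sqrt(2.0), (1, 3): math.sqrt(2.0),
}
start = [(0.0, 0.0), (1.3, 0.1), (0.9, 1.2), (-0.2, 0.8)]  # a distorted square
layout = relax_layout(start, targets)
bond_lengths = [math.dist(layout[i], layout[j]) for i, j in [(0, 1), (1, 2), (2, 3), (3, 0)]]
print([round(b, 3) for b in bond_lengths])
```

A real depiction algorithm needs additional terms, for example to discourage bond crossings and overlapping groups, which this sketch leaves out.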

Using the web site he has developed (http://omnidepict.p.lodz.pl/), I recreated the structures of some of the molecules I have discussed in this blog. In Figure 3, these are shown side by side with my drawings. My drawings were generally done with MDL/Isis/Accelrys/Biovia Draw (available for free for academic users) with an eye towards representing what I think is a suitable view of the molecule based on what I am discussing in the blog post. For many molecules, PSM does a very nice job, sometimes better than what I have drawn, but in some cases PSM produces an inferior drawing. Nonetheless, creating nice chemical drawings can be tedious and PSM offers a rapid option, worthy of at least trying out. Ultimately, what we decide to draw and publish is often an aesthetic choice and each individual must decide on one’s own how best to present one’s work.

My Drawing
PSM
Figure 3. Comparison of my drawings vs. drawings made by PSM.


References

1) Frączek, T., "Simulation-Based Algorithm for Two-Dimensional Chemical Structure Diagram Generation of Complex Molecules and Ligand–Protein Interactions." J. Chem. Inform. Model. 2016, 56, 2320-2335, DOI: 10.1021/acs.jcim.6b00391.


This work is licensed under a Creative Commons Attribution-NoDerivs 3.0 Unported License.