Friday, September 28, 2012

A simple, exact DFT embedding scheme

Fredrick R. Manby, Martina Stella, Jason D. Goodpaster and Thomas F. Miller III

JCTC, 2012, 8, 2564-2568

 

Embedding methods have become a useful tool to perform molecular electronic structure calculations on large systems at relatively modest computational cost, and (hopefully) acceptable accuracy, compared to full blown calculations on the entire system. These methods rely on the short-range nature of chemical interactions between molecular fragments, which allows (most) systems to be subdivided. 

Manby and co-workers have developed embedding schemes that enforce Pauli exclusion by a projection technique to ensure orthogonality of the orbitals of the interacting subsystems. The authors use the fact that Kohn-Sham density functional theory (KS-DFT) provides a framework to perform exact calculations on a subsystem embedded in its full QM environment. This is achieved by partitioning the total density into subsystem densities. If the subsystem densities are constructed with non orthogonal orbitals, a non-additive term arises in the kinetic energy.  However, if the orbitals are orthogonal, this term vanishes. Methods to enforce orthogonality have been employed previously. Manby et al exploit one of these methods to formulate a formally exact DFT embedding scheme. To this end they employ a level shifting operator to keep the orbitals of  a subsystem orthogonal to those of other subsystems. They show that increasing the value of the level shifting parameter reduces the error in energy up to a point (numerical instabilities arise at very large values). The  interesting innovation comes in the use of perturbation theory to eliminate the dependence on the level shifting parameter.

The authors present applications with different embedding schemes combining DFT and wave-function methods. The first example consists of embedding the hydroxyl moiety of ethanol in the environment of the ethyl backbone. DFT-in-DFT embedding calculations at the PBE/6-31G* level with the level-shift and perturbation correction agree with the full DFT calculation to 7 pEH. The second example consists of the deprotonation reaction of ethanol in gas-phase. The PBE result on the full system is 10 mEH lower than the CCSD(T) reference. This can be compared with CCSD(T)-in-PBE embedding results which are as close as 1.5 mEH. Similar trends are observed for the activation barrier for the SN2 reaction of chloride with propyl chloride and water trimer.

As the authors note, this method is "limited to applications for which the electronic structure can be reasonably described using KS theory". However, it provides a simple and straightforward method to perform embedding calculations and can be easily implemented in most electronic structure programs.

Wednesday, September 19, 2012

Dynamics, transition states, and timing of bond formation in Diels–Alder reactions

Black, K.; Liu, P.; Xu, L.; Doubleday, C.; Houk, K. N. Proc. Nat. Acad. Sci. USA, 2012, 109, 12860 (Paywall)
Contributed by Steven Bachrach.
Reposted from Computational Organic Chemistry with permission

Has there been an organic reaction more examined by computational methods than the Diels-Alder reaction? You’d think we would have covered all aspects of this reaction by now, but no, it appears that this reaction remains fertile hunting grounds.

Doubleday and Houk have examined the Diels-Alder reaction with an eye towards its synchronicity,1 an area that Houk has delved into throughout his career. While most experiments show significant stereoselectivity, a few examples display a small amount of stereo loss. Computed transition states tend to have forming C-C bond distances that are similar, though with proper asymmetric substitution, the asymmetry of the TS can be substantial. In this paper,1 they utilize reaction dynamics specifically to assess the time differential between the formation of the two new C-C single bonds. They examined the eight reactions shown below. The first six (R1-R6) have symmetric transition states, though with the random sampling about the TS for the initial condition of the trajectories, a majority of asymmetric starting conditions are used. The last two (R7 and R8) reactions have asymmetric TSs and the random sampling amplifies this asymmetry.
Nonetheless, the results of the dynamics are striking. The time gap, the average time between the formations of the first and second new C-C bond, for R1-R6 is less than 5 fs, much shorter than a C-C vibration. These reactions must be considered as concerted and synchronous. Even the last two reactions (R7 and R8), which are inherently more asymmetric, still have very short time gaps of 15 and 56 fs, respectively. One might therefore reasonably conclude that they too are concerted and synchronous.

There are some exceptions – a few trajectories in the last two reactions involve a long-lived (~1000 fs) diradical intermediate. At very high temperature, about 2% of the trajectories invoke a diradical intermediate. But the overall message is clear: the Diels-Alder reaction is inherently concerted and synchronous.

References
(1) Black, K.; Liu, P.; Xu, L.; Doubleday, C.; Houk, K. N. "Dynamics, transition states, and timing of bond formation in Diels–Alder reactions," Proc. Nat. Acad. Sci. USA2012109, 12860-12865, DOI:10.1073/pnas.1209316109

Tuesday, September 11, 2012

The Last Globally Stable Extended Alkane

Lüttschwager, N. O. B.; Wassermann, T. N.; Mata, R. A.; Suhm, M. A. Angew. Chem. Int. Ed. 2012, ASAP
Contributed by Steven Bachrach.
Reposted from Computational Organic Chemistry with permission

The role of dispersion in understanding organic chemistry, both structure and reactivity, is truly coming into prominence (see for example this blog post for a compound whose stability is the result of dispersion). This has been driven in part by new computational techniques to properly account for dispersion. An interesting recent example is the structure of long chain alkanes, with a question posed and answered by Mata and Suhm:1 What is the largest alkane whose most stable conformation is the extended chain?

The question is attacked by computation and experiment. The computational methodology involves corrections to the local MP2-F12 energy involving the separation of orbital pairs that are treated with a coupled clusters method. The straight chain (having only anti arrangements about the C-C bonds) and the hairpin conformer (having three gauche arrangements) were completely optimized. The C17H36hairpin isomer is shown in Figure 1. For chains with 16 or fewer carbons, the all-anti straight chain is lower in energy, but for chains with 17 or more carbon atoms, the hairpin is lower in energy. Gas-phase low temperature IR and Raman spectra suggest that dominance of the hairpin occurs when the chain has 18 carbons, though careful analysis suggests that this is likely an upper bound. At least tentatively the answer to the question is that heptadecane is likely the longest alkane with a straight chain, but further lower temperature experiments are needed to see if the C16 chain might fold as well.

Figure 1. Optimized geometry of the hairpin conformation of heptadecane.

(I thank Dr. Peter Schreiner for bringing this paper to my attention.)

References
(1) Lüttschwager, N. O. B.; Wassermann, T. N.; Mata, R. A.; Suhm, M. A. "The Last Globally Stable Extended Alkane," Angew. Chem. Int. Ed. 2012, ASAP, DOI: 10.1002/anie.201202894.

InChIs

Heptadecane: InChI=1S/C17H36/c1-3-5-7-9-11-13-15-17-16-14-12-10-8-6-4-2/h3-17H2,1-2H3
InChIKey=NDJKXXJCMXVBJW-UHFFFAOYSA-N

Sunday, September 9, 2012

Dispersion corrections and bio-molecular structure and reactivity

Richard Lonsdale, Jeremy N. Harvey, and Adrian J. Mulholland "Effects of Dispersion in Density Functional Based QM/ MM Calculations on Cytochrome P450 Catalysed Reactions"Journal of Chemical Theory and Computation 2012, ASAP (Paywall)


The DFT dispersion correction developed by Grimme and co-workers was recently highlighted in Computation Chemistry Highlights. Recently two papers have appeared that quantify the importance of the dispersion correction on modeling bio-molecular structure and reactivity.

Cytochrome P450 barrier heights in better agreement with experiment using B3LYP-D2 and -D3
Lonsdale et al. used QM/MM to compute barrier heights for oxidation reactions, catalyzed by P450$_{cam}$, with an without dispersion corrections in the QM region.  Invariably the dispersion correction lowered the barrier significantly (usually by ca 5 kcal/mol), yielding results that were in better agreement with experimental values.  The effect of the dispersion correction on the transition state geometries was less pronounces though not negligible, with bond lengths changing by as much as 0.2 Å.

It is worth noting that the QM region contains a conjugated porphyrin ring and that three of the four substrates considered in the study contain one or more double bonds.  Thus, the QM region contains very polarizable functional groups and, since the magnitude of dispersion interactions increases with the polarizability of the groups involved, it is possible that the effect of dispersion on barrier heights for other enzymes will be less than observed here.  It will be interesting to find out.

MP2 quality Trp-cage structure using RHF-D
Nagata et al. have implemented analytical MP2/PCM gradients for the fragment molecular orbital method and used it to geometry optimize the small protein Trp-cage at the MP2/6-31(+)G(d) level of theory [the (+) indicates diffuse functions on carboxylate groups].  The resulting structure compared well with the experimental NMR structures, with a backbone RMSD of only 0.426 Å. This is a significant improvement in agreement compared to the corresponding RHF/PCM optimized structure (RMSD 1.107 Å) and demonstrates the importance dispersion in bio-molecular structure.  Interestingly, the corresponding RHF-D structure compared equally well to experiment (0.414 Å) and was virtually identical to the MP2 structure (RMSD 0.068 Å).

Disclaimer: I was involved in the implementation of the FMO RHF/PCM interface.

Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 Unported License.  

Wednesday, August 29, 2012

Effect of Hydrogen Bonds on pKa Values: Importance of Networking

Shokri, A.; Abedin, A.; Fattahi, A.; Kass, S. R.  J. Am. Chem. Soc. 2012, 134, 10646 (Paywall)
Contributed by Steven Bachrach.
Reposted from Computational Organic Chemistry with permission

Can a hydrogen bonding network affect acidity? Kass has examined the polyol 1 whose conjugate base1cb can potentially be stabilized by a large hydrogen bonding network.1 Kass had previously found a significant acidy enhancement in comparing t-butanol (ΔG(deprotonation) = 369.2 kcal mol-1) with that of 2 (ΔG(deprotonation) = 334.4 kcal mol-1).2

Table 1 lists the computed and experimental free energies of deprotonation of 1. The experimental values are computed at M06-2x/maug-cc-pVT(+d)Z. The structure of 1cb is drawn in Figure 1.

Table 1. Computed and experimental free energies (kcal mol-1 of deprotonation of some alcohols.

MO6-2x
Expt
t-butanol
368.6
369.3
2
335.0
334.4
1
320.2
313.5
The difference in the acidity of t-butanol and 2, some 30 kcal mol-1, reflects the stability afforded by three intramolecular hydrogen bonds to the oxyanion. In going from 2cb to 1cb, each of the hydroxyl groups that donate to the oxyanion act as the acceptor of a hydrogen bond from the more removed hydroxyl groups. There is in effect a first and second layer of hydrogen bond network in 1cb. These secondary hydrogen bonds lead to further stabilization of the anion, as reflected in the diminished DPE of 1 over 2: 320.2 vs. 335.0 kcal mol-1. Note that this secondary layer does not stabilize the anion to the same degree as the primary layer, but nonetheless its effect is large and quite striking.

1cb
Figure 1. M06-2x/maug-cc-pVT(+d)Z optimized structure of 1cb.

Even in solution these more remote hydrogen bonds can stabilize the anion. So, using the CPCM approach and modeling DMSO, 2 is predicted have a pKa that is 15 units below that of t-butanol, and 1is predicted to be 3 pKa units more acidic than 2. Experiments verify this prediction with the pKas of 16.1 for 2 and 11.4 for 1.

References

(1) Shokri, A.; Abedin, A.; Fattahi, A.; Kass, S. R. "Effect of Hydrogen Bonds on pKa Values: Importance of Networking," J. Am. Chem. Soc. 2012134, 10646-10650, DOI: 10.1021/ja3037349
(2) Tian, Z.; Fattahi, A.; Lis, L.; Kass, S. R. "Single-Centered Hydrogen-Bonded Enhanced Acidity (SHEA) Acids: A New Class of Brønsted Acids," J. Am. Chem. Soc. 2009131, 16984-16988, DOI:10.1021/ja9075106

InChIs

1: InChI=1S/C13H28O7/c14-4-1-10(17)7-13(20,8-11(18)2-5-15)9-12(19)3-6-16/h10-12,14-20H,1-9H2
InChIKey=HGTVPOTWAYDRSM-UHFFFAOYSA-N
1cb: InChI=1S/C13H27O7/c14-4-1-10(17)7-13(20,8-11(18)2-5-15)9-12(19)3-6-16/h10-12,14-19H,1-9H2/q-1
InChIKey=UQPPTNIHRICITD-UHFFFAOYSA-N
2: InChI=1S/C7H16O4/c8-4-1-7(11,2-5-9)3-6-10/h8-11H,1-6H2
InChIKey=FAQWYKIIWYVDPQ-UHFFFAOYSA-N
2cb: InChI=1S/C7H15O4/c8-4-1-7(11,2-5-9)3-6-10/h8-10H,1-6H2/q-1
InChIKey=WSCPTRIAWKZJFZ-UHFFFAOYSA-N



Thursday, August 23, 2012

Refinement of protein structure homology models via long, all-atom molecular dynamics simulations

Alpan Raval, Stefano Piana, Michael P. Eastwood, Ron O. Dror, and David E. Shaw, Proteins 2012, 80, 2071-2079 (Paywall)
Contributed by Victor Guallar


Many theoretical chemists work routinely on biological systems and, in particular, on proteins. While it might not be their main interest, predicting the conformational sampling associated to these systems is certainly a concern.  Those who have been around for a while have seen how the necessary conformational sampling has moved from few picoseconds to hundreds of nanoseconds and even microseconds (while I do not agree, molecular dynamics has almost the exclusivity as a sampling technique). Clearly the latest development of special-purpose computers, such as the remarkable effort from the D. E. Shaw Research group, together with the development of molecular dynamics for graphical processing units, have contributed to this time expansion. Along these advances we surely had the following questions: are the force fields up to it?, how meaningful are these long molecular dynamics simulations?

The Shaw group has probably already answered these questions for us. In a comprehensive study1 they produce at least a hundred microseconds simulation for 24 proteins used in recent CASP competitions. They frame their study under the capabilities of molecular dynamics (and force fields) in refining homology models. Thus, for each system they produce a trajectory from both an initial homology model and from the native X-ray structure (or NMR). This study followed a previous one where the simulations were capable of accurately reproducing the native state on several fast-folders. The results this time, however, are quite surprising and even worrisome. For most of the systems the structures drift away from the native state. Furthermore, this drift occurs even when starting from the native state. Overall the results indicate that for most systems the force field minimum is not consistent with the X-ray or NMR experimental structures. While the authors only used two force fields (considered to be the best ones), they conclude that most likely this is a limitation for all available force fields.

The authors obtain better results when imposing constraints to the simulation (limiting the drift away from the native structure). Thus, one can conclude from this work that brute force molecular dynamics simulations are still far away from being accurate. Obviously similar conclusion could be applied to other sampling techniques using the same force fields (for example Monte Carlo techniques). While we wait for better force fields (maybe polarizable ones such AMOEBA?), we probably should use molecular dynamics as a local exploration rather than to predict novel conformations, or to score significantly different ones. Of course these limitations might not apply to those systems with a strong preference for a state, such as fast-folder proteins.

References
Alpan Raval, Stefano Piana, Michael P. Eastwood,Ron O. Dror, and David E. ShawProteins 2012, 80, 2071-2079 

Saturday, August 11, 2012

Transition states of native and drug-resistant HIV-1 protease are the same



What is the rate determining step?
The general features of the HIV-1 protease mechanism appear to be 1) nucleophilic attack on the amide carbonyl group by a water molecule to form a gem-diol intermediate, 2) protonation of the amide nitrogen, and 3) cleavage of the amide C-N bond.

As pointed out in the introduction, computational studies such as Okimoto et al. and. Piana et al. both  seem to indicate that Step 2 is the rate limiting step.

However, a closer look at these studies suggest that the issue is not well settled.  Okimoto et al.'s (small gas phase) model suggests that the gem-diol intermediate is more stable than the enzyme-substrate complex and computed the barrier of the second step relative to this intermediate.  Piana et al.'s (QM/MM) model suggests that the gem-diol intermediate is less stable than the enzyme-substrate complex and computed the barrier of the second step relative to this enzyme substrate.

If one switches the reference state in both studies Okimoto et al.'s model would predict the first two steps to have essentially the same barrier, while Piana et al.'s model would predict the first step to be rate determining.  Furthermore, the primary $^{15}$N isotope effect computed by Piana et al. (0.97$\pm$0.2) is larger than the experimental value obtained by Rodrigez et al. (0.995$\pm$0.002) (large isotope effects mean significant deviations from 1).

Kinetic isotope effects confirm the rate limiting step and the transition state structure
Schramm and co-workers have used labelled substrates to measure primary $^{14}$C and $^{15}$N and secondary $^3$H and $^{18}$O isotope effects.  Furthermore, they used small gas phase models to compute the corresponding isotope effects for all five stationary points at the ONIOM (M06-2X/6-31+G**:am1) level of theory.

Overall the measured isotope effects best match the computed values for the transition state for Step 2, which is therefore likely to be the rate determining step.  This isotope effect was computed relative to the enzyme-substrate complex; it would be interesting to recompute this value relative to the gem-diol intermediate, which is sufficiently stable to be observed experimentally at low-pH conditions (Das et al.)

Transition states and drug design
Very similar isotope effects was also measured for a mutant (I84V) which has displayed resistance to all nine FDA approved-inhibitors, indicating that the transition state structure in this mutant is quite similar to that of the wild-type.  Thus, transition state-mimics would likely inhibit both forms of the enzyme and may lead to new inhibitors that are less prone to resistance.

Acknowledgement: Thanks to Luca De Vico for alerting me to this paper

Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 Unported License.