## Saturday, June 23, 2012

### Accurate Predictions of Nonpolar Solvation Free Energies Require Explicit Consideration of Binding-Site Hydration

ab initio prediction of protein-ligand binding free energies
Fast computers and clever algorithms have made it possible to calculate protein-ligand binding energies ab initio using levels of theory that have been shown to give accurate gas phase binding energies.  PCM or closely related approaches like COSMO, that are the solvation methods of choice for most ab initio studies have now also been implemented in a way that allows for calculations on protein size systems.  However, the combined use of ab initio and PCM (without any further fitting) typically results in absolute protein-ligand binding energies that bear little resemblance to experimental values.

Trouble from an unexpected source
Given the complexity of the problem there are of course many possible sources of error but in 2010 Ryde and co-workers$^1$ identified one major source of error that surprised me.  They used thermodynamic integration (TI) to compute the non-polar solvation free energy contribution $(\Delta G_{np})$ to the binding free energy of benzene to T4 lysozyme mutant L99A and showed that the $\Delta G_{np}$ computed by PCM was off by almost 50 kJ/mol!  For comparison, an approach based on the solvent accessible surface area (SASA) came within 4 kJ/mol of the TI result.

As both experiment and calculations suggested that the binding pocket was solvent free in the ligand-free state, the PCM energy of the apoprotein was recomputed with a "dummy" benzene molecule in the pocket to yield a $\Delta G_{np}$ energy within 2 kJ/mol of the TI results.

Exposed binding pockets are even trickier
In the current paper Ryde and co-workers perform similar comparisons for four additional protein-ligand complexes with varying degrees of solvent exposure of the empty binding binding pocket.  The dummy-ligand approach works surprisingly well for the two proteins with the most buried binding pockets ($\Delta G_{np}$ are within 2.4 kJ/mol of the TI results).  "Surprisingly" because both experiment and MD calculations indicate that several water molecules are displaced from the binding pocket upon ligand binding.

The dummy-ligand approach does not work well for the more expose binding pockets with deviations from TI-results as large as 73 kJ/mol for galectin-3.  While this is not too surprising, the traditional (non-dummy ligand approach) approach is not much better, with an error of 58 kJ/mol for galectin-3.  For comparison the SASA-based prediction is in error by 28 kJ/mol: better, but not really useful.  Clearly, treating water molecules bound in very solvent-exposed pockets as a continuum is not a good approximation with the current models of non-polar solvation.

New approaches needed
The question now is whether a hybrid PCM/explicit solvation approach is needed or whether the problem can be fixed by reformulating and/or parameterizing the non-polar solvation energy terms.  For the altter approach the authors indicate that the cavitation term may present a particular challenge as it includes the entropy change due to the water molecules displaced upon binding.

References
1. S. Genheden, J. Kongsted, P. Söderhjelm, and U. Ryde J. Chem Theory, Comput 2010, 6, 3558. DOI: 10.1021/ct100272s

## Wednesday, June 20, 2012

### Tunnelling control of chemical reactions – the organic chemist’s perspective

Ley, D.; Gerbig, D.; Schreiner, P. R. Org. Biomol. Chem., 2012, 10, 3781 (Paywall)
Contributed by Steven Bachrach.
Reposted from Computational Organic Chemistry with permission

Schreiner has written a very nice review of the role of tunneling in organic chemistry.1 This includes tunneling in the conformations of carboxylic acids and in hydrogen abstractions. But the major emphasis is on his own group’s contributions regarding tunneling on a variety of hydroxycarbenes (see these posts: cyclopropylhydroxycarbenemethylhydroxycarbenephenylhydroxycarbenedihydroxycarbene, and hydroxymethylene). This led to the development of a third means for controlling reactions: not just kinetic and thermodynamic control, but tunneling control as well.

Recommended reading for anyone interested in learning how quantum mechanical tunneling can have very real-world chemical consequences.

### Accurate ab Initio Spin Densities

Katharina Boguslawski, Konrad H. Marti, Örs Legeza and Markus Reiher Journal of Chemical Theory and Computation 2012, 8, 1970 (Paywall)

The spin density problem
Obtaining accurate spin density distributions and predicting the correct ground state from a number of close lying states of different spin is a challenging problem in quantum chemistry, particularly when transition metal systems are considered. This paper highlights how density-matrix renomarlization group1 (DMRG) based methods can be used to calculate spin density distributions for molecules that are too large to be treated by complete-active-space self-consistent-field (CASSCF) methods. What I found particularly exciting is the prospect of using DMRG results in the benchmarking and development of new density functionals.

Benchmarking DMRG
The electronic structure of a simple model system of iron nitrosyl is manipulated by surrounding it with point-charges (simulating ligands), adjusting the position of these charges means the system becomes either single- or multi-reference. The results convincingly show how DMRG can converge to a CASSCF(7,7) reference spin density by controlling the number of reference active-system states in the DMRG. The DMRG calculations were carried out using the Reiher group's Qc-Dmrg-ETH code.

Spin densities for large active spaces and comparison to DFT
A problem arrises in CASSCF when the number of active electrons and orbitals required to correctly describe the system becomes too large (this article suggests 18 electrons in 18 orbitals as a practical limit). The paper goes on to show that much larger active spaces are possible with DMRG and, more importantly, that the spin density converges very quickly both with respect to active space and the number of active-system states. The resulting spin densities are then compared to a number of common density functionals, none of which accurately reproduces the DMRG result. Whilst DMRG calculations are not yet commonplace in the computational chemistry community, this paper convinces me that they will have an important role to play in solving the spin density problem.

References
(1) For an introduction to DMRG see: Chan, G. K.-L.; Dorando, J. J.; Ghosh, D.; Hachmann. J.; Neuscamman, E.; Wang, H.;Yanai, T. An Introduction to the Density Matrix Renormalization Group Ansatz in Quantum Chemistry. In Frontiers in Quantum Systems in Chemistry and Physics, 1st ed.; Wilson, S., Grout, P. J.; Maruani, J., Delgado-Barrio, G., Piecuch, P., Eds.; Springer: Dordrecht, The Netherlands, 2008; Vol. 18, pp 49-65. arXiv:0711.1398v1.

## Tuesday, June 5, 2012

### Jailbreaking Benzene Dimers

A. Yu. Rogachev, X.-D. Wen, R. Hoffmann Journal of the American Chemical Society 2012, 134, 8062. (Paywall)
Contributed by Steven Bachrach.
Reposted from Computational Organic Chemistry with permission

Hoffmann1 reports on a number of new benzene dimer structures, notably 5-8, whose RIJCOSX-MP2/cc-pVTZ2 structures are shown in Figure 1. A few of these new dimers are only somewhat higher in energy than the known dimers 1-4. The energies of these dimers, relative to two isolated benzene molecules, are listed in Table 1.
 1 2 3 4 5 6 7 8
Figure 1. RIJCOSX-MP2/cc-pVTZ optimized geometries of 1-8.
Table 1. Energy (kcal mol-1) of the dimers relative to two benzene molecules and activation energy for reversion to two benzene molecules.
 Compound Erel Eact 1 50.9 29 2 49.9 3 38.2 9 4 58.7 19 5 71.9 30 6 49.9 36 7 60.8 27 8 98.8 28
The energy for reversion of the isomers 5-8 to two isolated benzene molecules is calculated to be fairly large, and so they should be stable relative to that decomposition mode. They also examined a series of other decomposition modes, including [1,5]-hydrogen migration, all of which had barriers of 21 kcal mol-1 or greater, retrocyclization ([2+2]), for which they could not locate transition states, electrocyclic ring opening (Cope), with barriers of at least 17 kcal mol-1 and dimerization – some of which had relatively small enthalpic barriers of 4-5 kcal mol-1. However, the dimerizations all have very unfavorable entropic activation barriers.
So, the conclusion is that all of the novel dimers (4-8) can be reasonable expected to hang around for some time and therefore are potential synthetic targets.

### References

(1) Rogachev, A. Yu.; Wen, X.-D.; Hoffmann, R. "Jailbreaking Benzene Dimers," J.
Am. Chem. Soc.
2012134, 8062-8065, DOI:10.1021/ja302597r
(2) Kossmann, S.; Neese, F. "Efficient Structure Optimization with Second-Order Many-Body Perturbation Theory: The RIJCOSX-MP2 Method," J. Chem. Theory Comput.20106, 2325-2338, DOI:10.1021/ct100199k

### InChIs

1: InChI=1S/C12H12/c1-2-6-10-9(5-1)11-7-3-4-8-12(10)11/h1-12H/t9-,10+,11-,12+
InChIKey=WMPWOGVJEXSFLI-UHFFFAOYSA-N
2: InChI=1S/C12H12/c1-2-6-10-9(5-1)11-7-3-4-8-12(10)11/h1-12H/t9-,10+,11+,12-
InChIKey=WMPWOGVJEXSFLI-IWDIQUIJSA-N
3: InChI=1S/C12H12/c1-2-4-12-10-7-5-9(6-8-10)11(12)3-1/h1-12H/t9?,10?,11-,12+
InChIKey=ONVDJSCNMCYFTI-CAODYFQJSA-N
4: InChI=1S/C12H12/c1-2-10-4-3-9(1)11-5-7-12(10)8-6-11/h1-12H
InChIKey=BCBHEUOKKNYIAT-UHFFFAOYSA-N
5: InChI=1S/C12H12/c1-2-6-10-9(5-1)11-7-3-4-8-12(10)11/h1-12H/t9-,10-,11+,12+/m1/s1
InChIKey=WMPWOGVJEXSFLI-WYUUTHIRSA-N
6: InChI=1S/C12H12/c1-2-4-12-10-7-5-9(6-8-10)11(12)3-1/h1-12H/t9?,10?,11-,12-/m0/s1
InChIKey=ONVDJSCNMCYFTI-QQFIATSDSA-N
7: InChI=1S/C12H12/c1-2-6-10-9(5-1)11-7-3-4-8-12(10)11/h1-12H/t9-,10-,11-,12+/m1/s1
InChIKey=WMPWOGVJEXSFLI-KKOKHZNYSA-N
8: InChI=1S/C12H12/c1-2-6-10-9(5-1)11-7-3-4-8-12(10)11/h1-12H/t9-,10-,11-,12-
InChIKey=WMPWOGVJEXSFLI-NQYKUJLISA-N

## Sunday, June 3, 2012

### Interpreting Protein Structural Dynamics from NMR Chemical Shifts

The effect of dynamical averaging on chemical shift predictions
In this work snapshots from MD simulations of are used as input to chemical shift predictors Sparta+, ShiftX+, Camshift, and ShiftS to compute average chemical shifts for two homologous ribonuclease H enzymes from Escherichia coli (ecRNH) and Thermus thermophilus (ttRNH).  What I especially liked about this article is not only the thoroughness: all the most popular chemical shift predictors were compared and the averaging is based on 100 ns MD simulations, one of which was tested against a 1000 ns MD simulation, but also how clearly the results were analyzed and presented.

Does averaging improve overall accuracy?
In the case of ecRNH, RMSDs from experiment computed from averaged chemical shifts are always more accurate than those computed from the x-ray structure for all nuclei and methods tested - on average by 23%, while for ttRNH the improvement is much less and for methods like ShiftX+ averaging consistently leads to slightly worse results. The resolution of the x-ray structures of ecRNH and ttRNH are 2.8 Å and 1.5 Å, respectively, so some of the improvement for ecRNH presumably derives from errors in the low resolution x-ray structure that gets fixed by the MD, and the authors describe one concrete example of this.

Identifying and analyzing interesting residues
However, as the authors point out RMSDs only tell part of the story.  One interesting analysis tool employed in the study is a plot of $|\delta_{x-ray}-\delta_{Exp}|-|\delta_{MD}-\delta_{Exp}|$ as a function of residue number to identify where averaging significantly improves the prediction and where it makes it significantly worse compared to using the x-ray structure.  Some of these cases are then chosen for further analysis to highlight the different underlying structural causes.  Here the authors use another powerful tool: a plot of the probability distribution of chemical shifts [$P(\delta)$], which in many of these cases is bi-modal, i.e. it clearly shows that the average chemical shift is not the most likely chemical shift and that it represents at least two distinct conformations.

The size of the ensemble: dynamical vs conformational averaging
Most of the averages presented in the paper are computed using 22,200 protein structures, but looking at the data makes me wonder how many conformations it really represents.  For example, in the case of Sparta+ C$_\alpha$ chemical shift predictions for ttRNH, averaging leads to significant improvements over x-ray for only about 3-4 residues out of about 145 (and worse predictions for three other residues!).

Furthermore, the peak or peaks in the $P(\delta)$-plots appear quite symmetric in the examples shown, which means that the most probable chemical shift from that peak is a good approximation to the average value computed using that peak.  I wouldn't be surprised if one could identify 10-20 ttRNH structures that would lead to equally accurate average chemical shift values.

Empirical vs QM-based chemical shift predictors
Finally, the authors note an interesting case (Phe27 C') where ShiftS, which is parameterized based on QM-data, gives a completely different $P(\delta)$-plot compared to the other methods (although an equally accurate average chemical shift value for that particular residue!).  I would love to see a detailed analysis of this issue before completely ruling out the ShiftS results.