## Saturday, June 23, 2012

### Accurate Predictions of Nonpolar Solvation Free Energies Require Explicit Consideration of Binding-Site Hydration

ab initio prediction of protein-ligand binding free energies
Fast computers and clever algorithms have made it possible to calculate protein-ligand binding energies ab initio using levels of theory that have been shown to give accurate gas-phase binding energies.  PCM and closely related approaches like COSMO, which are the solvation methods of choice for most ab initio studies, have now also been implemented in a way that allows for calculations on protein-sized systems.  However, the combined use of ab initio methods and PCM (without any further fitting) typically results in absolute protein-ligand binding energies that bear little resemblance to experimental values.

Trouble from an unexpected source
Given the complexity of the problem, there are of course many possible sources of error, but in 2010 Ryde and co-workers$^1$ identified one major source of error that surprised me.  They used thermodynamic integration (TI) to compute the non-polar solvation free energy contribution $(\Delta G_{np})$ to the binding free energy of benzene to the T4 lysozyme mutant L99A and showed that the $\Delta G_{np}$ computed by PCM was off by almost 50 kJ/mol!  For comparison, an approach based on the solvent accessible surface area (SASA) came within 4 kJ/mol of the TI result.
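To make the SASA-based estimate concrete, here is a minimal sketch of how such a term is commonly evaluated: each species gets $G_{np} = \gamma \cdot \mathrm{SASA} + b$, and the binding contribution is the difference between complex, protein, and ligand.  The values of `GAMMA` and `B` are widely used MM/PBSA-style defaults that I am assuming here, not numbers taken from this post, and the SASA values in the example are purely hypothetical.

```python
# Sketch of a SASA-based nonpolar solvation term (assumed functional form:
# G_np = gamma * SASA + b). Parameter values are assumptions, not from the post.
GAMMA = 0.0227  # kJ/mol/A^2, surface-tension-like coefficient (assumed)
B = 3.85        # kJ/mol, constant offset (assumed)


def g_np(sasa, gamma=GAMMA, b=B):
    """Nonpolar solvation free energy (kJ/mol) of one species from its SASA (A^2)."""
    return gamma * sasa + b


def delta_g_np(sasa_complex, sasa_protein, sasa_ligand, gamma=GAMMA, b=B):
    """Nonpolar contribution to binding: G_np(PL) - G_np(P) - G_np(L)."""
    return (
        g_np(sasa_complex, gamma, b)
        - g_np(sasa_protein, gamma, b)
        - g_np(sasa_ligand, gamma, b)
    )


if __name__ == "__main__":
    # Hypothetical numbers for a fully buried pocket: the protein surface seen
    # by the solvent is the same with and without the ligand.
    print(delta_g_np(sasa_complex=8000.0, sasa_protein=8000.0, sasa_ligand=250.0))
```

With a fully buried pocket the protein and complex terms cancel, so in this model $\Delta G_{np}$ reduces to minus the nonpolar solvation energy of the free ligand.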

As both experiment and calculations suggested that the binding pocket was solvent free in the ligand-free state, the PCM energy of the apoprotein was recomputed with a "dummy" benzene molecule in the pocket to yield a $\Delta G_{np}$ energy within 2 kJ/mol of the TI results.
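Written out, and as I read the description above, the dummy-ligand correction only changes the reference state of the protein.  The labels P, L, PL, and "P+dummy" are my own shorthand, not notation from the paper:

$$
\begin{aligned}
\Delta G_{np} &= G_{np}(\mathrm{PL}) - G_{np}(\mathrm{P}) - G_{np}(\mathrm{L}) \\
\Delta G_{np}^{\mathrm{dummy}} &= G_{np}(\mathrm{PL}) - G_{np}(\mathrm{P{+}dummy}) - G_{np}(\mathrm{L})
\end{aligned}
$$

Here $G_{np}(\mathrm{P{+}dummy})$ is the non-polar energy of the ligand-free protein evaluated with the dummy ligand occupying the pocket, so the continuum solvent is kept out of a cavity that is empty in reality.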

Exposed binding pockets are even trickier
In the current paper, Ryde and co-workers perform similar comparisons for four additional protein-ligand complexes with varying degrees of solvent exposure of the empty binding pocket.  The dummy-ligand approach works surprisingly well for the two proteins with the most buried binding pockets (the $\Delta G_{np}$ values are within 2.4 kJ/mol of the TI results).  "Surprisingly" because both experiment and MD calculations indicate that several water molecules are displaced from the binding pocket upon ligand binding.

The dummy-ligand approach does not work well for the more exposed binding pockets, with deviations from the TI results as large as 73 kJ/mol for galectin-3.  While this is not too surprising, the traditional (non-dummy-ligand) approach is not much better, with an error of 58 kJ/mol for galectin-3.  For comparison, the SASA-based prediction is in error by 28 kJ/mol: better, but not really useful.  Clearly, treating water molecules bound in very solvent-exposed pockets as a continuum is not a good approximation with the current models of non-polar solvation.

New approaches needed
The question now is whether a hybrid PCM/explicit solvation approach is needed or whether the problem can be fixed by reformulating and/or reparameterizing the non-polar solvation energy terms.  For the latter approach, the authors indicate that the cavitation term may present a particular challenge, as it includes the entropy change due to the water molecules displaced upon binding.

References
1. S. Genheden, J. Kongsted, P. Söderhjelm, and U. Ryde, J. Chem. Theory Comput. 2010, 6, 3558. DOI: 10.1021/ct100272s