Sunday, February 26, 2017

Towards full Quantum Mechanics based Protein-Ligand Binding Affinities

Stephan Ehrlich, Andreas H. Göller, and Stefan Grimme (2017)
Contributed by Jan Jensen

Ehrlich et al. present absolute binding free energies for activated serine protease factor X (FXa) and tyrosine-protein kinase 2 predicted using DFT. Here I'll focus on FXa. The calculations are based on truncated model systems consisting of ca. 1000 atoms. The geometries are optimised using HF-3c/C-PCM with select constraints, the RRHO free energy correction is computed with DFTB3-D3, the electronic energy with PBE-3c, and the solvation free energy with COSMO-RS and PBE0/def-SVP. The energy terms are simply added together to give a total free energy, and the binding free energy is simply the change in free energy upon binding, without any additional corrections.
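The additive scheme described above can be sketched in a few lines of Python. This is only an illustration of the bookkeeping, not the authors' code: the class and function names are hypothetical, all terms are assumed to be in kcal/mol, and the numbers in the usage example are made up and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FreeEnergyTerms:
    """Energy terms for one species in kcal/mol (names are hypothetical)."""
    electronic: float  # electronic energy (PBE-3c in the paper)
    rrho: float        # RRHO free energy correction (DFTB3-D3 in the paper)
    solvation: float   # solvation free energy (COSMO-RS in the paper)

    def total(self) -> float:
        # The terms are simply added together to give a total free energy.
        return self.electronic + self.rrho + self.solvation


def binding_free_energy(complex_: FreeEnergyTerms,
                        protein: FreeEnergyTerms,
                        ligand: FreeEnergyTerms) -> float:
    """Change in total free energy upon binding, with no further corrections."""
    return complex_.total() - protein.total() - ligand.total()


# Made-up numbers, purely to show the arithmetic (not from the paper).
example_complex = FreeEnergyTerms(electronic=-1050.0, rrho=520.0, solvation=-310.0)
example_protein = FreeEnergyTerms(electronic=-900.0, rrho=480.0, solvation=-300.0)
example_ligand = FreeEnergyTerms(electronic=-100.0, rrho=25.0, solvation=-20.0)

example_dg = binding_free_energy(example_complex, example_protein, example_ligand)
print(f"Binding free energy (made-up inputs): {example_dg:.1f} kcal/mol")
```

Keeping the three terms separate per species makes it easy to see which contribution dominates a given error, for example the solvation term for a charged ligand.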

The MAD is similar to that found for host-guest complexes, but there are clearly some outliers. The authors ascribe L19 and L27 to errors in the structures due to HF-3c artefacts, while L23 is ascribed to the movement of a crystal water molecule, and L10 is the only charged ligand, for which the error in the solvation free energy is likely higher. The error is below 1.5 kcal/mol for 14 of the 25 ligands.

Clearly there is room for improvement, but I do think the results are quite encouraging. An MM-PB(GB)SA study in which five different solvation models are tested for the same ligands found maximum $r$ values of 0.28 and 0.60 using ensemble-averaged and energy-minimised structures, respectively. Furthermore, the same study determined the relative binding free energies using thermodynamic integration (TI), which is generally considered the current gold standard in drug design, for five ligand pairs (see table, energies in kcal/mol). Given that there are only five points, any statistical analysis of the accuracy would be suspect, but I don't think TI can be said to outperform DFT.

Ligands    DFT     TI    exp
5->18      4.0   -0.4    1.1
5->12      2.1    0.4    0.4
5->21      7.5    1.3    4.4
5->17      4.1   -0.2   -0.3
5->24      4.1    0.4    3.6
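For readers who want to put numbers on the comparison, here is a minimal Python sketch that computes the mean absolute deviation (MAD) and the Pearson $r$ of the DFT and TI values against experiment, using only the five rows of the table. The variable and function names are my own, and with five points the results are indicative at best.

```python
from math import sqrt

# (ligand pair, DFT, TI, exp): relative binding free energies in kcal/mol,
# copied from the table above.
TABLE = [
    ("5->18", 4.0, -0.4, 1.1),
    ("5->12", 2.1, 0.4, 0.4),
    ("5->21", 7.5, 1.3, 4.4),
    ("5->17", 4.1, -0.2, -0.3),
    ("5->24", 4.1, 0.4, 3.6),
]


def mean_absolute_deviation(pred, ref):
    """Average of |prediction - reference| over all pairs."""
    return sum(abs(p - r) for p, r in zip(pred, ref)) / len(ref)


def pearson_r(x, y):
    """Pearson correlation coefficient, computed from deviations about the mean."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


dft = [row[1] for row in TABLE]
ti = [row[2] for row in TABLE]
exp = [row[3] for row in TABLE]

mad_dft = mean_absolute_deviation(dft, exp)
mad_ti = mean_absolute_deviation(ti, exp)
r_dft = pearson_r(dft, exp)
r_ti = pearson_r(ti, exp)

print(f"DFT: MAD = {mad_dft:.2f} kcal/mol, r = {r_dft:.2f}")
print(f"TI:  MAD = {mad_ti:.2f} kcal/mol, r = {r_ti:.2f}")
```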

The real question is whether the DFT results can be systematically improved, and the main sticking point here will be the solvation free energy, especially for charged ligands. The continuum model ultimately relies on a fit to experimental data, so there is some degree of empiricism that is hard to remove. In principle the empiricism can be removed by adding explicit water molecules, but then the question is how to deal with the sampling in a cost-effective way.

This work is licensed under a Creative Commons Attribution 4.0