## Saturday, March 10, 2012

### The Energy Computation Paradox and ab initio Protein Folding

J. C. Faver, M. L. Benson, X. He, B. P. Roberts, B. Wang, M. S. Marshall, C. D. Sherrill, K. M. Merz Jr. PLoS ONE 2011, 6(4): e18868 (Open Access)

Protein structure prediction faces two challenges: sampling the vast conformational space and computing accurate energies for the structures. Actually, as this paper nicely demonstrates, the latter part of this statement is inaccurate (pun intended): the main challenge is computing precise energies for the structures.

The accuracy of an energy function reflects its systematic error, which can be corrected and largely cancels between related structures. The precision reflects its random error, which cannot be corrected and must therefore be kept low.
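To make the distinction concrete, here is a minimal error-propagation sketch. It rests on an assumption that is not spelled out in this post: the energy of a structure is a sum of $N$ interaction energies whose errors are independent, each with mean $\mu$ and variance $\sigma^2$. Under that assumption,

$$
\mathrm{Error}_{\mathrm{systematic}} = N\mu, \qquad \mathrm{Error}_{\mathrm{random}} = \sqrt{N}\,\sigma .
$$

For two related structures A and B, the systematic error in $E_A - E_B$ is then $(N_A - N_B)\mu$, which is small when the interaction counts are similar, whereas the random error is $\sqrt{N_A + N_B}\,\sigma$, which does not cancel.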

The authors first determine the accuracy and precision of 20 different energy functions, ranging from force fields to MP2 calculations with triple-zeta basis sets, for small gas-phase models of 42 van der Waals interactions and 50 hydrogen bonds in the protein ubiquitin. The reference values are MP2/CBS [or CCSD(T)/CBS for systems containing aromatic groups] calculations. From this we learn, for example, that the mean error of interaction is 0.73 kcal/mol for the FF99SB force field and 1.67 kcal/mol for PM6.

FF99SB and PM6, as well as PM6-DH2, are then used to compute energies for the Rosetta decoy set consisting of 49 different proteins. FF99SB fails to predict the lowest energy for the native structure for 19 proteins, whereas PM6 (and PM6-DH2) identifies the native structure in all cases, despite FF99SB being more accurate. The reason is that FF99SB, with a variance of 4.04 kcal$^2$/mol$^2$, is less precise than PM6, with a variance of 2.24 kcal$^2$/mol$^2$. For comparison, PM6-DH2 has a mean error of interaction of only 0.30 kcal/mol and a variance of 1.23 kcal$^2$/mol$^2$.

The authors also convincingly demonstrate how the random error increases with protein size, so that correctly folding larger proteins puts more demands on the precision of the energy function.
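The size dependence can be illustrated with a short Python sketch. It is not the authors' code: it assumes that per-interaction errors are independent, so that the random error grows as the square root of the number of interactions, and the interaction counts (100, 400, 1600) are hypothetical values chosen only for illustration. The variances are the ones quoted above for the force field, PM6, and PM6-DH2.

```python
import math

# Per-interaction error variances quoted in the post (kcal^2/mol^2).
VARIANCES = {"force field": 4.04, "PM6": 2.24, "PM6-DH2": 1.23}


def propagate_errors(n_interactions, mean_error, variance):
    """Return (systematic, random) error in kcal/mol for a sum of interactions.

    Assumption: the errors of the individual interactions are independent,
    so means add linearly and variances add linearly.
    """
    systematic = n_interactions * mean_error
    random = math.sqrt(n_interactions * variance)
    return systematic, random


def random_error_table(sizes):
    """Random error for each energy function at each (hypothetical) size."""
    return {
        name: [propagate_errors(n, 0.0, var)[1] for n in sizes]
        for name, var in VARIANCES.items()
    }


if __name__ == "__main__":
    sizes = [100, 400, 1600]  # hypothetical interaction counts
    for name, errors in random_error_table(sizes).items():
        row = ", ".join(f"N={n}: {e:.1f}" for n, e in zip(sizes, errors))
        print(f"{name:12s} {row}")
```

Under these assumptions, quadrupling the number of interactions doubles the random error for every energy function, so the gap in precision between methods widens in absolute terms as the protein grows.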

Acknowledgment: I thank Kenneth Merz, Jr for helpful discussions.