Sunday, June 25, 2017

A Deep Neural Network with Minimal Chemistry Knowledge Matches the Performance of Expert-developed QSAR/QSPR Models

Highlighted by Jan Jensen

Figure 1: The key difference in using deep learning algorithms as a machine learning tool as opposed to a “machine intelligence” tool is the assistance, augmentation, and possible replacement of human-led tasks like feature engineering in computational chemistry.

A lot of machine learning research in chemistry is focussed on finding the best descriptors for the property of interest. This paper shows that simply using 2D images of molecules as input leads to predictions of solvation free energies, in vitro HIV activity, and in vivo toxicity that are about as accurate as those of expert-developed QSAR/QSPR models.

This seems to me an appropriate "null model" that all machine learning studies should include. Another option would be SMILES strings or some representation thereof. If your fancy descriptor doesn't lead to significantly better predictions than such a baseline, then it's back to the drawing board.
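To make the SMILES option concrete, here is a minimal sketch of one possible "representation thereof": a character-level one-hot encoding that uses no chemistry knowledge beyond the string itself. This is my own illustration, not the paper's method (the paper uses 2D images), and the function names `build_vocab` and `one_hot_smiles` are hypothetical. Note that splitting by character is a simplification: two-letter element symbols such as Cl or Br would be broken into two tokens.

```python
def build_vocab(smiles_list):
    """Map every character seen in the SMILES strings to an integer index."""
    chars = sorted({ch for s in smiles_list for ch in s})
    return {ch: i for i, ch in enumerate(chars)}


def one_hot_smiles(smiles, vocab, max_len):
    """Encode one SMILES string as a max_len x len(vocab) matrix of 0/1 values.

    Rows beyond the end of the string are left as all zeros (padding).
    """
    if len(smiles) > max_len:
        raise ValueError("SMILES string is longer than max_len")
    matrix = [[0] * len(vocab) for _ in range(max_len)]
    for pos, ch in enumerate(smiles):
        matrix[pos][vocab[ch]] = 1
    return matrix


# Three small example molecules: ethanol, benzene, acetic acid.
smiles_list = ["CCO", "c1ccccc1", "CC(=O)O"]
vocab = build_vocab(smiles_list)
max_len = max(len(s) for s in smiles_list)

# Each molecule becomes a fixed-size matrix that a generic model can consume.
encoded = [one_hot_smiles(s, vocab, max_len) for s in smiles_list]
```

Matrices like these could be fed to any off-the-shelf model and the resulting accuracy used as the baseline that a hand-crafted descriptor has to beat.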

The manuscript doesn't mention code availability, but one of the co-authors tells me that they plan to make the code available when it is ready.