Finding hits among billions of molecules

Recent developments in make-on-demand molecular libraries present an interesting methodological challenge for virtual screening. Not too long ago, such a library would contain hundreds of millions or even 1 billion molecules, and it was still possible to dock a significant portion of it. However, the libraries have now grown to well beyond 20 billion molecules and show no sign of stopping. Wholesale docking cannot keep up with this growth, so new approaches are needed.

One computational approach that has kept up with the growth of make-on-demand libraries is similarity searching. It is still possible to search these enormous libraries for similar molecules in just a few minutes. 
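To make the idea of a similarity search concrete, here is a minimal sketch in plain Python. It assumes each molecule has already been reduced to a fingerprint, represented here as a set of "on" bit indices, and it uses the Tanimoto coefficient, a common similarity measure in cheminformatics. The fingerprints and molecule names are made up for illustration, and this is not necessarily the measure or the implementation used in the papers discussed below.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto (Jaccard) similarity between two sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def most_similar(query: set, library: dict, top_n: int = 2) -> list:
    """Return the top_n (similarity, name) pairs, most similar first."""
    scored = [(tanimoto(query, fp), name) for name, fp in library.items()]
    return sorted(scored, reverse=True)[:top_n]


# Hypothetical fingerprints; real ones would come from a cheminformatics toolkit.
query = {1, 4, 7, 9}
library = {
    "mol_a": {1, 4, 7, 8},
    "mol_b": {2, 3, 5},
    "mol_c": {1, 4, 7, 9, 12},
}

for similarity, name in most_similar(query, library):
    print(f"{name}: {similarity:.2f}")
```

Each comparison is only a couple of set operations, which is part of why this kind of search scales so much better than docking.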

Alon et al. use this general idea to select and dock 490 million molecules with properties similar to those of known binders to the target. Based on the docking scores they prioritised 577 molecules, of which 484 were successfully made and 127 showed good activity against the target. They then extracted 20,000 analogues of the four best candidates from among 28 billion molecules in the Enamine REAL Space make-on-demand library and docked them. The 105 best candidates were made and tested, leading to further improvement in the measured affinities.

Sadybekov et al. essentially dock the individual building blocks used in the make-on-demand library and then combine the best-scoring fragments into about 1 million molecules for a second round of docking. Using this approach they identified 80 promising candidates, of which 60 could be synthesised. Of these 60 molecules, 21 proved active. They then extracted 920 analogues of the three best candidates from among 11 billion molecules in the Enamine REAL Space make-on-demand library and docked them. The 121 best candidates were made and tested, leading to further improvement in the measured affinities.
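For a quick sense of scale, the first-round hit rates can be computed directly from the counts quoted above. This is just arithmetic on the numbers in this post (actives divided by molecules actually made); the dictionary layout and function name are my own, and what counts as "active" follows each paper's own criteria.

```python
# Counts as quoted in this post: molecules synthesised and molecules found active.
campaigns = {
    "Alon et al.": {"made": 484, "active": 127},
    "Sadybekov et al.": {"made": 60, "active": 21},
}


def hit_rate(made: int, active: int) -> float:
    """Fraction of synthesised molecules that turned out to be active."""
    return active / made


for name, counts in campaigns.items():
    print(f"{name}: {hit_rate(**counts):.0%} of synthesised molecules were active")
```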

There are several take-home messages here.

The percentage of compounds in a library that are active against a particular target is very small, so you don't get a lot of useful hits until you work with these enormous libraries.

Docking does help in identifying active compounds. Docking has a bad rep in certain circles, and I have seen several people refer to docking programs as "random number generators", but studies like these show that this is not the case. Sure, if one expects an excellent, or even respectable, correlation coefficient between docking scores and binding affinities, one will be sorely disappointed. However, as these studies show, molecules with good docking scores have a much higher chance of being active than molecules with bad docking scores.

The success rate seems to be about 30-50%, depending on the target. So if you are at the lower end and only able to make and test a handful of candidates (which is often the case for academic studies), there's a reasonable chance you won't find any actives and will conclude that docking is useless. It's only when you are able to make and test dozens of molecules that you see that docking is working for you. The make-on-demand libraries now make such numbers feasible for academics.
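A back-of-the-envelope calculation shows why the number of tested molecules matters so much. The sketch below assumes, as a simplification, that every tested candidate is active independently with the same probability, so the chance of finding no actives among n candidates is (1 - p)^n. The function name and the example values of n are mine.

```python
def prob_no_actives(hit_rate: float, n_tested: int) -> float:
    """Chance that none of n_tested candidates is active.

    Assumes each candidate is active independently with probability hit_rate.
    """
    return (1.0 - hit_rate) ** n_tested


# A handful of candidates versus a couple of dozen, at a 30% hit rate.
for n in (3, 5, 24):
    print(f"{n:>2} tested: {prob_no_actives(0.3, n):.2%} chance of zero actives")
```

Under these assumptions, testing three candidates at a 30% hit rate comes up empty about a third of the time, while testing two dozen almost never does.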

Finally, several of the co-authors on the two papers I highlight are Ukrainian and are, along with their families and friends, likely in grave danger right now as their country is being attacked by Putin and his ilk. 

