Sunday, October 30, 2016

Automatic chemical design using a data-driven continuous representation of molecules


Rafael Gómez-Bombarelli, David Duvenaud, José Miguel Hernández-Lobato, Jorge Aguilera-Iparraguirre, Timothy D. Hirzel, Ryan P. Adams, and Alán Aspuru-Guzik (2016)
Contributed by Jan Jensen



Chemical space is discrete, which makes it hard to search with standard techniques such as gradient-based minimisation.  This paper uses a standard machine learning tool called an autoencoder to help solve that problem.  One way to think of an autoencoder is as a data compressor: one neural network is trained to encode each item in a data set, such as an image, in some compressed representation, and another network is trained to recover the item from the compressed format.
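To make the compress-and-recover idea concrete, here is a minimal sketch of a linear autoencoder in plain Python. It is not the architecture used in the paper: the toy data, the single latent number, and the names `encode`, `decode` and `train` are all assumptions made for illustration. The encoder squeezes 2-D points into one number and the decoder expands that number back into 2-D, with both sets of weights fitted together by gradient descent on the reconstruction error.

```python
def encode(enc, x):
    """Compress a point to a single latent number (a 1-D latent space)."""
    return sum(w * xi for w, xi in zip(enc, x))


def decode(dec, z):
    """Expand a latent number back into the original number of dimensions."""
    return [w * z for w in dec]


def reconstruction_error(enc, dec, data):
    """Mean squared distance between each point and its reconstruction."""
    total = 0.0
    for x in data:
        x_hat = decode(dec, encode(enc, x))
        total += sum((a - b) ** 2 for a, b in zip(x_hat, x))
    return total / len(data)


def train(data, steps=2000, lr=0.01):
    """Fit encoder and decoder weights jointly by plain gradient descent."""
    dim = len(data[0])
    enc = [0.1] * dim  # arbitrary small starting weights (an assumption)
    dec = [0.1] * dim
    n = len(data)
    for _ in range(steps):
        g_enc = [0.0] * dim
        g_dec = [0.0] * dim
        for x in data:
            z = encode(enc, x)
            err = [d * z - xi for d, xi in zip(dec, x)]
            dz = sum(2 * e * d for e, d in zip(err, dec))  # dLoss/dz
            for i in range(dim):
                g_dec[i] += 2 * err[i] * z / n
                g_enc[i] += dz * x[i] / n
        enc = [w - lr * g for w, g in zip(enc, g_enc)]
        dec = [w - lr * g for w, g in zip(dec, g_dec)]
    return enc, dec


# Toy data: 2-D points that all lie on the line y = 2x, so one number suffices.
data = [[x, 2 * x] for x in (-1.0, -0.5, 0.0, 0.5, 1.0)]
enc, dec = train(data)
```

Because the toy points lie exactly on a line, a single latent number is enough to reconstruct them. Real data rarely compresses this cleanly, so some reconstruction error normally remains.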

The interesting thing in the context of chemical space is that the compressed format can be continuous, for example a real-valued vector in a so-called latent space. (Another use of autoencoders is dimensionality reduction for data visualization, e.g. as an alternative to principal component analysis.)  This latent space is therefore a continuous representation of the chemical space (a set of SMILES strings) that the autoencoder was trained on.  Another neural net can then be trained to map some chemical property, such as logP, onto this latent space, and the space can be searched for regions with desired logP values with techniques as simple as interpolation.
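The interpolation search can be sketched in a few lines of Python. Everything here is hypothetical: `toy_property` is a made-up linear function standing in for a trained property predictor, and the 2-D latent vectors are invented for illustration rather than taken from the paper. The sketch walks along the straight line between two latent points and keeps the point whose predicted property is closest to a target value.

```python
def interpolate(z_start, z_end, steps):
    """Return `steps` evenly spaced latent points from z_start to z_end, inclusive."""
    return [
        [a + (b - a) * t / (steps - 1) for a, b in zip(z_start, z_end)]
        for t in range(steps)
    ]


def toy_property(z):
    """Hypothetical stand-in for a trained property predictor (e.g. for logP)."""
    return 1.5 * z[0] - 0.5 * z[1]


def search_along_line(z_start, z_end, target, steps=11):
    """Pick the interpolated point whose predicted property is closest to target."""
    candidates = interpolate(z_start, z_end, steps)
    return min(candidates, key=lambda z: abs(toy_property(z) - target))


# Search between two invented latent points for a predicted value of 1.0.
best = search_along_line([0.0, 0.0], [2.0, 2.0], target=1.0)
```

In a real workflow the selected latent point would then be passed to the decoder to obtain a candidate SMILES string.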

One problem with autoencoders is that they are "lossy", which in this case means that not all points in latent space can be decoded to a valid molecule (SMILES string).  However, the failure rate is relatively low for the two proof-of-concept applications in the paper.
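A decoding failure rate of this kind could be estimated roughly as follows. This is only a sketch under strong assumptions: `looks_like_valid_smiles` is a crude, hypothetical syntactic screen (balanced branch parentheses and paired single-digit ring closures), not a real SMILES parser, and the decoded strings are invented examples. A proper check would hand each string to a cheminformatics toolkit.

```python
def looks_like_valid_smiles(s):
    """Crude screen: non-empty, balanced parentheses, paired ring-closure digits.

    Passing this check is necessary for simple SMILES strings but far from
    sufficient; it ignores valence, bracket atoms and two-digit ring closures.
    """
    if not s:
        return False
    depth = 0
    open_rings = set()
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # a branch was closed before it was opened
                return False
        elif ch.isdigit():
            # First occurrence opens a ring, second occurrence closes it.
            open_rings.symmetric_difference_update({ch})
    return depth == 0 and not open_rings


def failure_rate(decoded):
    """Fraction of decoded strings that fail the crude validity screen."""
    failures = sum(1 for s in decoded if not looks_like_valid_smiles(s))
    return failures / len(decoded)


# Invented decoder outputs: the last two have an unclosed ring and an unclosed branch.
decoded = ["CCO", "c1ccccc1", "CC(=O)O", "c1ccccc", "CC(C"]
rate = failure_rate(decoded)
```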

This is a very interesting new tool in the hunt for molecules with new properties.


This work is licensed under a Creative Commons Attribution 4.0
