Tuesday, October 30, 2018

Graph Convolutional Policy Network for Goal-Directed Molecular Graph Generation

Jiaxuan You, Bowen Liu, Rex Ying, Vijay Pande, Jure Leskovec (2018)
Highlighted by Jan Jensen



Ever since Alán Aspuru-Guzik and co-workers published their seminal paper, there has been a flurry of activity on generative models, which is not surprising given that they offer a radically new alternative to screening chemical libraries as a way to discover new molecules.

Almost all the new efforts on generative models have been based on adapting machine learning techniques used for natural language processing to text-based representations of molecules, i.e. SMILES strings. While very promising, the SMILES syntax has some quirks that make the strings hard to predict efficiently. One solution is to change the syntax to be more ML-friendly, but this has yet to be tested for generative models.
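To make the point about syntax concrete, here is a minimal sketch, assuming the quirks in question include the paired symbols of SMILES: branch parentheses must balance and every ring-closure digit must appear twice, so a single wrong character invalidates the whole string. The function name `smiles_syntax_ok` is hypothetical, and the check is deliberately simplified (it ignores bracket atoms such as `[13C]` and two-digit `%nn` ring closures), so it is not a substitute for a real SMILES parser.

```python
def smiles_syntax_ok(smiles):
    """Simplified check of paired symbols in a SMILES string.

    Assumption: only parentheses and single-digit ring closures are
    considered; bracket atoms and %nn closures are not handled.
    """
    depth = 0           # current nesting depth of branch parentheses
    open_rings = set()  # ring-closure digits seen an odd number of times
    for ch in smiles:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:        # closing a branch that was never opened
                return False
        elif ch.isdigit():
            open_rings ^= {ch}   # first occurrence opens, second closes
    return depth == 0 and not open_rings


for s in ["c1ccccc1", "c1ccccc", "CC(=O)O", "CC(C)(C"]:
    print(s, smiles_syntax_ok(s))
```

A model that generates the string one character at a time has to remember every open parenthesis and ring digit until it is closed, which is one way to see why a representation without such long-range bookkeeping could be attractive.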

Another option is to work with a graph (i.e. atoms and bonds) representation of the molecule, and this paper is the first I've seen that does that for an ML-based generative model. In this case the ML method is reinforcement learning, where the addition of each atom is treated as an action that can be trained towards a particular outcome, here molecules with certain properties. This approach seems to outperform the SMILES-based approaches for the prediction of some properties.
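The idea of building a molecule by successive actions on a graph can be sketched as follows. This is a toy illustration under stated assumptions, not the paper's implementation: the class `MolGraph`, the function `run_episode`, the four-element valence table, and the rule that invalid actions are simply rejected are all hypothetical, and there is no learned policy or reward here, only the environment side that applies actions and enforces valence.

```python
from dataclasses import dataclass, field

# Typical maximum valences; an illustrative subset, not the paper's vocabulary.
MAX_VALENCE = {"C": 4, "N": 3, "O": 2, "F": 1}


@dataclass
class MolGraph:
    atoms: list = field(default_factory=list)  # element symbol of each node
    bonds: dict = field(default_factory=dict)  # (i, j) with i < j -> bond order

    def used_valence(self, i):
        """Sum of bond orders of all bonds that involve atom i."""
        return sum(order for pair, order in self.bonds.items() if i in pair)

    def add_atom(self, element, attach_to=None, order=1):
        """One action: add an atom bonded to an existing atom.

        Returns True if the action was applied, False if it was rejected.
        """
        if element not in MAX_VALENCE or order > MAX_VALENCE[element]:
            return False
        if attach_to is None:
            if self.atoms:          # only the first atom may be unattached
                return False
            self.atoms.append(element)
            return True
        if not 0 <= attach_to < len(self.atoms):
            return False
        # Reject the action if the existing atom would exceed its valence.
        limit = MAX_VALENCE[self.atoms[attach_to]]
        if self.used_valence(attach_to) + order > limit:
            return False
        self.atoms.append(element)
        self.bonds[(attach_to, len(self.atoms) - 1)] = order
        return True


def run_episode(actions):
    """Apply a sequence of (element, attach_to, order) actions in turn."""
    mol, rejected = MolGraph(), 0
    for element, attach_to, order in actions:
        if not mol.add_atom(element, attach_to, order):
            rejected += 1
    return mol, rejected


# C, then C-C, then C=O, then an F on the already saturated O (rejected).
actions = [("C", None, 1), ("C", 0, 1), ("O", 1, 2), ("F", 2, 1)]
mol, rejected = run_episode(actions)
print(mol.atoms, mol.bonds, rejected)
```

In a reinforcement learning setting, a policy would choose each action and receive a reward based on the properties of the resulting molecule. Because validity is checked at every step, every intermediate graph is a chemically sensible fragment, which is harder to guarantee for a partially written SMILES string.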

The code is available here.


This work is licensed under a Creative Commons Attribution 4.0 International License.

Friday, October 12, 2018

Teaching an old carbocation new tricks: Intermolecular C–H insertion reactions of vinyl cations

Popov, S.; Shao, B.; Bagdasarian, A. L.; Benton, T. R.; Zou, L.; Yang, Z.; Houk, K. N.; Nelson, H. M., Science 2018, 361, 381
Contributed by Steven Bachrach
Reposted from Computational Organic Chemistry with permission

A recent paper by Popov, Shao, Bagdasarian, Benton, Zou, Yang, Houk, and Nelson uncovers a vinyl cation insertion reaction that once again involves dynamic effects.1

They find that vinyl triflates and cyclic vinyl triflates will react with [Ph3C]+[HCB11Cl11]− and triethylsilane to generate vinyl cations that can then be trapped through a C–H insertion reaction. For example, cyclohexenyl triflate 1 reacts in cyclohexane solvent to give the insertion product 2.


The reactions of isomers 3 and 4 give different ratios of the two products 5 and 6. In both cases, the cyclohexyl group is trapped predominantly at the site of the triflate substituent. This means that the mechanism cannot involve a cyclohexyne intermediate: 3 and 4 would give the same cyclohexyne, and the two product ratios should then be identical.


They performed molecular dynamics trajectory analysis at the M062X/6-311+G(d,p) level, starting with the two transition states leading from 3 (TS3) and 4 (TS4), the only transition states located for the insertion reaction. The structures of these TSs are shown in Figure 1.


Figure 1. M062X/6-311+G(d,p) optimized geometries of TS3 and TS4.

Trajectories starting from either TS3 or TS4 end up in two product basins, associated with 5 and 6. Thus, these transition states are ambimodal, as is typical of reactions where dynamic effects dominate. For the reaction of 3, the majority of the trajectories starting at TS3 end up as 5, consistent with the experiments. Similarly, the majority of the trajectories that start at TS4 end up as 6, again consistent with the experiments.

Once again, we see that relatively simple organic reactions do not follow simple reaction mechanisms: a single transition state leads to two different products, and the product distributions depend on reaction dynamics. This may not be too surprising for the vinyl cation insertions given the many examples provided by the Tantillo group of cation rearrangements that are controlled by reaction dynamics (see, for example, this post and this post).


References

1. Popov, S.; Shao, B.; Bagdasarian, A. L.; Benton, T. R.; Zou, L.; Yang, Z.; Houk, K. N.; Nelson, H. M., "Teaching an old carbocation new tricks: Intermolecular C–H insertion reactions of vinyl cations." Science 2018, 361, 381-387, DOI: 10.1126/science.aat5440.


InChIs

1: InChI=1S/C7H10F3O3S/c8-7(9,10)14(11,12,13)6-4-2-1-3-5-6/h4H,1-3,5H2,(H,11,12,13)
InChIKey=CMPVYBNXADJVOM-UHFFFAOYSA-N
2: InChI=1S/C12H22/c1-3-7-11(8-4-1)12-9-5-2-6-10-12/h11-12H,1-10H2
InChIKey=WVIIMZNLDWSIRH-UHFFFAOYSA-N
3: InChI=1S/C9H14F3O3S/c1-8(2)5-3-7(4-6-8)16(13,14,15)9(10,11)12/h3H,4-6H2,1-2H3,(H,13,14,15)
InChIKey=XDWBLRRAHKBZJR-UHFFFAOYSA-N
4: InChI=1S/C9H14F3O3S/c1-8(2)5-3-4-7(6-8)16(13,14,15)9(10,11)12/h4H,3,5-6H2,1-2H3,(H,13,14,15)
InChIKey=YHVCPSRICQJFDT-UHFFFAOYSA-N
5: InChI=1S/C14H26/c1-14(2)10-8-13(9-11-14)12-6-4-3-5-7-12/h12-13H,3-11H2,1-2H3
InChIKey=BZQBWUOXOYWYJC-UHFFFAOYSA-N
6: InChI=1S/C14H26/c1-14(2)10-6-9-13(11-14)12-7-4-3-5-8-12/h12-13H,3-11H2,1-2H3
InChIKey=AENMAOBTECURBO-UHFFFAOYSA-N


This work is licensed under a Creative Commons Attribution-NoDerivs 3.0 Unported License.