Wednesday, April 11, 2018

A Quintuple [6]Helicene with a Corannulene Core as a C5-Symmetric Propeller-Shaped π-System

Kato, K.; Segawa, Y.; Scott, L. T.; Itami, K., Angew. Chem. Int. Ed. 2018, 57, 1337-1341
Contributed by Steven Bachrach
Reposted from Computational Organic Chemistry with permission

Corannulene 1 is an interesting aromatic compound because it is nonplanar, having a bowl shape. [6]Helicene 2 is an interesting aromatic compound because it is nonplanar, having the shape of a helix. Kato, Segawa, Scott and Itami have joined these together to synthesize the interesting quintuple helicene compound 3.¹
The optimized structure of 3 is shown in Figure 1. The authors used computations to corroborate two experimental findings. First, the NMR spectrum of 3 shows a small number of signals, indicating that the bowl inversion should be rapid. The molecule has C5 symmetry due to the bowl shape of the corannulene core. Rapid inversion makes the molecule effectively D5. (The inversion transition state is of D5 symmetry, and would be a nice quiz question for those looking for molecules of unusual point groups.) The B3LYP/6-31G(d) computed bowl inversion barrier is only 1.9 kcal mol⁻¹, significantly less than the bowl inversion barrier of 1: 10.4 kcal mol⁻¹. This reduction is partly due to the shallower bowl depth of 3 (0.572 Å in the X-ray structure, 0.325 Å in the computed structure) compared with 1 (0.87 Å).
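
To get a feel for what the two bowl inversion barriers imply, here is a rough sketch that converts the barrier difference into a relative rate using a simple Arrhenius/Eyring-type exponential. It assumes that both barriers can be treated as activation free energies and that the prefactors are identical, neither of which is stated in the paper; the function name `relative_rate` is made up for this illustration.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def relative_rate(barrier_low, barrier_high, temperature_k=298.15):
    """Ratio k_low / k_high for two barriers in kcal/mol.

    Assumption: identical prefactors, so only the exponential term differs.
    """
    return math.exp((barrier_high - barrier_low) / (R_KCAL * temperature_k))


# Barriers quoted above: 1.9 kcal/mol for 3 and 10.4 kcal/mol for corannulene 1.
ratio = relative_rate(1.9, 10.4)
print(f"bowl inversion of 3 vs 1, relative rate: {ratio:.1e}")
```

Under these assumptions the inversion of 3 comes out faster than that of 1 by a factor on the order of 10^6 at room temperature, which is consistent with the simple NMR spectrum.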

Figure 1. Optimized structure of 3.

Second, they took the enhanced MMMMM-isomer and heated it to obtain the thermodynamic parameters for the inversion to the PPPPP-isomer. (The PPPPP-isomer is shown in the top scheme.) The experimental values are ΔH = 36.8 kcal mol⁻¹, ΔS = 8.7 cal mol⁻¹ K⁻¹, and ΔG = 34.2 kcal mol⁻¹ at 298 K. They computed all of the stereoisomers of 3 along with the transition states connecting them. The largest barrier is found in going from MMMMM-3 to MMMMP-3, with a computed barrier of 34.5 kcal mol⁻¹, in nice agreement with experiment.
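
The three experimental values quoted above are tied together by ΔG = ΔH − TΔS. The short check below reproduces the quoted ΔG from ΔH and ΔS; the only subtlety is the unit conversion, since ΔS is given in cal rather than kcal. The function name `gibbs_energy` is just a label chosen for this sketch.

```python
def gibbs_energy(delta_h_kcal, delta_s_cal, temperature_k=298.0):
    """Return dG in kcal/mol from dH (kcal/mol) and dS (cal/mol/K)."""
    return delta_h_kcal - temperature_k * delta_s_cal / 1000.0


dg_exp = gibbs_energy(36.8, 8.7)   # experimental dH and dS from the text
dg_calc = 34.5                     # computed barrier quoted in the text
print(f"dG(298 K) = {dg_exp:.1f} kcal/mol")
print(f"computed minus experimental = {dg_calc - dg_exp:.1f} kcal/mol")
```

The entropy term contributes only about 2.6 kcal mol⁻¹ at 298 K, and the computed barrier differs from the experimental ΔG by roughly 0.3 kcal mol⁻¹.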


1. Kato, K.; Segawa, Y.; Scott, L. T.; Itami, K., "A Quintuple [6]Helicene with a Corannulene Core as a C5-Symmetric Propeller-Shaped π-System." Angew. Chem. Int. Ed. 2018, 57, 1337-1341, DOI: 10.1002/anie.201711985.


1: InChI=1S/C20H10/c1-2-12-5-6-14-9-10-15-8-7-13-4-3-11(1)16-17(12)19(14)20(15)18(13)16/h1-10H
2: InChI=1S/C26H16/c1-3-7-22-17(5-1)9-11-19-13-15-21-16-14-20-12-10-18-6-2-4-8-23(18)25(20)26(21)24(19)22/h1-16H
3: InChI=1S/C80H40/c1-11-31-51-41(21-1)42-22-2-12-32-52(42)62-61(51)71-63-53-33-13-3-23-43(53)44-24-4-14-34-54(44)64(63)73-67-57-37-17-7-27-47(57)48-28-8-18-38-58(48)68(67)75-70-60-40-20-10-30-50(60)49-29-9-19-39-59(49)69(70)74-66-56-36-16-6-26-46(56)45-25-5-15-35-55(45)65(66)72(62)77-76(71)78(73)80(75)79(74)77/h1-40H

This work is licensed under a Creative Commons Attribution-NoDerivs 3.0 Unported License.

Friday, March 30, 2018

Planning chemical syntheses with deep neural networks and symbolic AI

Marwin H. S. Segler, Mike Preuss, Mark P. Waller (2018)
Highlighted by Jan Jensen

Figure 1 from the paper. Copyright 2018 Springer Nature

The paper uses a Monte Carlo tree search (MCTS) algorithm (also used in AlphaGo Zero) to suggest retrosynthetic routes that were just as good as those proposed by expert organic chemists. Remarkably, the underlying "expert knowledge" is automatically extracted from reaction databases into three neural networks. Thus, the method is referred to as 3N-MCTS.

At the core of this approach are two neural networks that can predict the probability of a molecule undergoing one of either 301,671 or 17,134 chemical transformations, the latter being more computationally efficient than the former. The networks were trained on transformation rules from 12.4 million single-step reactions from the Reaxys chemistry database, i.e., the rules were determined automatically without human intervention.

The retrosynthetic "game" is won if the target molecule can be completely decomposed into predefined precursor molecules within 25 retrosynthetic steps, where the 50 most probable chemical transformations are considered for each step. It is not practically possible to test all $50^{25} \approx 3 \times 10^{42}$ possible retrosynthetic paths, so an MCTS is used to search for the best path.

An MCTS starts by evaluating a number of paths randomly and then assigning likelihood scores to the early parts of the paths depending on whether the paths lead to winners or not. The process is then repeated, except that the early steps in the path are chosen based on likelihood scores, which are continuously updated and added to unscored steps. The changing likelihood scores mean that the search for new paths is directed towards the more promising areas of the path tree. I have given a short illustration of the process here. The process is repeated for a given number of steps and the path with the best set of likelihood scores is selected.
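
The loop described above can be sketched in a few lines of Python. This is a generic textbook MCTS with UCB1 selection on a toy problem, not the 3N-MCTS of the paper: there are no neural networks, the "game" is simply to find one hidden sequence of moves, and all names (`Node`, `mcts`, `target`) are invented for the illustration.

```python
import math
import random


class Node:
    def __init__(self, path, parent=None):
        self.path = path        # moves taken from the root to this node
        self.parent = parent
        self.children = {}      # move -> Node
        self.visits = 0
        self.wins = 0.0


def mcts(target, n_moves, n_iter=2000, c=1.4, seed=0):
    """Search for the winning move sequence `target` in a toy path tree."""
    rng = random.Random(seed)
    depth = len(target)
    root = Node(())
    for _ in range(n_iter):
        node = root
        # 1. Selection: follow the best UCB1 score through fully expanded nodes.
        while len(node.path) < depth and len(node.children) == n_moves:
            parent = node
            node = max(
                parent.children.values(),
                key=lambda ch: ch.wins / ch.visits
                + c * math.sqrt(math.log(parent.visits) / ch.visits),
            )
        # 2. Expansion: add one untried move, unless the node is terminal.
        if len(node.path) < depth:
            untried = [m for m in range(n_moves) if m not in node.children]
            move = rng.choice(untried)
            node.children[move] = Node(node.path + (move,), parent=node)
            node = node.children[move]
        # 3. Rollout: finish the path with random moves and check for a win.
        path = list(node.path)
        while len(path) < depth:
            path.append(rng.randrange(n_moves))
        reward = 1.0 if tuple(path) == tuple(target) else 0.0
        # 4. Backpropagation: update the scores of all steps on the path.
        while node is not None:
            node.visits += 1
            node.wins += reward
            node = node.parent
    # Read off the most visited path from the root.
    best, node = [], root
    while node.children:
        move, node = max(node.children.items(), key=lambda kv: kv[1].visits)
        best.append(move)
    return tuple(best)


print(mcts(target=(2, 0, 1), n_moves=3))
```

In this toy tree there are only 27 paths, so exhaustive search would be trivial; the point is only to show how the scores on early steps steer later sampling. In the paper the random choices are replaced by neural network predictions.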

One of the tests of the method was a double-blind study where experienced synthetic chemists were asked to choose between retrosynthetic routes developed by experts and by 3N-MCTS. The study found no clear preference!

I couldn't find any information about code availability.

Tuesday, March 27, 2018

Beyond optical rotation: what’s left is not always right in total synthesis

Joyce, L. A.; Nawrat, C. C.; Sherer, E. C.; Biba, M.; Brunskill, A.; Martin, G. E.; Cohen, R. D.; Davies, I. W., Chem. Sci. 2018, 9, 415
Contributed by Steven Bachrach
Reposted from Computational Organic Chemistry with permission

The structure of (+)-frondosin B 1 has been the subject of some concern. The compound has been synthesized by a number of research groups with the expected R isomer as the target. However, the Danishefsky¹ and MacMillan² syntheses led to a molecule with [α]D of about +16°, while Trauner³ reports a value of -16.8° and Ovaska⁴ prepared the S isomer with [α]D = -17.3°. Something is amiss here.

Joyce and coworkers have looked into this structure problem through a combination of advanced analytical techniques and computational chemistry.⁵ They utilize optical activity, electronic circular dichroism (ECD) and vibrational circular dichroism (VCD) and compare the experiments with computational results. IR and VCD were computed at B3LYP/6-31G** using a Boltzmann-weighted set of low-energy conformations. ECD computations were done at CAM-B3LYP/6-31++G**//B3LYP/6-31G**.

Basically, they found that (+)-frondosin B does have the R stereocenter. The different synthetic schemes did actually all lead to the same isomer, tested by looking at key intermediates along the way. The discrepancy in the optical activity is due to a small impurity, 2, that has the opposite rotation and a magnitude 10 times greater than that of authentic 1.
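
To illustrate why a minor component can matter so much, here is a back-of-the-envelope sketch. It assumes that the observed specific rotation is a simple mass-fraction-weighted average of the two components, takes +16° for authentic 1 from the values quoted above, and takes -160° for the impurity purely from the "opposite sign, 10 times greater" statement. These numbers and the function names are illustrative assumptions, not data from the paper.

```python
def observed_rotation(frac_impurity, rot_main=16.0, rot_impurity=-160.0):
    """Specific rotation of a mixture, assuming linear additivity."""
    return (1.0 - frac_impurity) * rot_main + frac_impurity * rot_impurity


def fraction_for_rotation(target, rot_main=16.0, rot_impurity=-160.0):
    """Impurity mass fraction that gives the observed rotation `target`."""
    return (rot_main - target) / (rot_main - rot_impurity)


print(f"5% impurity gives {observed_rotation(0.05):+.1f} deg")
print(f"rotation vanishes at {fraction_for_rotation(0.0):.1%} impurity")
```

With these illustrative values, an impurity level of about 9% is already enough to cancel the rotation of the main component, so the sign of the measured rotation is a fragile indicator of absolute configuration.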

This paper is another nice example demonstrating the power of modern computational approaches to spectra that can be extremely valuable in structure determination. Organic chemists of all stripes should certainly be aware of how this tool can complement experiments.

My thanks to Derek Lowe who posted on this paper in his In The Pipeline blog.


1) Inoue, M.; Carson, M. W.; Frontier, A. J.; Danishefsky, S. J., "Total Synthesis and Determination of the Absolute Configuration of Frondosin B." J. Am. Chem. Soc. 2001, 123, 1878-1889, DOI: 10.1021/ja0021060.
2) Reiter, M.; Torssell, S.; Lee, S.; MacMillan, D. W. C., "The organocatalytic three-step total synthesis of (+)-frondosin B." Chem. Sci. 2010, 1, 37-42, DOI: 10.1039/C0SC00204F.
3) Hughes, C. C.; Trauner, D., "Palladium-catalyzed couplings to nucleophilic heteroarenes: the total synthesis of (−)-frondosin B." Tetrahedron 2004, 60, 9675-9686, DOI: 10.1016/j.tet.2004.07.041.
4) Ovaska, T. V.; Sullivan, J. A.; Ovaska, S. I.; Winegrad, J. B.; Fair, J. D., "Asymmetric Synthesis of Seven-Membered Carbocyclic Rings via a Sequential Oxyanionic 5-Exo-Dig Cyclization/Claisen Rearrangement Process. Total Synthesis of (−)-Frondosin B." Org. Lett. 2009, 11, 2715-2718, DOI: 10.1021/ol900967j.
5) Joyce, L. A.; Nawrat, C. C.; Sherer, E. C.; Biba, M.; Brunskill, A.; Martin, G. E.; Cohen, R. D.; Davies, I. W., "Beyond optical rotation: what’s left is not always right in total synthesis." Chem. Sci. 2018, 9, 415-424, DOI: 10.1039/C7SC04249C.


1: InChI=1S/C20H24O2/c1-12-6-8-16-14(5-4-10-20(16,2)3)18-15-11-13(21)7-9-17(15)22-19(12)18/h7,9,11-12,21H,4-6,8,10H2,1-3H3/t12-/m1/s1
2: InChI=1S/C20H24O2/c1-12-5-4-10-20(3)16(12)8-6-13(2)19-18(20)15-11-14(21)7-9-17(15)22-19/h7,9,11,13,21H,4-6,8,10H2,1-3H3/t13-,20-/m1/s1


Wednesday, March 14, 2018

DeePCG: A Deep Neural Network Molecular Force Field

DeePCG: constructing coarse-grained models via deep neural networks. L Zhang, J Han, H Wang, R Car, Weinan E. arXiv:1802.08549v2 [physics.chem-ph]
Contributed by Jesper Madsen

The idea of “learning” a molecular force field (FF) using neural networks can be traced back to Blank et al. in 1995.[1] Modern variations (reviewed recently by Behler[2]), such as the DeePCG scheme[3] that I highlight here, seem to have two key innovations to set them apart from earlier work: network depth and atomic environment descriptors. The latter was the topic of my recent highlight and Zhang et al.[3] take advantage of similar ideas.

Figure 1: “Schematic plot of the neural network input for the environment of CG particle i, using water as an example. Red and white balls represent the oxygen and the hydrogen atoms of the microscopic system, respectively. Purple balls denote CG particles, which, in our example, are centered at the positions of the oxygens.” from ref. [3]

Zhang et al. simulate liquid water using ab initio molecular dynamics (AIMD) at the DFT/PBE0 level of theory in order to train a coarse-grained (CG) molecular water model. The training is done by a standard protocol used in CGing, where mean forces are fitted by minimizing a loss function (the natural choice is the residual sum of squares) over the sampled configurations. CGing liquid water is difficult because of the necessity of many-body contributions to interactions, especially so upon integrating out degrees of freedom. One would therefore expect a FF capable of capturing such many-body effects to perform well, just as DeePCG does, and I think this is a very nice example of exactly how much can be gained by using faithful representations of atomic neighborhoods instead of radially symmetric pair potentials. Recall that traditional force-matching, while provably exact in the limit of the complete many-body expansion,[4] still shows non-negligible deviations from the target distributions for most simple liquids when standard approximations are used.
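
The force-matching idea mentioned above can be shown with a deliberately tiny example: a residual-sum-of-squares loss and a one-parameter model fitted in closed form. This is only a sketch of the loss; DeePCG fits a deep neural network to many-body environments, whereas the harmonic model F(x) = -k x, the synthetic data, and the function names here are assumptions made for illustration.

```python
def force_matching_loss(model, configs, ref_forces):
    """Residual sum of squares between model forces and reference forces."""
    return sum((model(x) - f) ** 2 for x, f in zip(configs, ref_forces))


def fit_spring_constant(configs, ref_forces):
    """Least-squares k for the one-parameter model F(x) = -k * x."""
    num = sum(x * f for x, f in zip(configs, ref_forces))
    den = sum(x * x for x in configs)
    return -num / den


# Synthetic 1D "sampled configurations" and reference mean forces (made up).
configs = [-1.0, -0.5, 0.5, 1.0]
ref_forces = [2.0, 1.0, -1.0, -2.0]

k = fit_spring_constant(configs, ref_forces)
loss = force_matching_loss(lambda x: -k * x, configs, ref_forces)
print(f"fitted k = {k}, residual sum of squares = {loss}")
```

A pair potential plays the role of the one-parameter model here: when the reference forces contain many-body contributions that the model form cannot represent, the loss stays above zero no matter how the parameters are tuned, which is the limitation the neural network representation is meant to address.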

FF transferability, however, is likely where the current grand challenge is to be found. Zhang et al. remark that it would be convenient to have an accurate yet cheap (e.g., CG) model for describing phase transitions in water. They do not attempt this in the current preprint, but I suspect that it is not *that* easy to make a decent CG model that gets subtle long-range correlations right at various densities, let alone different phases of water and ice, coexistences, interfaces, impurities (non-water moieties), etc. Machine-learnt potentials continuously demonstrate excellent accuracy over the parameterization space of states or configurations, but for transferability and extrapolations, we are still waiting to see how far they can get.


[1] Neural network models of potential energy surfaces. TB Blank, SD Brown, AW Calhoun, DJ Doren. J Chem Phys 103, 4129 (1995)
[2] Perspective: Machine learning potentials for atomistic simulations. J Behler. J Chem Phys 145, 170901 (2016)
[3] DeePCG: constructing coarse-grained models via deep neural networks. L Zhang, J Han, H Wang, R Car, Weinan E. arXiv:1802.08549v2 [physics.chem-ph]
[4] The multiscale coarse-graining method. I. A rigorous bridge between atomistic and coarse-grained models. WG Noid, J-W Chu, GS Ayton, V Krishna, S Izvekov, GA Voth, A Das, HC Andersen. J Chem Phys 128, 244114 (2008)