Jiace Sun and Garnet Kin-Lic Chan (2026)
Highlighted by Jan Jensen

Tensor contraction is the algebraic core of much of quantum chemistry: large multidimensional arrays representing amplitudes and integrals are multiplied and summed over shared indices to produce energies and intermediates. It matters because these contractions set the scaling wall for methods like CCSD(T), whose formal cost rises far faster than that of Hartree–Fock.
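To make "summed over shared indices" concrete, here is a minimal NumPy sketch of a coupled-cluster-style contraction. The tensors and index labels are illustrative stand-ins, not the paper's actual intermediates:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6  # toy dimension; real amplitude/integral tensors are far larger

t2 = rng.random((n, n, n, n))  # stand-in for doubles amplitudes t_{ij}^{cd}
v = rng.random((n, n, n, n))   # stand-in for two-electron integrals v_{ab}^{cd}

# Contract over the shared indices c, d:
#   X_{ij}^{ab} = sum_{cd} v_{abcd} * t_{ijcd}
# Four free indices plus two summed indices -> O(n^6) work for this
# single term, which is how CCSD-like scaling walls arise.
x = np.einsum('abcd,ijcd->ijab', v, t2)
```

The same contraction could be written with nested loops or `np.tensordot`; `einsum` just makes the shared indices explicit.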
This study uses importance sampling to evaluate these tensor contractions. Importance sampling means drawing the most important terms in a sum more often than the unimportant ones, while reweighting so the final estimator stays unbiased. Here, Sun and Chan use it to evaluate high-order tensor contractions stochastically.
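The draw-often-but-reweight idea can be shown on a simple sum. This is a generic importance-sampling sketch, not the paper's STC algorithm: it estimates a dot product by sampling indices in proportion to one factor's magnitude and dividing each drawn term by its sampling probability, which keeps the estimator unbiased:

```python
import numpy as np

rng = np.random.default_rng(0)

# Target: S = sum_i a_i * b_i (a toy stand-in for a tensor contraction).
a = rng.standard_normal(1000)
b = rng.standard_normal(1000)
exact = np.dot(a, b)

# Importance distribution: draw index i with probability p_i ∝ |a_i|,
# so large-magnitude terms are sampled more often.
p = np.abs(a) / np.abs(a).sum()
idx = rng.choice(a.size, size=50_000, p=p)

# Reweight each sampled term by 1/p_i; E[a_i * b_i / p_i] = S exactly.
samples = a[idx] * b[idx] / p[idx]
estimate = samples.mean()
```

More samples tighten the statistical error bar, which is why methods like this trade a deterministic answer for a controllable stochastic one.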
The headline result is that stochastic tensor contraction (STC) dramatically reduces the scaling of CCSD(T): from the usual O(N^6) and O(N^7) to O(N^4). In practice, water-cluster tests show very large FLOP reductions and wall-time crossovers at surprisingly small sizes.
Figure 7 in the paper is the real selling point, because it compares against the incumbent approximate workhorse, DLPNO-CCSD(T), on 20 realistic molecules. STC is faster than DLPNO for every system in the set, with speedups ranging from 2.5× to 32×, while also delivering smaller errors than all DLPNO/Normal results and 15 of 20 DLPNO/Tight results. Just as importantly, the STC errors stay tightly clustered around the chosen target of 0.2 kcal/mol, whereas DLPNO errors vary much more from system to system. That makes STC look not just fast, but controllable.
Table 3 sharpens that message. Averaged over the benchmark set, STC has a mean absolute error of 0.2 kcal/mol at a geometric mean runtime of 10.7 min, compared with 3.00 kcal/mol / 58 min for DLPNO/Normal, 0.70 kcal/mol / 159 min for DLPNO/Tight, and 773 min for exact CCSD(T). So the paper’s central claim is not merely better asymptotic scaling, but a roughly order-of-magnitude win in both time and error relative to state-of-the-art local correlation in this benchmark.
One caveat: while the speed-up is undeniably impressive, another likely limiting factor is memory. The paper notes the use of density fitting “to reduce memory requirements,” but does not really quantify memory use or memory scaling in the same systematic way as FLOPs and wall time. Given that modern CC implementations are often limited as much by storage and movement of intermediates as by raw arithmetic, that omission stands out.
Overall, this is prototype code, but very exciting prototype code. It will be very interesting to see whether this stochastic route can mature into something that genuinely displaces DLPNO-CCSD(T) as the default reduced-cost gold-standard method. Code: GitHub repository

This work is licensed under a Creative Commons Attribution 4.0 International License.