Post #861

@graphml

Graph Machine Learning

Views: 5,450
Published: 7 Sept. 2024, 08:46

GraphML News (September 7th) - AF 3 reproductions, AlphaProteo, ORB, Entalpic round

Just the first week of September, but already so much news in protein design and materials science!

🧬 Two AlphaFold 3 reproductions are now available: HelixFold 3 from Baidu (tech report) and AF3 from Ligo Bioscience (no tech report yet). Training HelixFold 3 on PDB and custom data yields results roughly similar to the OG AlphaFold 3 on PoseBusters and CASP 15 - good news for science and reproducibility (and for Nature editors, hehe). Getting more data will be the key to a full reproduction - probably no other lab has as large and diverse a dataset as DeepMind and Isomorphic Labs.

Meanwhile, Google DeepMind announced AlphaProteo - a generative model for binders conditioned on the target protein and possible binding sites. The preprint has no information about the generative model itself (an educated guess would be either an autoregressive transformer or discrete diffusion as the backbone), but the training dataset is similar to that of the full AlphaFold 3. Experimentally, AlphaProteo generates plausible binders in several use cases, such as an Epstein-Barr virus protein, the COVID-19 spike protein, and proteins involved in cancer.

🔮 In computational materials science, Orbital Materials announced ORB - a family of forcefield models that compute the energy, forces, and stresses of atomistic systems (like bulk materials or semiconductors). ORB, trained on Alexandria and Materials Project trajectories with a denoising objective (improved Noisy Nodes), yields SOTA on MatBench Discovery, outperforming the big boys, MatterSim from MSR and GNoME from DeepMind. The authors highlight that ORB models are non-equivariant GNNs - in fact, the backbone is very similar to the Graph Network Simulator from 2020 with an optional attention interaction.
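For readers unfamiliar with the denoising objective mentioned for ORB, here is a minimal sketch of the general idea: corrupt atom positions with Gaussian noise and ask a model to predict that noise. This is an illustration only, not ORB's actual training code. The function names, the noise scale `sigma`, and the toy coordinates are all made up for the example.

```python
import random

def make_denoising_example(positions, sigma=0.1, seed=None):
    """Corrupt atom positions with Gaussian noise; return (noisy, noise).

    A model trained with a denoising objective takes `noisy` as input and
    is asked to predict `noise`. `sigma` is an illustrative value.
    """
    rng = random.Random(seed)
    noise = [[rng.gauss(0.0, sigma) for _ in atom] for atom in positions]
    noisy = [[x + n for x, n in zip(atom, atom_noise)]
             for atom, atom_noise in zip(positions, noise)]
    return noisy, noise

def mse(pred, target):
    """Mean squared error over all coordinates."""
    diffs = [(p - t) ** 2
             for pa, ta in zip(pred, target)
             for p, t in zip(pa, ta)]
    return sum(diffs) / len(diffs)

# Toy "structure": three atoms in 3D (hypothetical coordinates).
positions = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
noisy, noise = make_denoising_example(positions, sigma=0.1, seed=0)

# Placeholder prediction of zero noise; a trained GNN would go here.
zero_pred = [[0.0] * 3 for _ in positions]
loss = mse(zero_pred, noise)
```

In a real setup the zero prediction would be replaced by the output of a network that sees the noisy structure, and the loss would be minimized by gradient descent.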
It will be fun to watch the equivariant and non-equivariant folks beating each other’s SOTA over the next few months 🍿

💸 Entalpic, a French materials discovery startup whose founders graduated from Mila, announced an €8.5m seed round co-led by Breega, Cathay Innovation and Felicis - congrats to Mathieu, Victor, and Alexandre! Entalpic joins CuspAI and Orbital Materials in the emerging market of DL-based materials discovery companies - we’ll be keeping an eye on their advances.

Weekend reading:

Two papers from Shuiwang Ji’s lab on SE(3)-invariant 1D tokenization of 3D molecules for autoregressive generation:
Geometry Informed Tokenization of Molecules for Language Model Generation - for small molecules on QM9 and Geom-Drugs.
Fragment and Geometry Aware Tokenization of Molecules for Structure-Based Drug Design Using Language Models - for generating ligands for protein pockets.

Talking about autoregressive molecule generation, Any-Property-Conditional Molecule Generation with Self-Criticism using Spanning Trees is another strong baseline improving spanning tree-based graph generation.
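To make the "SE(3)-invariant" part of the weekend reading concrete, here is a small sketch of one common way to get rotation- and translation-invariant descriptors: internal coordinates (bond lengths, angles, torsions) along a chain of atoms. It is not necessarily the scheme used in the papers above. The helper names and the toy coordinates are assumptions for illustration.

```python
import math

def sub(a, b): return [x - y for x, y in zip(a, b)]
def dot(a, b): return sum(x * y for x, y in zip(a, b))
def norm(a): return math.sqrt(dot(a, a))
def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def distance(p, q):
    return norm(sub(p, q))

def angle(p, q, r):
    """Angle at q formed by p-q-r, in radians."""
    u, v = sub(p, q), sub(r, q)
    c = dot(u, v) / (norm(u) * norm(v))
    return math.acos(max(-1.0, min(1.0, c)))

def dihedral(p0, p1, p2, p3):
    """Signed torsion angle around the p1-p2 axis, in radians."""
    b0, b1, b2 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1, n2 = cross(b0, b1), cross(b1, b2)
    m1 = cross(n1, [x / norm(b1) for x in b1])
    return math.atan2(dot(m1, n2), dot(n1, n2))

def internal_coordinates(coords):
    """Flat list of distances, angles, and torsions along the atom chain."""
    feats = []
    for i in range(1, len(coords)):
        feats.append(distance(coords[i - 1], coords[i]))
        if i >= 2:
            feats.append(angle(coords[i - 2], coords[i - 1], coords[i]))
        if i >= 3:
            feats.append(dihedral(*coords[i - 3:i + 1]))
    return feats

def rotate_z_and_translate(p, theta, t):
    """Apply a rigid motion: rotation about the z-axis, then translation."""
    c, s = math.cos(theta), math.sin(theta)
    return [c * p[0] - s * p[1] + t[0], s * p[0] + c * p[1] + t[1], p[2] + t[2]]

# Hypothetical 4-atom chain, and the same chain after a rigid motion.
mol = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [2.0, 1.4, 0.0], [3.2, 1.6, 1.1]]
moved = [rotate_z_and_translate(p, 0.7, [5.0, -2.0, 3.0]) for p in mol]

original_feats = internal_coordinates(mol)
moved_feats = internal_coordinates(moved)
```

Up to floating-point error the two feature lists agree, so a 1D sequence built by discretizing such features does not depend on how the molecule is placed in space.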