Post #840

@graphml

Graph Machine Learning

Views: 7,070

Published: May 11, 2024, 15:55

GraphML News (May 11th) - AlphaFold 3

🧬 Google DeepMind and Isomorphic Labs announced AlphaFold 3, which goes beyond proteins and extends structure prediction to RNA, DNA, and small molecules (ligands). AF3 employs Pairformer (an improved Evoformer) as an encoder and a diffusion model for generating 3D coordinates. Yes, AF3 demonstrates huge gains in structural biology tasks compared to previous models, but perhaps the hottest take from the Nature preprint is:

> Similarly to some recent work, we find that no invariance or equivariance with respect to global rotations and translation of the molecule are required in the architecture and so we omit them to simplify the machine learning architecture.

🔥 For reference, AF2 used SE(3)-equivariant attention, which spun off a great deal of research in equivariance and geometry for structural biology. The new statement took researchers at ICLR by storm: do we need to invest time and effort into complex math and group theory if a vanilla non-equivariant transformer and diffusion model trained on 48 random augmentations can beat other geometric models with baked-in equivariances? AF3 used rather modest compute (compared to LLMs): 256 A100s for 10 days of pretraining and 10 days of finetuning (roughly $420K overall on Azure). That seems to be enough to send a wake-up call to the Geometric DL community.

🤔 Does the bitter lesson strike again? Is it easier to learn symmetries from data and augmentations (classical 2016 paper by Taco Cohen and Max Welling) than to enforce those constraints in the model? Maybe it's the task (DNA and RNA structure prediction) that does not have explicit symmetries to bake into a model? It is quite likely that equivariant models can achieve a similar result, but with higher compute and inference costs. Is it still worth it?
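To make "learning symmetries from augmentations" concrete, here is a minimal pure-Python sketch of random rigid-body augmentation of 3D coordinates. It is not AF3's actual pipeline: the function names, the centering step, and the `max_shift` parameter are assumptions for illustration, and only the count of 48 copies comes from the post above.

```python
import math
import random


def random_rotation(rng):
    """Sample a 3x3 rotation matrix from a random unit quaternion.

    Normalizing four independent Gaussians gives a uniformly random
    unit quaternion, and hence a uniformly random rotation.
    """
    w, x, y, z = (rng.gauss(0.0, 1.0) for _ in range(4))
    norm = math.sqrt(w * w + x * x + y * y + z * z)
    w, x, y, z = w / norm, x / norm, y / norm, z / norm
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


def augment(coords, rng, max_shift=1.0):
    """Center the points, then apply one random rotation and translation.

    max_shift is a hypothetical knob for the translation range.
    """
    n = len(coords)
    center = [sum(p[i] for p in coords) / n for i in range(3)]
    rot = random_rotation(rng)
    shift = [rng.uniform(-max_shift, max_shift) for _ in range(3)]
    out = []
    for p in coords:
        q = [p[i] - center[i] for i in range(3)]
        out.append(tuple(
            sum(rot[r][c] * q[c] for c in range(3)) + shift[r]
            for r in range(3)
        ))
    return out


def augmented_batch(coords, num_copies=48, seed=0):
    """Return num_copies randomly posed copies of the same structure."""
    rng = random.Random(seed)
    return [augment(coords, rng) for _ in range(num_copies)]


# Toy "molecule" with four atoms; every copy keeps the same internal geometry.
coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0)]
batch = augmented_batch(coords)
```

A non-equivariant network trained on such copies sees the same structure in many poses and has to learn that pairwise distances do not change, whereas an equivariant architecture gets that guarantee by construction.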
The inference argument looks quite plausible: foundation models (be it LLMs or AF) run billions of inference passes, so if you can save 2x inference time by skipping expensive math and just pre-training longer, the total serving costs go down as well. Those will be the main questions for the community on social media and at conferences in 2024.

Besides that, researchers can use the AlphaFold Server for custom inference jobs. We welcome comp bio folks into the world of paid API access and proprietary models (thanks, OpenAI and Anthropic) 😉 Still, given the pace of the open-source community (at least two ongoing re-implementations 1, 2), the relatively simple model, and the modest training compute, it might take <6 months to replicate a model similar to AF3 in performance.
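The compute and serving-cost argument can be checked with back-of-the-envelope arithmetic. The sketch below uses only the figures quoted in the post (256 A100s, 10 + 10 days, roughly $420K) to derive GPU-hours and the implied hourly rate. The break-even helper and its per-pass costs are hypothetical placeholders, not measurements of AF3 or of any equivariant model.

```python
def gpu_hours(num_gpus, days):
    """Total GPU-hours for num_gpus running continuously for the given days."""
    return num_gpus * days * 24


# Figures quoted in the post.
TRAIN_GPUS = 256
PRETRAIN_DAYS = 10
FINETUNE_DAYS = 10
QUOTED_COST_USD = 420_000

total_hours = gpu_hours(TRAIN_GPUS, PRETRAIN_DAYS + FINETUNE_DAYS)
implied_rate = QUOTED_COST_USD / total_hours  # USD per GPU-hour implied by the quote


def break_even_passes(extra_train_hours, hours_per_pass_slow, hours_per_pass_fast):
    """Inference passes after which extra training on the faster model pays off.

    All three arguments are in GPU-hours and are hypothetical inputs.
    """
    return extra_train_hours / (hours_per_pass_slow - hours_per_pass_fast)


print(f"training compute: {total_hours} GPU-hours")
print(f"implied rate: ${implied_rate:.2f} per GPU-hour")
```

For example, under the made-up assumption that a slower model needs 0.5 GPU-hours per pass and a faster one 0.25, `break_even_passes(1000.0, 0.5, 0.25)` says that 1,000 extra GPU-hours of training are paid back after 4,000 passes. At billions of passes, even a large increase in pre-training would be amortized quickly if the 2x inference saving holds.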