GraphML News (September 28th) - AlphaChip, Generate + Novartis deal, MolPhenix

NeurIPS results for both tracks have arrived - congrats to those who made it. The datasets track was particularly brutal this year, with a hard cutoff for submissions averaging below 6.3. Good luck with the final ICLR push and see you in Vancouver!

💻 Google DeepMind presented AlphaChip - an update to the famous 2021 Nature paper that introduced an RL agent using edge-level GNNs for chip placement, that is, placing dozens of smaller blocks (each often implementing a certain logic function) on a canvas to optimize common design metrics like HPWL (half-perimeter wirelength) or PPA (power, performance, area). The addendum highlights that pre-training with large compute is crucial and reports that AlphaChip has been successfully used for several generations of TPUs (25 RL-designed blocks in the latest TPU) as well as for external customers like MediaTek.

The paper earned a controversial reputation in the chip design community, and some professors even argued for retracting it from Nature for lack of clarity and reproducibility. Over time, however, it looks more like a skill issue on the part of those who tried to replicate it - generally, the level of ML expertise in the chip design community is pretty low (some accepted papers at top venues like DAC are just 🫣) and most university teams are stuck somewhere between MLPs and convnets. Professors gonna hate, Google gonna continue making impactful real-world products, and we will have new pre-trained checkpoints of AlphaChip with some Colab tutorials 🍿.

💸 Generate:Biomedicines (the authors of Chroma, a generative model for protein design) announced a collaboration with Novartis resulting in $65M in upfront payments and $1B in biobucks (royalties and other performance-based milestones, typically spread across many years).

🐦 Valence Labs announced MolPhenix, a CLIP-like model for studying phenomics (how cells respond to perturbations).
Practically, it is trained on pairs of microscopy images and molecules, using a ViT as the image encoder and MolGPS for molecules. Experiments report a massive 10x improvement in Top-1% recall of active molecules over the previous SOTA 👏.

Weekend reading:

TabGraphs: A Benchmark and Strong Baselines for Learning on Graphs with Tabular Node Features by Gleb Bazhenov et al. - a fresh collection of graph datasets where node features are interpretable (numerical, categorical), in stark contrast to boring text-attributed graphs or Planetoid datasets with bag-of-words features.

Design of Ligand-Binding Proteins with Atomic Flow Matching by Junqi Liu et al. feat. Jian Tang - generates a docked protein-ligand 3D structure conditioned only on the 2D ligand graph and protein sequence, using flow matching. Outperforms RFDiffusionAA on several metrics.
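
P.S. For readers outside chip design, here is a rough idea of what the HPWL metric from the AlphaChip item measures: half-perimeter wirelength approximates routing cost by summing, over all nets, the width plus height of the bounding box around the blocks each net connects. The sketch below is only an illustration of that definition. The function name, block names, coordinates, and nets are all made up, and it has nothing to do with AlphaChip's actual code or reward.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def hpwl(placement: Dict[str, Point], nets: List[List[str]]) -> float:
    """Half-perimeter wirelength of a placement.

    placement maps a block name to its (x, y) position on the canvas.
    nets is a list of nets, each a list of block names it connects.
    """
    total = 0.0
    for net in nets:
        xs = [placement[block][0] for block in net]
        ys = [placement[block][1] for block in net]
        # Bounding-box width + height for this net.
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total


# Hypothetical toy example: three blocks, three nets.
placement = {"a": (0.0, 0.0), "b": (3.0, 4.0), "c": (1.0, 1.0)}
nets = [["a", "b"], ["a", "b", "c"], ["b", "c"]]
print(hpwl(placement, nets))  # 7 + 7 + 5 = 19.0
```

A placement agent would try to move blocks so that this number (among other metrics) goes down; pulling "b" closer to "a" and "c" in the toy example shrinks every bounding box.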