GraphML News (April 6th) - Leash Bio Round, The BELKA Kaggle Competition, Sparse Ops speedups

💸 Leash Biosciences (founded by ex-Recursion folks) announced a $9.3M seed round led by Springtide Ventures. Leash focuses on building huge proprietary datasets for protein-molecule interactions.

🐿️ At the same time, Leash launched a new Kaggle competition on predicting the binding affinity of small molecules to proteins using the Big Encoded Library for Chemical Assessment (BELKA). The dataset contains about 133M small molecules screened against 3 proteins (sEH, BRD4, and HSA). Protein-ligand binding diffusion models like DiffDock are allowed as well. Who will win: comp bio folks with domain expertise or Kaggle grandmasters with expertise in finding data leaks? 🤔 We'll see in 3 months.

📈 Zhongming Yu and the team from UCSD, Stanford, and Intel released GeoT - a tensor-centric library for GNNs via efficient segment reduction on GPU. The library ships efficient CUDA kernels for sparse operations like scatter summation, as well as fused message-aggregation kernels. On average, GeoT brings 1.7-3.5x speedups over PyG sparse ops, and 2.3-3.6x over PyG dense ops. Looking forward to seeing the kernels in major libraries and, hopefully, a Triton version.

🧜‍♂️ Recently, I have spent quite a lot of time writing Triton kernels that fuse the message and aggregation steps of several GNN architectures into one kernel call, and I can highly recommend trying to speed up your models with them. Triton kernels are written in Python (sparing you from shooting yourself in the foot with C++), are compiled automatically into efficient code for several platforms (CUDA, ROCm, and even Intel GPUs), and are often faster than CUDA kernels.
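For readers less familiar with the sparse ops mentioned above, here is a minimal CPU sketch in NumPy of what "scatter summation" and "segment reduction" compute during GNN message aggregation. It is only an illustration of the two operations, not GeoT's, PyG's, or Triton's actual API: the function names `scatter_sum` and `segment_sum`, the toy graph, and the assumption that edges are pre-sorted by destination node are all made up for this example.

```python
import numpy as np

def scatter_sum(messages, index, num_nodes):
    """Add each message row into the output row selected by `index`."""
    out = np.zeros((num_nodes, messages.shape[1]), dtype=messages.dtype)
    # np.add.at is an unbuffered add, so repeated indices accumulate correctly.
    np.add.at(out, index, messages)
    return out

def segment_sum(messages, index, num_nodes):
    """Same result, assuming `index` is sorted so each node's messages are contiguous."""
    out = np.zeros((num_nodes, messages.shape[1]), dtype=messages.dtype)
    if len(index) == 0:
        return out
    # Start offset of every run of equal destination indices.
    starts = np.flatnonzero(np.r_[True, index[1:] != index[:-1]])
    # Reduce each contiguous segment in one pass and write it to its node row.
    out[index[starts]] = np.add.reduceat(messages, starts, axis=0)
    return out

# Toy graph (hypothetical): 4 nodes, 5 edges, edges sorted by destination.
x = np.arange(8, dtype=np.float32).reshape(4, 2)   # node features
src = np.array([0, 2, 3, 0, 1])                    # message senders
dst = np.array([1, 1, 1, 2, 3])                    # message receivers (sorted)

messages = x[src]                                  # "message" step: gather
agg_scatter = scatter_sum(messages, dst, num_nodes=4)
agg_segment = segment_sum(messages, dst, num_nodes=4)
```

Both functions produce the same aggregated node features. The segment version relies on sorted indices to reduce contiguous runs, which is roughly the access pattern that segment-reduction kernels exploit on GPU. A fused kernel would additionally avoid materializing `messages` in memory between the gather and the reduction.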
Weekend reading (UC San Diego was on fire this week):

GeoT: Tensor Centric Library for Graph Neural Network via Efficient Segment Reduction on GPU

On the Theoretical Expressive Power and the Design Space of Higher-Order Graph Transformers by Cai Zhou, Rose Yu, Yusu Wang - the authors prove that k-order graph transformers are not more expressive than k-WL unless positional encodings are supplied. The results nicely extend our work Attending to Graph Transformers (recently accepted to TMLR).

DE-HNN: An effective neural model for Circuit Netlist representation by Zhishang Luo feat. Yusu Wang - properly representing analog and digital circuits is a big pain in the chip design community. This work demonstrates the benefits of using directed hypergraphs for netlists and proposes a new large dataset for experiments.