Graph Machine Learning

@graphml

Technologies

Everything about graph theory, computer science, machine learning, etc. If you have something worth sharing with the community, reach out @gimmeblues, @chaitjo. Admins: Sergey Ivanov; Michael Galkin; Chaitanya K. Joshi

Subscribers: 6,750
Indexed posts: 877
Recent reach: 40,570 (sum of recent views)

Recent posts

Page 15 of 74 · 877 posts

Published Nov 28

Denoising Diffusion Is All You Need in Graph ML? - Now on Medium

We just published the extended version of the posts on diffusion models on Medium, with a more spelled-out intro and newly generated images by Stable Diffusion! A good way to spend the time if you are on the way to New Orleans and NeurIPS.

3,540 views

Published Nov 25

Friday News: LoG Accepted Papers, NeurIPS

The inaugural Learning on Graphs (LoG) conference announced accepted papers, extended abstracts, and spotlights - the acceptance rate this year is pretty tough (<25%), but we have heard multiple times that the quality of reviews is on average higher than in other big conferences. Is it the impact of the $$$ rewards for the best reviewers?

Tech companies summarize their presence at NeurIPS’22, which starts next week: have a look at works from DeepMind, Amazon, Microsoft, and the GraphML team from Google.

A new blog post by Petar Veličković and Fabian Fuchs covers the universality of neural networks on sets and graphs - the authors identify a direct link between permutation-invariant DeepSets and permutation-invariant aggregations in GNNs like GIN. However, when it comes to multisets (such as nodes sending exactly the same message), PNA might be more expressive. This ties in with the theoretical finding that, given a set of n elements, the width of the encoder should be at least n - recall that PNA postulates that it is necessary to have n aggregators. Nice read with references!
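The multiset remark above can be made concrete with a toy sketch. It is not PNA or DeepSets themselves, only scalar messages and three standard permutation-invariant aggregators; the message values and the helper name `signature` are made up for illustration. It shows that any single aggregator confuses some pair of multisets, while combining several tells them apart.

```python
from statistics import mean

# Each aggregator is permutation-invariant: it ignores the order of messages.
AGGREGATORS = {"sum": sum, "mean": mean, "max": max}

def signature(messages, names):
    """Aggregate a multiset of scalar messages with the chosen aggregators."""
    return tuple(AGGREGATORS[name](messages) for name in names)

# Three different multisets of scalar messages (hypothetical toy values).
a = [2.0, 2.0]
b = [2.0, 2.0, 2.0]   # the same message sent by one more neighbor
c = [1.0, 3.0]

# Reordering the messages never changes the result.
assert signature([1.0, 3.0], ["sum"]) == signature([3.0, 1.0], ["sum"])

# mean and max cannot tell a from b, but sum can.
mean_max_collide = signature(a, ["mean", "max"]) == signature(b, ["mean", "max"])
sum_separates_ab = signature(a, ["sum"]) != signature(b, ["sum"])

# sum and mean cannot tell a from c, but max can.
sum_mean_collide = signature(a, ["sum", "mean"]) == signature(c, ["sum", "mean"])

# Using all three aggregators together separates all three multisets.
all_three_separate = len({signature(m, ["sum", "mean", "max"]) for m in (a, b, c)}) == 3

print(mean_max_collide, sum_separates_ab, sum_mean_collide, all_three_separate)
```

All four flags come out `True` for these values. Real GNN layers aggregate learned vector messages, so this is only an intuition pump for why more aggregators can help on multisets.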

3,440 views

Published Nov 21

OGB Large Scale Challenge 2022 Winners

The OGB team has just announced the winners of the annual Large Scale Challenge - the Graphcore team celebrates a double win: top-1 entries in graph regression and link prediction! Notably, most of the winning top-3 models in all tracks are ensembles. The final results:

Track 1: Node Classification on MAG240M
1) Baidu - RUniMP and positional encodings
2) Michigan State Uni / TigerGraph - APPNP + relational GAT, 10-model ensemble
3) Beijing Institute of Technology / Zhipu AI / Tsinghua University - MDGNN, 15-model ensemble

Track 2: Link prediction and KG completion on WikiKG90M v2
1) Graphcore - ensemble of 85 shallow embedding models
2) Microsoft Research Asia - ensemble of 13 shallow models and 10 manual features
3) Tencent - ensemble of 6 models

Track 3: Graph Regression on PCQM4M v2
1) Graphcore / Mila / Valence Discovery - GPS++ (the improved version of GraphGPS covered in this channel), 112-model ensemble
2) Microsoft Research AI4Science - Transformer-M + ViSNet, 22-model ensemble
2) NVIDIA / UCLA - ensemble of Transformer-M, GNN, and ResNet

MSR and NVIDIA share the joint 2nd place due to reporting exactly the same MAE on the test set.

3,680 views

Published Nov 20

Denoising Diffusion Is Still All You Need (Part 2)

4️⃣ DiffLinker from Igashov, Stärk, and EPFL / MSR / Oxford co-authors is the diffusion model for generating molecular linkers conditioned on 3D fragments. While previous models are autoregressive (hence not permutation equivariant) and can only link 2 fragments, DiffLinker generates the whole structure and can link 2+ fragments. In DiffLinker, each point cloud is conditioned on the context (all other known fragments and/or the protein pocket); the context is usually fixed. The diffusion framework is similar to EDM but is now conditioned on the 3D data rather than on scalars. The denoising model is the same equivariant EGNN. Interestingly, DiffLinker has an additional module to predict the linker size (number of atoms), so you don’t have to specify it beforehand. The code is available, too!

Even more: SMCDiff for generating protein scaffolds conditioned on a desired motif (also with EGNN). Generally, in graph and molecule generation we’d like to support some discreteness, so any improvements to discrete diffusion are very welcome, e.g., Richemond, Dieleman, and Doucet propose a new simplex diffusion for categorical data with the Cox-Ingersoll-Ross SDE (rare find!). Discrete diffusion is also studied for text generation in the recent DiffusER.

We’ll spare your browser tabs for now 😅 but do expect more diffusion models in Geometric DL!

3,130 views

Published Nov 20

Denoising Diffusion Is Still All You Need (Weekend Reading)

In the previous post in June we covered the emergence of denoising diffusion models (DDPMs) in generative Geometric DL tasks. After 5 months, we can acknowledge that diffusion models have become first-class citizens in Geometric DL, with numerous works appearing on arXiv every week. Here are 4 recent and very interesting works you might want to check:

1️⃣ DiGress by Clément Vignac, Igor Krawczuk, and the EPFL team is the unconditional graph generation model (although with the possibility to incorporate a score-based function for conditioning on graph-level features like energy MAE). DiGress is a discrete diffusion model, that is, it operates on discrete node types (like atom types C, N, O) and edge types (like single / double / triple bond), where adding noise corresponds to multiplication with a transition matrix (from one type to another) mined as marginal probabilities from the training set. The denoising neural net is a modified Graph Transformer. It works for many graph families - planar, SBMs, and molecules; code is available, and check the video from the reading group presentation!

2️⃣ DiffDock by Gabriele Corso, Hannes Stärk, Bowen Jing, and the MIT team is the score-based generative model for molecular docking, e.g., given a ligand and a protein, predicting how the ligand binds to the target protein. DiffDock runs the diffusion process over translations T(3), rotations SO(3), and torsion angles SO(2)^m in the product space: (1) positioning of the ligand wrt the protein (the binding site is often called the binding pocket; the pocket is unknown in advance, so it is blind docking), (2) defining the rotational orientation of the ligand, and (3) defining the torsion angles of the conformation. DiffDock trains 2 models: the score model for predicting actual coordinates and the confidence model for estimating the likelihood of the generated prediction. Both models are SE(3)-equivariant networks over point clouds, but the heavier score model works on protein residues represented by alpha-carbons (initialized from the now-famous ESM2 protein LM), while the confidence model uses fine-grained atom representations. Initial ligand structures are generated by RDKit. DiffDock dramatically improves the prediction quality, code is available, and you can even upload your own proteins (PDB) and ligands (SMILES) in the online demo on HuggingFace Spaces to test it out!

3️⃣ DiffSBDD by Schneuing, Du, and the team from EPFL, Cornell, Cambridge, MSR, USTC, Oxford is the diffusion model for generating novel ligands conditioned on the protein pocket. DiffSBDD can be implemented with 2 approaches: (1) pocket-conditioned ligand generation when the pocket is fixed; (2) inpainting-like generation that approximates the joint distribution of pocket-ligand pairs. In both approaches, DiffSBDD relies on the tuned equivariant diffusion model (EDM, ICML 2022) and the equivariant EGNN as the denoising model. Practically, ligands and proteins are represented as point clouds with categorical features and 3D coordinates (proteins can be alpha-carbon residues or full atoms, with one-hot encoding of residues - ESM2 could be used here in future), so diffusion is performed over the 3D coordinates, ensuring equivariance. The code is already available!
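The DiGress noising step described under 1️⃣ (multiplying a one-hot node type by a transition matrix built from training-set marginals) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: the form `Q = alpha * I + (1 - alpha) * 1 m^T` is one common way to write a marginal-preserving transition, and the toy training set, the `alpha` value, and all function names here are hypothetical.

```python
import random
from collections import Counter

def marginals(node_types, classes):
    """Empirical frequency of each class in the training set."""
    counts = Counter(node_types)
    return [counts[c] / len(node_types) for c in classes]

def transition_matrix(m, alpha):
    """Q[i][j]: probability that type i becomes type j after one noising step.

    With probability alpha the type is kept; otherwise it is resampled
    from the marginal distribution m (assumed parameterisation).
    """
    k = len(m)
    return [[alpha * (i == j) + (1 - alpha) * m[j] for j in range(k)]
            for i in range(k)]

def add_noise(node_types, classes, Q, rng):
    """Sample a noised type for every node from its row of Q."""
    index = {c: i for i, c in enumerate(classes)}
    return [rng.choices(classes, weights=Q[index[c]])[0] for c in node_types]

classes = ["C", "N", "O"]
train = list("CCCCCCNNOO")            # toy training set: 6 C, 2 N, 2 O
m = marginals(train, classes)         # [0.6, 0.2, 0.2]
Q = transition_matrix(m, alpha=0.8)
noisy = add_noise(list("CCNO"), classes, Q, random.Random(0))
print(Q[0], noisy)
```

Each row of `Q` sums to 1, and as `alpha` goes to 0 every row collapses to the marginals, so fully noised graphs have node types drawn from the training-set frequencies. Edge types would be handled the same way with their own matrix.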

2,790 views

Published Nov 13

Max Welling, Regina Barzilay, Michael Bronstein @ AI Helps Ukraine Charity Conference

Mila hosts a fundraising conference, AI Helps Ukraine: Charity Conference. The goal of the conference is to raise funds to support Ukraine with medicines and humanitarian aid. It will consist of a series of online talks during the month of November and an in-person event on the 8th of December at Mila under the theme AI for Good. The online part will feature the talks of renowned AI researchers including Yoshua Bengio, Max Welling, Alexei Efros, Regina Barzilay, Timnit Gebru, and Michael Bronstein. A stellar lineup for Graph ML research! We encourage you to support this wonderful initiative.

- Tomorrow (Nov 14th), 4pm UTC - Max Welling will be giving a talk on Generating and Steering Molecules with ML and RL.
- Nov 21st - Regina Barzilay will talk about Expanding the reach of molecular models in the drug discovery space.
- Finally, Michael Bronstein (as recurring Graph Santa) will give a talk after LOG, closer to Christmas.

Recordings of previous talks by Timnit Gebru, Yoshua Bengio, and Irina Rish are already available; the full schedule with all speakers and the in-person event is kept up to date on the website.

3,390 views

Published Nov 11

GraphML News (11.11.22)

This week was dominated by molecular ML and drug discovery events:
- Broad Institute published a YouTube playlist of talks by leading drug discovery researchers from the recent Machine Learning and Drug Discovery Symposium.
- ELLIS Machine Learning for Drug Discovery Workshop will take place online on Zoom and GatherTown on Nov 28th, registration is free!
- Valence Discovery launched a blog platform for Drug Discovery related posts; the inaugural post by Clemens Isert talks about Quantum ML for drug-like molecules.

And a new work from the GemNet authors: How robust are modern graph neural network potentials in long and hot molecular dynamics simulations? The work supports a recent line of works (see the Halloween post) on ML force fields, particularly focusing on hot dynamics, where low test MAE does not necessarily correspond to good simulations. The most important robustness factor seems to be more training data!

3,640 views

Published Nov 5

Weekend Reading

For those who are not busy with ICLR rebuttals - you can now have a look at all accepted NeurIPS’22 papers on OpenReview (we will have a review of graph papers at NeurIPS a bit later). Meanwhile, the week brought several cool new works:

Are Defenses for Graph Neural Networks Robust? by Felix Mujkanovic, Simon Geisler, Stephan Günnemann, Aleksandar Bojchevski. Probably THE most comprehensive work of 2022 on adversarial robustness of GNNs.

TuneUp: A Training Strategy for Improving Generalization of Graph Neural Networks by Weihua Hu, Kaidi Cao, Kexin Huang, Edward W Huang, Karthik Subbian, Jure Leskovec. The paper introduces a new self-supervised strategy by asking the model to generalize better on tail nodes of the graph after some synthetic edge dropout. Works in node classification, link prediction, and recsys.

Neural Set Function Extensions: Learning with Discrete Functions in High Dimensions by Nikolaos Karalias, Joshua David Robinson, Andreas Loukas, Stefanie Jegelka. Insightful theoretical work on set functions and discrete learning. Particularly good results on combinatorial optimization problems like max clique and max independent set.

3,880 views

Published Nov 1

ESM Metagenomic Atlas

Meta AI just published the ESM Metagenomic Atlas - a collection of >600M metagenomic protein structures built with ESMFold, the most recent model from Meta for protein folding. We covered ESMFold a few months ago, and both ESM-2 and ESMFold are available in the recent 🤗Transformers 4.24 release (checkpoints for 8M - 3B models for ESM2, a full checkpoint for ESMFold). That’s a nice flex from Meta AI after DeepMind released 200M AlphaFold predictions in the AlphaFold Protein Structure Database; the community definitely benefits from the competition.

3,930 views

Published Oct 31

Halloween Paper Reading🎃

We hope you managed to procure enough candies and carve spooky faces on a bunch of pumpkins these days, so now you can relax and read a few papers (not that spooky). Molecular dynamics is one of the booming Geometric DL areas where equivariant models show the best qualities. Two cool recent papers on that topic:

⚛️ Forces are not Enough: Benchmark and Critical Evaluation for Machine Learning Force Fields with Molecular Simulations by Fu et al. introduces a new benchmark for molecular dynamics - in addition to MD17, the authors add datasets on modeling liquids (Water), peptides (Alanine dipeptide), and solid-state materials (LiPS). More importantly, apart from Energy as the main metric, the authors consider a wide range of physical properties like Stability, Diffusivity, and Radial Distribution Functions. Most SOTA molecular dynamics models were probed, including SchNet, ForceNet, DimeNet, GemNet (-T and -dT), and NequIP.

Density Functional Theory (DFT) calculations are one of the main workhorses of molecular dynamics (and account for a great deal of computing time in big clusters). DFT scales as O(n^3) in the input size, though, so can ML help here? Learned Force Fields Are Ready For Ground State Catalyst Discovery by Schaarschmidt et al. presents an experimental study of models of learned potentials - turns out GNNs can do a very good job in O(n) time! Easy Potentials (trained on Open Catalyst data) turns out to be quite a good predictor, especially when paired with a subsequent postprocessing step. Model-wise, it is an MPNN with the NoisyNodes self-supervised objective that we covered a few weeks ago.

🪐 For astrophysics aficionados: Mangrove: Learning Galaxy Properties from Merger Trees by Jespersen et al. applies GraphSAGE to merger trees of dark matter to predict a variety of galactic properties like stellar mass, cold gas mass, star formation rate, and even black hole mass. The paper is heavy on the terminology of astrophysics but pretty easy in terms of GNN parameterization and training. Mangrove works 4-9 orders of magnitude faster than standard models (that is, 10 000 - 1 000 000 000 times faster). Experimental charts are pieces of art that you can hang on a wall.

🤖 Compositional Semantic Parsing with Large Language Models by Drozdov, Schärli et al. pretty much solves the compositional semantic parsing task (natural language query → structured query like SPARQL) using only the code-davinci-002 language model from OpenAI (which is InstructGPT fine-tuned on code). No need for hefty tailored semantic parsing models - turns out a smart extension of Chain-of-thought prompting (aka "let's think step by step") devised as Least-to-Most prompting (where we first answer easy subproblems before generating a full query) yields a whopping 95% accuracy even on the hardest Compositional Freebase Questions (CFQ) dataset. CFQ was introduced at ICLR 2020, and after just two years LMs cracked this task - looks like it's time for a new, even more complex dataset.
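To put the two scaling claims in this post side by side, here is a back-of-the-envelope sketch. It ignores all constant factors, so it says nothing about actual DFT or GNN runtimes; the function names and the sample sizes are made up for illustration. It only shows how the gap between an O(n^3) and an O(n) method grows with input size, and how "4-9 orders of magnitude" maps to the speedup factors quoted for Mangrove.

```python
import math

def cost_ratio(n, slow_power=3, fast_power=1):
    """Ratio of an O(n^slow_power) cost to an O(n^fast_power) cost, constants ignored."""
    return n ** (slow_power - fast_power)

def orders_of_magnitude(factor):
    """Base-10 logarithm of a speedup factor."""
    return math.log10(factor)

# The cubic-vs-linear gap grows quadratically with the input size n.
for n in (10, 100, 1000):
    print(n, cost_ratio(n))

# The speedup factors quoted in the post, expressed as orders of magnitude.
print(orders_of_magnitude(10_000), orders_of_magnitude(1_000_000_000))
```

For example, going from n = 100 to n = 1000 raises the idealized ratio from 10 000 to 1 000 000, which is why linear-time learned potentials become more attractive as systems grow.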

3,310 views

Published Oct 28

GraphML News

It’s Friday - time to look back at what happened in the field this week.

📚 Blogs & Books

(Editors’ Choice 👍) An Introduction to Poisson Flow Generative Models by Ryan O’Connor. Diffusion models are the hottest topic in Geometric Deep Learning but have an important drawback - the sampling is slow 🐌 due to the necessity of performing 100-1000 forward passes. Poisson Flow generative models take inspiration from physics and offer another look at the generation process that allows much faster sampling. This blog gives a very detailed and pictorial explanation of Poisson Flows.

Awesome GFlowNets by Narsil-Dinghuai Zhang. Generative Flow Networks (GFlowNets) bring together generative modeling with ideas from reinforcement learning and show especially promising results in drug discovery. This Awesome repo will get you acquainted with the main ideas, most important papers, and some implementations.

Sheaf Theory through Examples - a book by Daniel Rosiak on sheaf theory. If you felt you wanted to know more after reading the Sheaf Diffusion paper, this would be your next step.

🗞️ News & Press

Elon Musk finally acquired Twitter, so it’s time to move to Telegram.

Mila and Helmholtz Institute announced a new German-Canadian partnership on developing causal models of the cell. As Geometric DL is at the heart of modern structural biology, we’ll keep an eye on the future outcomes.

🛠️ Code & Data

We somehow missed that but are catching up now - the DGL team at Amazon published the materials of the KDD’2022 tutorial on GNNs in Life Sciences.

Geometric Kernels - a fresh new framework for kernels and Gaussian processes on non-Euclidean spaces (including graphs, meshes, and Riemannian manifolds). Supports PyTorch, TensorFlow, JAX, and NumPy.

A post with the hot new papers for your weekend reading will be arriving shortly!

2,960 views

Published Oct 26

Webinar on Fraud Detection with GNNs

Graph neural networks (GNNs) are increasingly being used to identify suspicious behavior. GNNs can combine graph structures, such as email accounts, addresses, phone numbers, and purchasing behavior, to find meaningful patterns and enhance fraud detection. Join the webinar by Nikita Iserson, Senior ML/AI Architect at TigerGraph, to learn how graphs are used to uncover fraud on Thursday, Oct 27th, 6pm CET.

Agenda:
• Introduction to TigerGraph
• Fraud Detection Challenges
• Graph Model, Data Exploration, and Investigation
• Visual Rules, Red Flags, and Feature Generation
• TigerGraph Machine Learning Workbench
• XGBoost with Graph Features
• Graph Neural Network and Explainability

2,880 views