Graph Machine Learning
@graphml
Technologies
Everything about graph theory, computer science, machine learning, etc. If you have something worth sharing with the community, reach out to @gimmeblues, @chaitjo. Admins: Sergey Ivanov; Michael Galkin; Chaitanya K. Joshi
Recent posts
Published Nov 26
Jraph - A library for graph neural networks in jax.
Jraph is a new library by DeepMind for constructing GNNs in JAX (autograd computation) and Haiku (writing neural network layers). It could be useful if you cannot use PyTorch.
Published Nov 25
How do node features affect the performance of GNNs?
This is an open question that I have been thinking about recently. In particular, I was surprised by the results of a recent paper on Label Propagation on one dataset, Rice31 (table below). Some models achieve 80% accuracy, while others get 10% (random guess). The paper says that the node features are heterogeneous features such as gender or major, but after speaking with the authors it seems they use spectral embeddings instead.
I tried this dataset with GNNs and my results were close to a random guess (10%). I tried several variations of GNNs as well as of node features, but didn't get much higher than 15%. Then I tried GBDT with spectral embeddings and it gave me about 50% accuracy. I haven't tried LP on this dataset yet, but it would be remarkable if LP with spectral embeddings differed so drastically from GNNs.
This and other experiments led me to think that the message-passing paradigm is too strong an assumption, i.e. aggregating information simultaneously from all your neighbors may not be a good idea in general. The inductive bias of such a model could be wrong for a particular graph dataset. GNNs work on some graph datasets, but in those datasets the way node labels depend on the graph structure is very similar to how message passing works. In other words, if you were to create a dataset where a node's label equals the average label of its neighbors, then a GNN that does average aggregation would easily learn this dependency. But if the node labels depend on the structure in some counter-intuitive way (for example, by picking a neighbor at random and copying its label), then a GNN with average aggregation would fail. So GNN models don't have to follow the message-passing paradigm; they can have very different design principles, and that's something I think we will see in the coming years.
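To make the thought experiment above concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the ring graph, the random binary features, and a fixed (untrained) mean-aggregation rule standing in for a GNN layer. It builds two label sets on the same graph, one equal to the thresholded average of the neighbors' features and one copied from a randomly picked neighbor, and scores the same mean aggregator on both.

```python
import random


def ring_graph(n, hops=2):
    """Neighbor lists for a ring where each node links to `hops` nodes on each side."""
    return [[(i + d) % n for d in range(-hops, hops + 1) if d != 0] for i in range(n)]


def mean_aggregate(neighbors, values):
    """One round of average aggregation over each node's neighbors."""
    return [sum(values[j] for j in nbrs) / len(nbrs) for nbrs in neighbors]


def accuracy(pred, target):
    return sum(p == t for p, t in zip(pred, target)) / len(target)


rng = random.Random(0)  # fixed seed so the run is reproducible
n = 200
neighbors = ring_graph(n)
features = [rng.randint(0, 1) for _ in range(n)]  # hypothetical binary node features

# Dataset A: the label is the thresholded average of the neighbors' features.
labels_mean = [int(m >= 0.5) for m in mean_aggregate(neighbors, features)]

# Dataset B: the label is the feature of one neighbor picked at random.
labels_random = [features[rng.choice(nbrs)] for nbrs in neighbors]

# The "model": average aggregation followed by a threshold, with no training.
pred = [int(m >= 0.5) for m in mean_aggregate(neighbors, features)]

acc_mean = accuracy(pred, labels_mean)
acc_random = accuracy(pred, labels_random)
print(f"average-label dataset: {acc_mean:.2f}")
print(f"random-neighbor dataset: {acc_random:.2f}")
```

The aggregator matches the first label set by construction, and its accuracy should drop on the second. It does not fall all the way to chance in this toy setup, because a randomly picked neighbor agrees with the neighborhood majority more often than not.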
Published Nov 24
Fresh picks from ArXiv
Today at ArXiv: smoothing for link prediction, design space of GNNs, and scalable networks for heterogeneous graphs ⛰
If I forgot to mention your paper, please shoot me a message and I will update the post.
Conferences
- Node Similarity Preserving Graph Convolutional Networks, WSDM 2021
- Design Space for Graph Neural Networks with Jure Leskovec, NeurIPS 2020
Graphs
- VLG-Net: Video-Language Graph Matching Network for Video Grounding
- Scalable Graph Neural Networks for Heterogeneous Graphs
- Graph embeddings via matrix factorization for link prediction: smoothing or truncating negatives?
- Reinforcement Learning of Graph Neural Networks for Service Function Chaining
- Quantum algorithms for learning graphs and beyond
Survey
- Survey and Open Problems in Privacy Preserving Knowledge Graph: Merging, Query, Representation, Completion and Applications
- Subpath Queries on Compressed Graphs: a Survey
Published Nov 23
Network Repository
A cool interactive repository of about a thousand different graphs. It could be useful if you need graphs with specific properties for specific tasks.
Published Nov 23
Knowledge Graphs in NLP @ EMNLP 2020
A new digest from Michael Galkin on the applications of knowledge graphs in NLP at the last EMNLP conference. Much bigger models (6.5B parameters), more languages (100 languages for entity linking), more complex tasks (data to text).
Published Nov 20
Weisfeiler and Leman go sparse: Towards scalable higher-order graph embeddings
This is a guest post by Christopher Morris about their recent work accepted to NeurIPS 2020 that deals with higher-order WL algorithms.
Motivation: Since the power of GNNs is upper-bounded by the 1-dimensional Weisfeiler-Leman algorithm (WL) (Xu et al. 2019, Morris et al. 2019), it is natural to design GNNs based on insights from the k-dimensional WL (k-WL), which is a strictly more powerful heuristic for the graph isomorphism problem. Instead of computing colors or features for single vertices, the k-WL gets more powerful by computing colors for k-tuples defined over the vertex set, and defines a suitable notion of adjacency between them to do a message-passing style update. Hence, it accounts for higher-order interactions between vertices. However, it does not scale and may suffer from overfitting when used in a machine learning setting. Hence, it remains an important open problem to design WL-based graph learning methods that are simultaneously expressive, scalable, and non-overfitting.
Methodological Contribution: In our paper, we propose local variants of the k-WL and corresponding neural architectures, which consider a subset of the original neighborhood, making them more scalable and less prone to overfitting. Surprisingly, the expressive power of (one of) our algorithms is strictly higher than that of the original algorithm in terms of the ability to distinguish non-isomorphic graphs. We then lift our results to the neural setting and connect our findings to recent learning-theoretic results for GNNs (Garg et al., 2020), showing that our architectures offer better generalization errors.
Empirical results: Our experimental study confirms that the local algorithms, both the kernel and the neural architectures, lead to vastly reduced computation times and prevent overfitting. The kernel version establishes a new state of the art for graph classification on a wide range of benchmark datasets. In contrast, the neural version shows promising performance on large-scale molecular regression tasks.
Future Challenges: While our new sparse architecture leads to a boost in expressive power over standard GNNs and is less prone to overfitting than dense architectures, it still does not scale to truly large graphs. The main reason for this is the exponential dependence on k, i.e., the algorithm still considers all n^k tuples. Hence, designing scalable (higher-order) GNNs that can provably capture graph structure is an important future goal. In general, we believe that moving away from the restrictive graph isomorphism objective and deriving a deeper understanding of our architecture when optimized with stochastic gradient descent are important future goals.
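For readers new to WL, the sketch below shows the baseline the post starts from: plain 1-WL color refinement, written in standard-library Python. It is not the local k-WL from the paper. The function names and the three toy graphs are assumptions for illustration. Both graphs are refined jointly so that colors are comparable, and two graphs count as indistinguishable when their color histograms match. The small helper at the end just evaluates the n^k tuple count mentioned in the post.

```python
from collections import Counter


def refine(adj):
    """1-WL color refinement until the partition stops splitting.

    adj maps each node to a list of its neighbors.
    """
    colors = {v: 0 for v in adj}
    for _ in range(len(adj)):
        # A node's signature is its own color plus the sorted colors of its neighbors.
        sigs = {v: (colors[v], tuple(sorted(colors[u] for u in adj[v]))) for v in adj}
        palette = {s: i for i, s in enumerate(sorted(set(sigs.values())))}
        new = {v: palette[sigs[v]] for v in adj}
        if len(set(new.values())) == len(set(colors.values())):
            break  # no color class was split, so the partition is stable
        colors = new
    return colors


def wl_indistinguishable(adj_a, adj_b):
    """True if 1-WL assigns both graphs the same histogram of colors."""
    # Refine the disjoint union so that both graphs share one palette.
    union = {("a", v): [("a", u) for u in nbrs] for v, nbrs in adj_a.items()}
    union.update({("b", v): [("b", u) for u in nbrs] for v, nbrs in adj_b.items()})
    colors = refine(union)
    hist_a = Counter(c for (g, _), c in colors.items() if g == "a")
    hist_b = Counter(c for (g, _), c in colors.items() if g == "b")
    return hist_a == hist_b


def num_k_tuples(n, k):
    """Number of vertex k-tuples that k-WL assigns colors to."""
    return n ** k


hexagon = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: [4, 5], 4: [3, 5], 5: [3, 4]}
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}

print(wl_indistinguishable(hexagon, two_triangles))  # non-isomorphic, yet 1-WL cannot separate them
print(wl_indistinguishable(hexagon, path))
print(num_k_tuples(6, 3))
```

The hexagon and the two triangles are both 2-regular, so every node keeps the same color and 1-WL fails to tell them apart. This is the kind of case that motivates coloring k-tuples instead of single vertices, at the price of the n^k growth.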
Published Nov 19
Video: Recent Developments of Graph Network Architectures
I already posted the slides of a great lecture by Xavier Bresson on the latest GNNs. Here is also a video presentation of it.
Published Nov 18
Combining Label Propagation and Simple Models Out-performs Graph Neural Networks
This paper by Cornell and Facebook made a lot of noise on Twitter recently. In short, it shows that GNNs can be outperformed by simpler models such as MLP + Label Propagation (LP) on several large datasets.
They use LP (actually twice) to propagate the labels from training nodes to test nodes. LP has been used successfully for two decades (NIPS 2004 as well as this survey); it just had not been directly compared to GNNs. Unfortunately, LP does not use node features, so the authors propose to first use an MLP on node features and then use LP on the predictions of the MLP and on the labels.
This work applies only to transductive node classification, not to inductive node classification (applying a trained model to new graphs), link prediction, or graph classification. But for node classification it shows pretty good results in terms of speed and quality.
Another detail is that LP usually works on homophilous graphs, i.e. graphs where nodes with the same labels have a higher chance of being connected. While this assumption is reasonable, not all graphs have this type of connectivity. For example, mail that goes from a person to a post office to an aggregator to the recipient may connect nodes of different classes together. Petar Veličković talks about this in more detail.
I must add that it's not the first time we see GNNs being outperformed by simple models on existing graph datasets. A year ago there were many works showing that MLPs work better than GNNs on many graph classification datasets (e.g. this paper). MLPs alone don't work really well on OGB datasets, but MLP + LP does. Hopefully it will lead to more graph datasets and subsequently to more insights about which tools are the best for graph prediction problems.
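A rough sketch of the "base predictions, then label propagation" idea, in plain Python. It is a simplification and not the paper's exact procedure: the class scores below are hand-written stand-ins for MLP outputs, propagation uses a simple neighbor average, and the graph, `alpha`, and iteration count are all assumed values. Training labels are clamped at every step, and the other nodes mix their neighbors' scores with their own base scores.

```python
def argmax(row):
    return max(range(len(row)), key=row.__getitem__)


def propagate(neighbors, base_scores, train_labels, alpha=0.8, iters=50):
    """Label propagation that starts from base class scores.

    neighbors: list of neighbor lists; base_scores: per-node class scores;
    train_labels: dict mapping a training node to its known class.
    """
    num_classes = len(base_scores[0])

    def one_hot(c):
        return [1.0 if k == c else 0.0 for k in range(num_classes)]

    # Known labels replace the base scores of training nodes.
    base = [one_hot(train_labels[i]) if i in train_labels else list(row)
            for i, row in enumerate(base_scores)]
    scores = [row[:] for row in base]
    for _ in range(iters):
        new = []
        for i, nbrs in enumerate(neighbors):
            avg = [sum(scores[j][k] for j in nbrs) / len(nbrs) for k in range(num_classes)]
            new.append([alpha * a + (1 - alpha) * b for a, b in zip(avg, base[i])])
        for i, c in train_labels.items():
            new[i] = one_hot(c)  # clamp training nodes to their true labels
        scores = new
    return [argmax(row) for row in scores]


# Two triangles (0-1-2 and 3-4-5) joined by the edge 2-3; true classes are [0, 0, 0, 1, 1, 1].
neighbors = [[1, 2], [0, 2], [0, 1, 3], [2, 4, 5], [3, 5], [3, 4]]
# Hypothetical MLP outputs; node 2 is misclassified as class 1.
mlp_scores = [[0.9, 0.1], [0.7, 0.3], [0.4, 0.6], [0.3, 0.7], [0.3, 0.7], [0.2, 0.8]]
train_labels = {0: 0, 5: 1}

mlp_pred = [argmax(row) for row in mlp_scores]
lp_pred = propagate(neighbors, mlp_scores, train_labels)
print(mlp_pred)
print(lp_pred)
```

In this toy graph node 2 sits in a class-0 triangle, so propagation pulls its prediction back to class 0. The example is homophilous on purpose; on a graph where connected nodes tend to have different labels, the same step would hurt.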
Published Nov 17
Fresh picks from ArXiv
Today at ArXiv: new big graph dataset for graph classification, fast clustering, and reconstructing ancient documents with GNN ⚰️
If I forgot to mention your paper, please shoot me a message and I will update the post.
Graphs
- A Large-Scale Database for Graph Representation Learning
- Using Graph Neural Networks to Reconstruct Ancient Documents
- Learning to Drop: Robust Graph Neural Network via Topological Denoising
- Towards Better Approximation of Graph Crossing Number
- Distill2Vec: Dynamic Graph Representation Learning with Knowledge Distillation
- Node Attribute Completion in Knowledge Graphs with Multi-Relational Propagation
Conferences
- Duality-Induced Regularizer for Tensor Factorization Based Knowledge Graph Completion, NeurIPS 2020
- Higher-Order Spectral Clustering of Directed Graphs, NeurIPS 2020
- Molecular Mechanics-Driven Graph Neural Network with Multiplex Graph for Molecular Structures, Workshop NeurIPS 2020
- IGSQL: Database Schema Interaction Graph Based Neural Model for Context-Dependent Text-to-SQL Generation, EMNLP 2020
Published Nov 16
Workshops at NeurIPS 2020
There are more than 60 workshops at NeurIPS this year. Some relevant ones (with accepted papers available) are Learning Meets Combinatorial Algorithms (LMCA) on ML + NP-hard problems, and Differential Geometry meets Deep Learning (DiffGeo4DL) on geometry and manifolds.
Published Nov 13
EMNLP 2020 stats
Dates: Nov 16-18
Where: Online
Price: $200 ($75 students)
Graph papers can be found at paper digest.
• 3359 submissions (vs 2905 in 2019)
• 754/520 accepted EMNLP/Findings (vs 660 in 2019)
• 22.4% / 20.5% acceptance rate (vs 22.7% in 2019)
• ~104 total graph papers (8% of total)