Post #346

@graphml

Graph Machine Learning

Vues1,760Nombre de vues

Publié20 nov.20/11/2020 13:30

Contenu

Contenu du post

Weisfeiler and Leman go sparse: Towards scalable higher-order graph embeddings This is a guest post by Christopher Morris about their recent work accepted to NeurIPS 2020 that deals with higher-order WL algorithms. Motivation: Since the power of GNNs is upper-bounded by the 1-dimensional Weisfeiler-Leman algorithm (WL) (Xu et al. 2019, Morris et al. 2019), it is natural to design GNNs based on insights from the k-dimensional WL (k-WL), which is a strictly more powerful heuristic for the graph isomorphism heuristic. Instead of computing colors or features for single vertices, the k-WL gets more powerful by computing colors for k-tuples, defined over the vertex set, and defines a suitable adjacency notion between them to do a message-passing style update. Hence, it accounts for the higher-order interactions between vertices. However, it does not scale and may suffer from overfitting when used in a machine learning setting. Hence, it remains an important open problem to design WL-based graph learning methods simultaneously expressive, scalable, and non-overfitting. Methodological Contribution: In our paper, we propose local variants of the k-WL and corresponding neural architectures, which consider a subset of the original neighborhood, making them more scalable, and less prone to overfitting. Surprisingly, the expressive power of (one of) our algorithms is strictly higher than the original algorithm in terms of the ability to distinguish non-isomorphic graphs. We then lift our results to the neural setting and connect our finding to recent learning theoretic results for GNNs (Garg et al., 2020), showing that our architectures offer better generalization errors. Empirical results: Our experimental study confirms that the local algorithms, both kernel and neural architectures lead to vastly reduced computation times and prevent overfitting. The kernel version establishes a new state-of-the-art for graph classification on a wide range of benchmark datasets. In contrast, the neural version shows promising performance on large-scale molecular regression tasks. Future Challenges: While our new sparse architecture leads to a boost in expressive power over standard GNNs and is less prone to overfitting than dense architectures, it still does not scale to truly large-scale. The main reason for this is the exponential dependence on k, i.e., the algorithm still considers all n**k tuples. Hence, designing scalable (higher-order) GNNs that can provably capture graph structure is an important future goal. In general, we believe that moving away from the restrictive graph isomorphism objective and deriving a deeper understanding of our architecture, when optimized with stochastic gradient descent, are important futures goals.