Post #893

@graphml

Graph Machine Learning

Views: 4,040

Published: 2 May 2025, 16:23


Graph Learning Will Lose Relevance Due To Poor Benchmarks
by Maya Bechler-Speicher, Ben Finkelshtein, Fabrizio Frasca, Luis Müller, Jan Tönshoff, Antoine Siraudin, Viktor Zaverkin, Michael M. Bronstein, Mathias Niepert, Bryan Perozzi, Mikhail Galkin, Christopher Morris

📜 arxiv

📣 Our new spicy ICML 2025 position paper. Graph learning is less trendy in the ML world than it was in 2020-2022. We believe the problem lies in poor benchmarks that hold the field back, and we suggest ways to fix it! We identified three problems:

#️⃣ P1: No transformative real-world applications. While LLMs and geometric generative models become more powerful and solve more complex tasks with every generation (from reasoning to protein folding), how transformative could a GNN on Cora or OGB be?

P1 Remedies: The community is overlooking many significant and transformative applications, including chip design and broader ML for systems, combinatorial optimization, and relational data (as highlighted by RelBench). Each of them offers billions of dollars in potential impact.

#️⃣ P2: While everything can be modeled as a graph, often it should not be. We ran a simple experiment on molecular datasets with a vanilla DeepSet w/o edges and a GNN on Cayley graphs (fixed edges for a given number of nodes), and their performance is quite competitive.

#️⃣ P3: Bad benchmarking culture (this one hits hard). It’s a mess 🙂 Small datasets (don’t use Cora and MUTAG in 2025), no standard splits, and in many cases recent models are clearly worse than GCN / Sage from 2020. It gets worse when evaluating generative models.

Remedies for P3: We need more holistic benchmarks that are harder to game and saturate. While this is a common problem for all ML fields, standard graph learning benchmarks are egregiously old and rather irrelevant for the scale of problems doable in 2025.

💡 As a result, it’s hard to build a true foundation model for graphs.
Instead of training a separate model on each dataset, we suggest using GNNs / GTs as processors in the “encoder-processor-decoder” blueprint, training them at scale, and tuning only the graph-specific encoders/decoders. For example, we pre-trained several models on PCQM4M-v2, COCO-SP, and MalNet Tiny, fine-tuned them on PascalVOC, Peptides-struct, and Stargazers, and found that graph transformers benefit from pre-training.

The project started around NeurIPS 2024, when Christopher Morris gathered us to discuss the pain points of graph learning and how to keep doing impactful research in this area. I believe the outcomes look promising, and we can re-imagine graph learning in 2025 and beyond!
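
For readers who want a concrete picture of the “encoder-processor-decoder” blueprint mentioned above, here is a rough, framework-free sketch. It is not the paper's code: the names `SharedProcessor` and `GraphModel`, the mean-aggregation layer, the hidden size, and the toy graphs are all assumptions made for illustration. It only runs a forward pass with random weights (no training) to show how one processor can sit between dataset-specific encoders and decoders.

```python
import random

def init_weights(in_dim, out_dim, rng):
    """Random (in_dim x out_dim) weight matrix stored as a list of rows."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(out_dim)] for _ in range(in_dim)]

def matmul(x, w):
    """Multiply an (n x in_dim) feature matrix by an (in_dim x out_dim) matrix."""
    cols = list(zip(*w))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in x]

class SharedProcessor:
    """Hypothetical processor: one round of mean-aggregation message passing.

    It works purely in the hidden space, so it can be reused across datasets.
    """
    def __init__(self, hidden, rng):
        self.w_self = init_weights(hidden, hidden, rng)
        self.w_nbr = init_weights(hidden, hidden, rng)

    def __call__(self, h, edges):
        n, d = len(h), len(h[0])
        agg = [[0.0] * d for _ in range(n)]
        deg = [0] * n
        for src, dst in edges:  # sum incoming neighbour states
            deg[dst] += 1
            for k in range(d):
                agg[dst][k] += h[src][k]
        # turn sums into means; isolated nodes keep an all-zero message
        agg = [[v / max(deg[i], 1) for v in row] for i, row in enumerate(agg)]
        own, nbr = matmul(h, self.w_self), matmul(agg, self.w_nbr)
        # combine own state and neighbour message, then apply ReLU
        return [[max(0.0, a + b) for a, b in zip(ro, rn)] for ro, rn in zip(own, nbr)]

class GraphModel:
    """Dataset-specific encoder and decoder wrapped around a shared processor."""
    def __init__(self, in_dim, out_dim, processor, hidden, rng):
        self.encoder = init_weights(in_dim, hidden, rng)   # dataset-specific
        self.processor = processor                         # shared
        self.decoder = init_weights(hidden, out_dim, rng)  # dataset-specific

    def __call__(self, x, edges):
        h = self.processor(matmul(x, self.encoder), edges)
        pooled = [sum(col) / len(h) for col in zip(*h)]    # mean-pool the nodes
        return matmul([pooled], self.decoder)[0]           # graph-level output

rng = random.Random(0)
HIDDEN = 8  # assumed hidden width
processor = SharedProcessor(HIDDEN, rng)

# Two toy "datasets" with different input and output sizes share one processor.
mol_model = GraphModel(in_dim=4, out_dim=1, processor=processor, hidden=HIDDEN, rng=rng)
img_model = GraphModel(in_dim=6, out_dim=3, processor=processor, hidden=HIDDEN, rng=rng)

x_mol = [[rng.random() for _ in range(4)] for _ in range(3)]
edges_mol = [(0, 1), (1, 0), (1, 2), (2, 1)]
x_img = [[rng.random() for _ in range(6)] for _ in range(4)]
edges_img = [(0, 1), (1, 2), (2, 3), (3, 0)]

mol_out = mol_model(x_mol, edges_mol)
img_out = img_model(x_img, edges_img)
print(len(mol_out), len(img_out))
```

In a real setup the processor would be a GNN or graph transformer pre-trained at scale, and only the small encoder and decoder would be tuned per dataset. As a side note, calling a model with `edges=[]` makes this toy processor transform every node independently before pooling, which is roughly the shape of an edge-free DeepSet-style baseline like the one in P2.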