Contenu du post
Discrete Neural Algorithmic Reasoning Guest post by Gleb Rodionov Paper: https://www.arxiv.org/abs/2402.11628 Blog: https://research.yandex.com/blog/discrete-neural-algorithmic-reasoning Code: https://github.com/yandex-research/dnar In this paper, we focus on generalizable and interpretable neural algorithmic reasoners. Starting with attention-based GNN, we inspect the reasons for generalization errors and propose several architectural modifications: feature discretization, hard attention and separating discrete and continuous data flows. All of these blocks are important for generalization: ⁃ State discretization prevents the model to use complex and redundant dependencies in the data; ⁃ Hard attention ensures that attention weights are not annealed for larger graphs. Also, hard attention limits the set of possible messages that each node can receive; ⁃ Separating discrete and continuous flows is needed to ensure that state discretization does not lose information about continuous data. As a result, we achieve a model that provably imitates the execution of several algorithms for any test data when trained with hints. Practically, on SALSA-CLRS, trained on problem sizes of 16 nodes, the model demonstrates perfect graph- and node-level scores generalizing to problems of up to 1600 nodes. For future work, it would be interesting to enhance the expressivity of the proposed model to a broader set of algorithms and investigate whether it is possible to train these models without hints.