GraphDF: A Discrete Flow Model for Molecular Graph Generation

This is a guest post by Shuiwang Ji about their recent work, accepted to ICML 2021.

Title: “GraphDF: A Discrete Flow Model for Molecular Graph Generation”

TL;DR:
- We propose GraphDF, a novel discrete latent variable model for molecular graph generation.
- We propose using an invertible modulo shift transform to sequentially generate graph nodes and edges from discrete latent variables.
- Our proposed method outperforms prior methods on random generation, property optimization, and constrained optimization tasks.

Code is now available as part of our DIG library.

We study the molecular generation problem and propose a novel method, GraphDF, that achieves new state-of-the-art performance. While prior methods use continuous latent variables, we argue that discrete latent variables are better suited to modeling the categorical distribution of graph nodes and edges.

In GraphDF, the molecular graph is generated sequentially: at each step, a modulo shift transform converts a sampled discrete latent variable into the categorical number of a graph node or edge type. Using discrete latent variables avoids the negative effects of dequantization and models the underlying distribution of graph structures more accurately. The modulo shift transform uses graph convolutional networks to capture conditional information from the most recent sub-graph, which ensures order invariance.

Comprehensive studies show that our method outperforms prior methods on random generation, property optimization, and constrained optimization tasks. Our method is the first work to model the density of complicated molecular graph data with discrete latent variables. We hope it offers new insight for the community to explore more powerful graph generation models in the future.
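To make the invertible modulo shift transform mentioned above more concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the GraphDF implementation: the function names, the number of categories, and the fixed shift values are all hypothetical, and the sign convention (adding the shift during generation) is assumed. In the actual method, each shift is predicted by a graph convolutional network conditioned on the current sub-graph, whereas here plain integers stand in for those predictions.

```python
import random


def modulo_shift(z, mu, k):
    """Generation direction (assumed convention): map a discrete latent
    value z in {0, ..., k-1} to a category index by shifting modulo k."""
    return (z + mu) % k


def inverse_modulo_shift(x, mu, k):
    """Inverse direction: recover the latent value from a category index."""
    return (x - mu) % k


def generate(z, shifts, k):
    """Apply a stack of modulo shifts in order to a latent value."""
    x = z
    for mu in shifts:
        x = modulo_shift(x, mu, k)
    return x


def infer_latent(x, shifts, k):
    """Undo the stack of modulo shifts in reverse order."""
    z = x
    for mu in reversed(shifts):
        z = inverse_modulo_shift(z, mu, k)
    return z


# Hypothetical setup: 4 node types and three stacked shifts.
# In a real model the shifts would come from a network conditioned
# on the sub-graph generated so far; here they are fixed integers.
NUM_TYPES = 4
SHIFTS = [3, 1, 2]

rng = random.Random(0)
latent = rng.randrange(NUM_TYPES)            # sample a discrete latent variable
node_type = generate(latent, SHIFTS, NUM_TYPES)
recovered = infer_latent(node_type, SHIFTS, NUM_TYPES)
assert recovered == latent                   # the transform is exactly invertible
```

Because every value stays in the set {0, ..., k-1} and the map is a bijection on that set, no dequantization noise is needed to move between the latent variable and the node or edge type.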