Denoising Diffusion Is Still All You Need (Weekend Reading)

In the previous post in June, we covered the emergence of denoising diffusion models (DDPMs) in generative Geometric DL tasks. Five months later, we can acknowledge that diffusion models have become first-class citizens in Geometric DL, with numerous works appearing on arXiv every week. Here are 4 recent and very interesting works you might want to check out:

1️⃣ DiGress by Clément Vignac, Igor Krawczuk, and the EPFL team is an unconditional graph generation model (with the option of incorporating a score-based function for conditioning on graph-level features like energy). DiGress is a discrete diffusion model: it operates on discrete node types (like atom types C, N, O) and edge types (like single / double / triple bonds), and adding noise corresponds to multiplying by a transition matrix (from one type to another) built from the marginal probabilities of the types in the training set. The denoising neural net is a modified Graph Transformer. It works for many graph families (planar graphs, SBMs, and molecules), the code is available, and there is a video from the reading group presentation!

2️⃣ DiffDock by Gabriele Corso, Hannes Stärk, Bowen Jing, and the MIT team is a score-based generative model for molecular docking, i.e., given a ligand and a protein, predicting how the ligand binds to the target protein. DiffDock runs the diffusion process in the product space of translations T(3), rotations SO(3), and torsion angles SO(2)^m: (1) positioning the ligand relative to the protein (the binding site, often called the pocket, is unknown in advance, so this is blind docking), (2) defining the rotational orientation of the ligand, and (3) defining the torsion angles of the conformation. DiffDock trains 2 models: the score model for predicting actual coordinates and the confidence model for estimating the likelihood of the generated prediction.
Both models are SE(3)-equivariant networks over point clouds, but the heavier score model works at the residue level on alpha-carbons (with features initialized from the now-famous ESM2 protein LM), while the confidence model uses fine-grained atom representations. Initial ligand structures are generated by RDKit. DiffDock dramatically improves the prediction quality, the code is available, and you can even upload your own proteins (PDB) and ligands (SMILES) in the online demo on HuggingFace Spaces to test it out!

3️⃣ DiffSBDD by Schneuing, Du, and the team from EPFL, Cornell, Cambridge, MSR, USTC, Oxford is a diffusion model for generating novel ligands conditioned on the protein pocket. DiffSBDD can be implemented with 2 approaches: (1) pocket-conditioned ligand generation, where the pocket is fixed; (2) inpainting-like generation that approximates the joint distribution of pocket-ligand pairs. Both approaches rely on a tuned equivariant diffusion model (EDM, ICML 2022) with an EGNN as the denoising model. Practically, ligands and proteins are represented as point clouds with categorical features and 3D coordinates (proteins can be represented by alpha-carbon residues or full atoms, with one-hot encoding of residues; ESM2 could be used here in the future), so diffusion is performed over the 3D coordinates while ensuring equivariance. The code is already available!
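
To make "ensuring equivariance" a bit more concrete, here is a minimal NumPy sketch of a simplified EGNN-style coordinate update together with a numerical check that it commutes with rotations and translations. This is not the DiffSBDD code: the function names (`egnn_coord_update`, `toy_weight`, `random_rotation`) are hypothetical, and `toy_weight` is a hand-written stand-in for the learned MLPs that a real EGNN would use. The only property it relies on is that the weights depend on invariant quantities (node features and squared distances).

```python
import numpy as np

def egnn_coord_update(x, h, weight_fn):
    """One simplified EGNN-style coordinate update (illustrative sketch).

    x: (n, 3) coordinates, h: (n, d) rotation-invariant node features.
    Each point moves along the relative vectors (x_i - x_j), scaled by
    weights that depend only on invariants.
    """
    diff = x[:, None, :] - x[None, :, :]                  # (n, n, 3) relative vectors
    dist2 = (diff ** 2).sum(axis=-1)                      # (n, n) squared distances
    w = weight_fn(h[:, None, :], h[None, :, :], dist2)    # (n, n) invariant weights
    n = x.shape[0]
    return x + (diff * w[..., None]).sum(axis=1) / max(n - 1, 1)

def toy_weight(h_i, h_j, dist2):
    # Hypothetical stand-in for a learned MLP: uses only invariant inputs.
    return np.tanh((h_i * h_j).sum(axis=-1)) / (1.0 + dist2)

def random_rotation(rng):
    # Random orthogonal matrix via QR, with the sign fixed so that det = +1.
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]
    return q

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 3))        # toy point cloud (made-up data)
h = rng.normal(size=(5, 4))        # toy invariant features (made-up data)
R, t = random_rotation(rng), rng.normal(size=3)

# Transform-then-update should match update-then-transform.
moved_first = egnn_coord_update(x @ R.T + t, h, toy_weight)
moved_last = egnn_coord_update(x, h, toy_weight) @ R.T + t
is_equivariant = np.allclose(moved_first, moved_last)
```

The check passes because rotating and translating the input leaves the distances unchanged and rotates every relative vector the same way, so the update inherits the transformation. A denoiser built from such layers predicts noise that rotates with the input, which is what lets the diffusion over 3D coordinates stay equivariant.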