P07-13
DiffInt: Integrating Explicit Hydrogen Bond Modeling into Diffusion Models for Structure-Based Drug Design
Masami SAKO *1, Nobuaki YASUO2, Masakazu SEKIJIMA1
1Department of Computer Science, Tokyo Institute of Technology
2Academy for Convergence of Materials and Informatics (TAC-MI), Tokyo Institute of Technology
( * E-mail: sako.m.ab@m.titech.ac.jp )
Structure-based drug design is a crucial approach in the drug discovery process and aims to create molecules that bind to specific target proteins. In recent years, significant progress has been made in applying deep learning methods to molecule generation, making it possible to generate ligand molecules directly in the protein pocket using 3D structural information. However, these methods do not incorporate protein-ligand interaction information, which makes it difficult to efficiently generate ligand molecules with high binding affinity.
In this study, we realize pharmacophore modeling that preserves hydrogen bonds between the protein and ligand molecules in structure-based drug design via a deep learning model. By introducing "interaction particles" that explicitly represent hydrogen bonds between the protein and ligand molecules, it becomes possible to generate ligand molecules that retain the desired hydrogen bonds. The model combines an E(3)-equivariant graph neural network with a diffusion model framework. The forward diffusion process gradually adds noise to the input ligand molecule, whereas the reverse process generates a new molecule through denoising. Both the protein pocket structure and the interaction particles remain fixed as conditions throughout these processes, leading to the generation of molecules with specific desired interactions.
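The conditioning scheme described above can be illustrated with a minimal sketch. It assumes a standard DDPM-style closed-form noising step with a linear beta schedule and covers atom coordinates only; the function names (make_alpha_bar, noise_ligand_only), the schedule values, and the array layout are hypothetical choices for illustration and are not taken from the DiffInt implementation.

```python
import numpy as np

def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=2e-2):
    """Cumulative signal-retention factors for an assumed linear beta schedule."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def noise_ligand_only(coords, is_ligand, t, alpha_bar, rng):
    """Noise ligand coordinates at step t; pocket and interaction particles stay fixed."""
    eps = rng.standard_normal(coords.shape)
    eps[~is_ligand] = 0.0  # no noise on the conditioning nodes
    a = alpha_bar[t]
    noised = coords.copy()
    noised[is_ligand] = (
        np.sqrt(a) * coords[is_ligand] + np.sqrt(1.0 - a) * eps[is_ligand]
    )
    return noised, eps

# Toy system: 5 pocket atoms, 2 interaction particles, 4 ligand atoms (random values).
rng = np.random.default_rng(0)
pocket = rng.normal(size=(5, 3))
interaction = rng.normal(size=(2, 3))
ligand = rng.normal(size=(4, 3))
coords = np.concatenate([pocket, interaction, ligand])
is_ligand = np.array([False] * 7 + [True] * 4)

alpha_bar = make_alpha_bar()
noised, eps = noise_ligand_only(coords, is_ligand, t=500, alpha_bar=alpha_bar, rng=rng)
```

In this sketch, a denoising network would be trained to predict eps for the ligand nodes while receiving all nodes as input, so that the fixed pocket atoms and interaction particles act as the condition at every step. Atom-type diffusion and the equivariant network itself are omitted.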
The model was trained on 100,000 protein-ligand complexes from the CrossDocked dataset and evaluated by generating ligand molecules 100 times for each protein pocket in a test set of 100 proteins. In both hydrogen bond reproducibility and hydrogen bond energies estimated from docking simulations, the model outperforms existing models.
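The abstract does not define how hydrogen bond reproducibility is computed. One plausible reading, sketched below purely as an assumption, is the fraction of reference hydrogen bonds (identified here by their protein-side atom) that also appear in a generated ligand pose. The function name hbond_reproducibility and the tuple format for protein atoms are hypothetical, and the example bonds are made-up toy values.

```python
def hbond_reproducibility(reference_hbonds, generated_hbonds):
    """Fraction of reference hydrogen bonds also found in the generated pose.

    Each bond is identified by a hashable protein-side key, for example
    (residue name, residue number, atom name). This definition is an
    assumption, not the metric stated by the authors.
    """
    reference = set(reference_hbonds)
    if not reference:
        return float("nan")  # undefined when the reference has no hydrogen bonds
    return len(reference & set(generated_hbonds)) / len(reference)

# Toy example: one of two reference bonds is reproduced; the extra bond is ignored.
reference = [("ASP", 189, "OD1"), ("SER", 195, "OG")]
generated = [("ASP", 189, "OD1"), ("GLY", 216, "O")]
score = hbond_reproducibility(reference, generated)
```

Averaging such a score over all generated ligands for a pocket, and then over all test pockets, would give a single summary number under this assumed definition.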