P07-05

Development of Prediction Models for Membrane Permeability of Cyclic Peptides using 3D Descriptors obtained from Molecular Dynamics Simulations and 2D Descriptors

Masatake SUGITA*1,2, Yudai NOSO1, Takuya FUJIE1,2, Jianan LI1, Keisuke YANAGISAWA1,2, Yutaka AKIYAMA1,2

1School of Computing, Institute of Science Tokyo
2Middle Molecule IT-Based Drug Discovery Laboratory (MIDL), Institute of Science Tokyo
( * E-mail: sugita@bi.c.titech.ac.jp )

Improving membrane permeability is a critical issue in cyclic peptide drug discovery. Although membrane permeability can be predicted from molecular dynamics simulations,[1] this approach is computationally expensive. Alternatively, machine learning models can predict membrane permeability at negligible cost, but they require a larger dataset. However, only 7334 experimental membrane permeability values were available when we started our research, and the amount of data has not increased significantly since. Therefore, we developed a machine learning protocol that uses 3D descriptors obtained from molecular dynamics simulations in combination with 2D descriptors to generate a universal model at a realistic computational cost.
We targeted 252 peptides across four datasets. Several 3D descriptors were obtained from the conformations predicted outside the membrane, at the water/membrane interface, and inside the membrane using the replica exchange with solute tempering/replica exchange umbrella sampling method. Simple learning algorithms such as random forest and support vector machine were used. The best prediction performance was obtained using XGBoost, with a correlation coefficient R = 0.77 and a mean absolute error of 0.46. The important descriptors included those representing the hydrophilicity and hydrophobicity of the peptide, the conformational differences between the inside and outside of the membrane, the degree of freedom of the peptide, and the approximate shape of the peptide at the membrane center. In addition, we confirmed the ability of the model to predict the membrane permeability of peptides whose chemical structures differ from those in the training data: we excluded one of the four datasets, trained a new model using the protocol developed in this study, and predicted the excluded data. The results showed that the model generalizes, with R = 0.49 and RMSE = 0.85. In such situations, it is difficult to predict membrane permeability using only 2D descriptors, demonstrating that descriptors based on conformations obtained from MD are essential. We also added a new dataset consisting of 20 peptides and performed external validation.
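The leave-one-dataset-out check described above can be illustrated with a minimal sketch. It is not the implementation used in this study: the function names, the toy data, and the 1-nearest-neighbor learner (fit_1nn) are hypothetical placeholders standing in for the random forest, support vector machine, and XGBoost models, and the code uses only the Python standard library. It shows how 2D and 3D descriptors might be concatenated into one feature vector, and how R, MAE, and RMSE could be computed for each held-out dataset.

```python
import math
from typing import Callable, Dict, List, Sequence

Vector = List[float]


def pearson_r(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Pearson correlation coefficient R between observed and predicted values."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    denom = math.sqrt(var_t * var_p)
    # R is undefined when either series is constant.
    return cov / denom if denom > 0 else float("nan")


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def combine(desc_2d: Vector, desc_3d: Vector) -> Vector:
    """Concatenate 2D and 3D descriptors into a single feature vector."""
    return list(desc_2d) + list(desc_3d)


def fit_1nn(X: List[Vector], y: List[float]) -> Callable[[Vector], float]:
    """Hypothetical placeholder learner: return the label of the nearest training point."""
    def predict(x: Vector) -> float:
        nearest = min(range(len(X)), key=lambda i: math.dist(X[i], x))
        return y[nearest]
    return predict


def leave_one_dataset_out(
    X: List[Vector],
    y: List[float],
    groups: List[str],
    fit: Callable[[List[Vector], List[float]], Callable[[Vector], float]] = fit_1nn,
) -> Dict[str, Dict[str, float]]:
    """Hold out each dataset in turn, train on the others, and score the held-out set."""
    results: Dict[str, Dict[str, float]] = {}
    for held_out in sorted(set(groups)):
        train = [i for i, g in enumerate(groups) if g != held_out]
        test = [i for i, g in enumerate(groups) if g == held_out]
        predict = fit([X[i] for i in train], [y[i] for i in train])
        y_true = [y[i] for i in test]
        y_pred = [predict(X[i]) for i in test]
        results[held_out] = {
            "R": pearson_r(y_true, y_pred),
            "MAE": mae(y_true, y_pred),
            "RMSE": rmse(y_true, y_pred),
        }
    return results


# Toy data (assumed for illustration only): one 2D and one 3D descriptor per peptide.
X_demo = [combine([a], [b]) for a, b in [(0.0, 0.1), (3.0, 2.9), (1.0, 1.2), (4.0, 3.8)]]
y_demo = [-7.0, -5.5, -6.5, -5.0]
groups_demo = ["set_A", "set_A", "set_B", "set_B"]
scores_demo = leave_one_dataset_out(X_demo, y_demo, groups_demo)
```

In practice the placeholder learner would be replaced by an actual regressor through the fit argument, and each held-out group would correspond to one of the source datasets rather than to toy labels.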

[1] Masatake Sugita et al., J. Chem. Inf. Model., 62, 18, 4549-4560 (2022)