P05-06
In silico prediction of total clearance, volume of distribution, and half-life with deep learning
Ryoko TERADA *, Reiko WATANABE, Kenji MIZUGUCHI
Institute for Protein Research of Osaka University
( * E-mail: u264709j@ecs.osaka-u.ac.jp )
In drug discovery, in silico screening with artificial intelligence (AI) is being put to practical use. Predicting compound profiles comprehensively from chemical structure in the early stages of drug discovery is expected to make the process more efficient by reducing the cost and time required for experiments. Pharmacokinetics, the time course of a drug in the body from administration to excretion, is described by various parameters. Among them, total clearance (CLtot), volume of distribution at steady state (Vdss), and half-life (T1/2), which is derived from these two parameters, are known to strongly affect the blood concentration profile and can be used to estimate the dose and dosing interval. Although prediction of these parameters has long been attempted, building prediction models remains challenging because the parameter values reflect complex interactions among multiple phenomena and the amount of publicly available human clinical data is limited. Recently, AI technologies such as deep learning have developed rapidly, and techniques such as multi-task learning and fine-tuning have made it possible to learn more efficiently by exploiting similarities among data.
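The dependence of T1/2 on the other two parameters can be illustrated with a minimal sketch. It assumes a one-compartment model with first-order elimination, where T1/2 = ln(2) × Vd / CL; using Vdss in place of Vd is an approximation, and the abstract does not state which relation the authors used. The function names and units (L/h, L, mg/L) are hypothetical choices for illustration.

```python
import math


def half_life_hours(cl_l_per_h: float, vd_l: float) -> float:
    """Half-life from clearance and volume of distribution.

    Assumption: one-compartment model, first-order elimination,
    so T1/2 = ln(2) * Vd / CL.
    """
    if cl_l_per_h <= 0 or vd_l <= 0:
        raise ValueError("clearance and volume must be positive")
    return math.log(2) * vd_l / cl_l_per_h


def iv_maintenance_rate_mg_per_h(cl_l_per_h: float, target_css_mg_per_l: float) -> float:
    """Intravenous dosing rate that holds a target steady-state concentration.

    Assumption: at steady state, rate in equals rate out, i.e. CL * Css.
    Oral dosing would also need bioavailability, which is not modeled here.
    """
    return cl_l_per_h * target_css_mg_per_l


if __name__ == "__main__":
    # Illustrative values only, not data from the study.
    print(half_life_hours(cl_l_per_h=5.0, vd_l=50.0))
    print(iv_maintenance_rate_mg_per_h(cl_l_per_h=5.0, target_css_mg_per_l=2.0))
```

Because T1/2 is tied to CLtot and Vdss in this way, errors in either predicted parameter carry over to a derived half-life, which is one reason the three parameters are natural candidates for joint modeling.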
In this study, we aim to build regression models that predict CLtot, Vdss, and T1/2 by deep learning on public data, taking advantage of the similarity and relevance of the three parameters to build a multi-task model. First, after intensive manual curation, we created datasets covering over 1,200 compounds with chemical structure information and values of the three parameters, collected from the ChEMBL database and the literature. Then, we built prediction models with graph convolutional networks and evaluated the resulting single- and multi-task models using the root mean square error (RMSE) and the coefficient of determination (R²). We also analyzed multi-task models with added related tasks such as fraction unbound in plasma (fu,p) and mean residence time (MRT). Multi-task models showed better accuracy than single-task models, especially for T1/2. We also found that the final accuracy depended on which tasks were combined in multi-task learning. We will discuss the appropriate model settings for predicting CLtot, Vdss, and T1/2.
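The two evaluation metrics named above can be sketched as follows. This is an illustrative pure-Python version, not the authors' evaluation code. It assumes that a compound lacking a measurement for a given task is marked with None and excluded from that task's score, which is a common convention in multi-task settings but is not stated in the abstract. Whether the values were log-transformed before scoring is likewise not stated.

```python
import math
from typing import List, Optional, Sequence, Tuple


def _labeled_pairs(
    y_true: Sequence[Optional[float]], y_pred: Sequence[float]
) -> List[Tuple[float, float]]:
    """Keep only (observed, predicted) pairs whose observed value exists."""
    pairs = [(t, p) for t, p in zip(y_true, y_pred) if t is not None]
    if not pairs:
        raise ValueError("no labeled compounds for this task")
    return pairs


def rmse(y_true: Sequence[Optional[float]], y_pred: Sequence[float]) -> float:
    """Root mean square error over the labeled compounds."""
    pairs = _labeled_pairs(y_true, y_pred)
    return math.sqrt(sum((t - p) ** 2 for t, p in pairs) / len(pairs))


def r_squared(y_true: Sequence[Optional[float]], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    pairs = _labeled_pairs(y_true, y_pred)
    mean_true = sum(t for t, _ in pairs) / len(pairs)
    ss_res = sum((t - p) ** 2 for t, p in pairs)
    ss_tot = sum((t - mean_true) ** 2 for t, _ in pairs)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical values for one task; the third compound is unlabeled.
    observed = [1.0, 2.0, None, 4.0]
    predicted = [1.0, 2.0, 9.0, 3.0]
    print(rmse(observed, predicted), r_squared(observed, predicted))
```

Scoring each task separately on its own labeled compounds lets single- and multi-task models be compared parameter by parameter, which matches how the abstract reports the gain for T1/2.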