P07-17
Report on Participation in the Tox24 Challenge: Construction of a High-Accuracy QSAR Predictive Model for Transthyretin Activity
Yuma IWASHITA *, Kyosuke KIMURA, Tomoya KOMASAKA, Koki SHISHIDO, Taichi NAKAMURA, Mizuho ASADA, Yoshihiro UESAWA
Laboratory of Medical Molecular Analysis, Meiji Pharmaceutical University
( * E-mail: m246208@std.my-pharm.ac.jp )
【Objective】The Tox24 Challenge [1, 2] is a QSAR competition aimed at advancing computational methods for predicting the in vitro activity of compounds. Specifically, the task is to predict the measured affinity of diverse compounds for transthyretin (TTR). The challenge is sponsored by AIDD, AiChemist, Chemical Research in Toxicology, and ICANN2024, and the submission deadline is August 31, 2024. Participants upload their predicted TTR affinities to a dedicated website, where their performance and ranking are displayed in real time on a leaderboard [3]. Our laboratory assembled a model development team for this competition. This presentation outlines the dataset preparation and machine learning techniques we used to construct our TTR activity prediction model.
【Methods】The dataset was obtained from the dedicated website on OCHEM [4]. The compound data were divided into three sets: training (1012 compounds), leaderboard (200 compounds), and blind (300 compounds). The training set includes measured TTR activities for model development. The leaderboard set lacks measured values; participants submit predicted activities for it to the organizers, who evaluate accuracy by the root mean square error (RMSE). The measured values for the leaderboard set will be released by the organizers on August 15 and will then be used to further develop the model. The blind set will be used for the final ranking of predictive models, and its measured values will remain undisclosed until the competition ends. We devised methods for adjusting molecular descriptors based on the SMILES data provided by the organizers and implemented rigorous validation to ensure the accuracy of our machine learning models.
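As an illustration of the evaluation metric, the following is a minimal sketch of how RMSE can be computed and how a k-fold validation loop can be organized on the training set. It is not our actual pipeline: the function names (`rmse`, `k_fold_indices`, `cross_validated_rmse`) are hypothetical, the choice of five folds is an assumption, and a mean-value baseline stands in for a real descriptor-based model.

```python
import math
import random


def rmse(measured, predicted):
    """Root mean square error between measured and predicted activities."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and deal them into k disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validated_rmse(activities, k=5, seed=0):
    """Average held-out RMSE of a mean-value baseline (placeholder model).

    Assumption: a real QSAR model trained on molecular descriptors would
    replace the baseline; only the validation loop is illustrated here.
    """
    scores = []
    for held_out in k_fold_indices(len(activities), k, seed):
        held_out_set = set(held_out)
        train = [a for i, a in enumerate(activities) if i not in held_out_set]
        baseline = sum(train) / len(train)  # predict the training mean
        test = [activities[i] for i in held_out]
        scores.append(rmse(test, [baseline] * len(test)))
    return sum(scores) / len(scores)
```

Comparing a candidate model's cross-validated RMSE against such a baseline gives a rough indication of whether the descriptors carry predictive signal before any predictions are submitted to the leaderboard.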
【Results and Discussion】The proposed model achieved an RMSE of 19.9 on the leaderboard dataset, securing the top position at the time of this report. The final winning model based on the blind dataset will be announced on September 19, 2024.
References:
1) Tetko IV. Tox24 Challenge. Chem Res Toxicol. 2024 Jun 17;37(6):825-826.
2) https://e-nns.org/icann2024/challenge/
3) https://ochem.eu//challenge/show.do?render-mode=full
4) https://ochem.eu/home/show.do