P05-03
Development of tools to enhance the extracting process of ADME activity information from the Common Technical Document (CTD)
Masataka KURODA *1, 2, Reiko WATANABE3, Hitoshi KAWASHIMA1, Yasuhiko HASHIDA4, Atsuo MATSUEDA4, Kenji MIZUGUCHI3
1National Institutes of Biomedical Innovation, Health and Nutrition
2Mitsubishi Tanabe Pharma Corporation
3Institute for Protein Research, Osaka University
4Chinou Jouhou Shisutemu Inc.
( * E-mail: m-kuroda@nibiohn.go.jp )
An application for approval of a pharmaceutical product must include many types of data. Among these, absorption, distribution, metabolism and excretion (ADME)-related data are abundant and of high quality because they are generated using quality-controlled experimental protocols. Generating, publishing and reusing such data is important in drug discovery and development, both for building databases and for building predictive models of compound profiles. The Common Technical Document (CTD) provides a format shared by Japan, the US and the EU for the registration of human pharmaceutical products, with the aim of simplifying documentation and speeding up the review process. However, although the overall document structure is standardized, the representation of the data varies from drug to drug because each applicant decides how to report individual data. In addition, the CTD is distributed in PDF format, which is easy to view but makes data extraction and formatting non-trivial. In this study, we developed a dedicated tool, CTDCurator, with two aims: reducing the effort required for data formatting in particular, and improving data quality, such as the accuracy and consistency of words and units. The initial stages of curation always require tasks such as detecting experimental values and conditions, registering newly appearing words and units, standardizing words and units, and handling exceptions. With this tool, formatted data can be output in less time than by manual work. In addition, the quality improvements achieved by this tool can be applied to other data extraction tasks.
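
As a rough illustration of the unit standardization and new-unit registration tasks mentioned above, the following Python sketch parses a "value unit" string, maps the unit to a canonical spelling, and flags units that are not yet registered. It is not CTDCurator's implementation, which the abstract does not describe; the synonym table, the `Measurement` class and the `parse_measurement` function are all hypothetical names chosen for this example.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical synonym table (an assumption for illustration only):
# lower-cased, space-free variant spelling -> canonical unit.
UNIT_SYNONYMS = {
    "ug/ml": "ug/mL",
    "mcg/ml": "ug/mL",
    "\u00b5g/ml": "ug/mL",  # micro sign
    "\u03bcg/ml": "ug/mL",  # Greek small letter mu
    "ng/ml": "ng/mL",
    "h": "h",
    "hr": "h",
    "hrs": "h",
    "hour": "h",
    "hours": "h",
}

# A number followed by a unit that does not start with a digit, dot or space.
VALUE_UNIT = re.compile(
    r"^\s*(?P<value>[-+]?\d+(?:\.\d+)?)\s*(?P<unit>[^\d\s.].*?)\s*$"
)


@dataclass
class Measurement:
    value: float
    unit: str          # canonical unit if known, otherwise the raw text
    known_unit: bool   # False means the unit still needs to be registered


def parse_measurement(text: str) -> Optional[Measurement]:
    """Return a Measurement, or None when no 'value unit' pattern is found."""
    match = VALUE_UNIT.match(text)
    if match is None:
        return None  # exception case: left for manual review
    value = float(match.group("value"))
    raw_unit = match.group("unit")
    key = raw_unit.lower().replace(" ", "")
    canonical = UNIT_SYNONYMS.get(key)
    if canonical is None:
        return Measurement(value, raw_unit, False)
    return Measurement(value, canonical, True)


if __name__ == "__main__":
    for cell in ["12.5 ug/ml", "3 hrs", "0.8 L/kg", "n.d."]:
        print(repr(cell), "->", parse_measurement(cell))
```

In this sketch, an unrecognized unit such as "L/kg" is returned unchanged with `known_unit=False` so that a curator can register it, while text that does not match the pattern at all (for example "n.d.") yields `None` and would be handled as an exception.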