O04_06

Development of a Supervised Deep Learning Method for DNA Sequence Estimation from DNA Images

Hirotaka KONDO*1, Akinori KUZUYA1, 2, 3, Akihiko KONAGAYA3

1Organization for Research and Development of Innovative Science and Technology, Kansai University
2Department of Chemistry and Materials Engineering, Kansai University
3Molecular Robotics Research Institute, Co., Ltd.
(*E-mail: kondo@kansai-u.ac.jp)

Atomic force microscopy (AFM) has attracted widespread attention as a technique for observing the shape and structure of DNA molecules at the nanoscale. However, noise and limited resolution make it difficult to read nucleotide sequences directly from AFM images. In recent years, various deep learning methods have been applied to remove noise and increase resolution in order to overcome these challenges [1].

In this study, we propose a new method for estimating sequences from DNA images using supervised machine learning. We generated 360 images of a DNA model, rotating it in 1-degree increments, using a molecular image viewer (VMD). We then aligned the pixel images on the center of each image to create molecular surface images. Supervised machine learning was performed on these images and the corresponding sequence information. In particular, we used a modified T bearing a large, bulging functional group for T discrimination and a modified C for C discrimination, so that the position of each modified nucleotide could be recognized. Training on groups with one mutation, two mutations, and multiple mutations resulted in accurate sequence discrimination.
Supervised machine learning was also performed on grayscale images, which more closely resemble AFM images, and this may also be useful for discriminating DNA sequences.
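
The image preparation described above (one view per degree of rotation, centering, and grayscale conversion) might be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the function names, the `render` callback standing in for a VMD rendering step, the dark-background assumption, and the use of an intensity-weighted centroid for centering are all assumptions introduced here.

```python
# Illustrative sketch only: all names below are hypothetical and the centering
# rule (intensity-weighted centroid on a dark background) is an assumption.

def to_grayscale(rgb_image):
    """Convert rows of (r, g, b) pixels to luminance (ITU-R BT.601 weights)."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb_image]


def center_image(gray, background=0.0):
    """Shift the image so its intensity-weighted centroid sits at the center."""
    height, width = len(gray), len(gray[0])
    total = sum(sum(row) for row in gray)
    if total == 0:
        # Empty image: nothing to center, return a copy unchanged.
        return [row[:] for row in gray]
    cy = sum(y * v for y, row in enumerate(gray) for v in row) / total
    cx = sum(x * v for row in gray for x, v in enumerate(row)) / total
    dy = round((height - 1) / 2 - cy)
    dx = round((width - 1) / 2 - cx)
    out = [[background] * width for _ in range(height)]
    for y, row in enumerate(gray):
        for x, v in enumerate(row):
            ny, nx = y + dy, x + dx
            if 0 <= ny < height and 0 <= nx < width:
                out[ny][nx] = v  # pixels shifted outside the frame are dropped
    return out


def build_dataset(render, sequence, step_deg=1):
    """Pair one centered grayscale view per rotation angle with its sequence label.

    `render(angle)` is a hypothetical callback returning an RGB image of the
    DNA model rotated by `angle` degrees.
    """
    return [(center_image(to_grayscale(render(angle))), sequence)
            for angle in range(0, 360, step_deg)]
```

With `step_deg=1` this yields 360 labeled views per model, matching the number of images described above; the resulting (image, sequence) pairs could then be passed to any supervised classifier.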

[1] Xianran HU, Qing LIU, Gregory GUTMANN, Masayuki YAMAMURA, Akinori KUZUYA, Akihiko KONAGAYA: High-Resolution AFM Imaging of DNA Structures: An Approach via Cycle GANs and Virtual Reality Integration, CBI 2023 conference, p.180 (2023).