P06-02
A Novel Endometrial Cancer Patient Stratification Considering ARID1A Protein Expression and Function with Effective Use of Multi-omics Data
Junsoo SONG *1, Ayako UI2, Kenji MIZUGUCHI1, Reiko WATANABE1
1Laboratory for Computational Biology, Institute for Protein Research, Osaka University
2Institute of Development, Aging and Cancer, Tohoku University
( * E-mail: u920213a@ecs.osaka-u.ac.jp )
Conventional patient stratification based solely on mutations or mRNA expression often fails to reflect functional activity accurately. Because proteins are the cell's functional molecules, their expression reflects cellular state more directly. Stratifying patients by protein expression therefore offers significant advantages, but two challenges remain. First, the disparity in data volume across omics layers complicates efficient data use: proteomics data are scarce because of the technical complexity and relatively low throughput of the measurements. Second, for certain proteins such as ARID1A, neither mutation status nor protein expression alone is a reliable indicator of functional activity. ARID1A encodes a DNA-binding subunit of the SWI/SNF chromatin-remodeling complex and is the most frequently mutated gene among the subunits of this complex, with mutations particularly associated with ovarian and endometrial cancers. Developing therapeutic strategies requires considering ARID1A mutations, activity, and protein expression together. Research indicates that most ARID1A loss-of-function mutations are heterozygous, and that protein expression is detectable in all samples. More direct information about ARID1A activity is therefore needed for precise patient stratification.
To address these issues, we propose a new patient stratification strategy. We developed a machine learning model to supplement the insufficient ARID1A protein expression data. The model was trained on publicly available multi-omics data, integrating information from multiple databases. We inferred ARID1A activity in tumor tissue by estimating the transcriptional regulation of genes directly targeted by ARID1A. Using the fully supplemented proteomics data and the activity labels, we stratified patients into three groups (High, Mid, Low) that reflect both the function and the protein expression of ARID1A in each patient. We identified differentially expressed genes (DEGs) between the groups, and Gene Set Enrichment Analysis (GSEA) showed that our method highlights transcriptional variation in the tumor immune microenvironment that was ambiguous under stratification based on mRNA expression. Further investigation of the extracted DEGs may reveal genes that help predict the effectiveness of immune checkpoint blockade (ICB) treatment in ARID1A-deficient patients.
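The abstract does not specify how the activity score or the three groups are computed, so the following is only a minimal sketch under stated assumptions. It assumes that activity is approximated by the mean sign-adjusted z-score of the regulator's direct target genes, and that the groups are formed by median splits on protein expression and activity (High when both are high, Low when both are low, Mid otherwise). The function names, the `signs` vector, and the median thresholds are all hypothetical and are not taken from the study.

```python
import numpy as np


def zscore(x, axis=0):
    """Standardize along an axis; constant columns are left at zero."""
    x = np.asarray(x, dtype=float)
    mu = x.mean(axis=axis, keepdims=True)
    sd = x.std(axis=axis, keepdims=True)
    sd[sd == 0] = 1.0  # avoid division by zero for constant genes
    return (x - mu) / sd


def infer_activity(target_expr, signs):
    """Hypothetical activity score per sample.

    target_expr: samples x genes matrix of target-gene mRNA expression.
    signs: +1 for targets assumed to be activated by the regulator,
           -1 for targets assumed to be repressed.
    """
    z = zscore(target_expr, axis=0)
    return (z * np.asarray(signs, dtype=float)).mean(axis=1)


def stratify(protein, activity):
    """Assumed rule: median split on both axes, giving High / Mid / Low."""
    protein = np.asarray(protein, dtype=float)
    activity = np.asarray(activity, dtype=float)
    p_hi = protein >= np.median(protein)
    a_hi = activity >= np.median(activity)
    return np.where(p_hi & a_hi, "High",
                    np.where(~p_hi & ~a_hi, "Low", "Mid"))


# Toy data: 4 samples, 2 target genes (the second assumed repressed).
target_expr = [[1, 4], [2, 3], [3, 2], [4, 1]]
activity = infer_activity(target_expr, signs=[1, -1])
groups = stratify(protein=[0.1, 0.9, 0.2, 0.8], activity=activity)
```

In this toy case, samples where protein level and inferred activity disagree fall into the Mid group, which illustrates why combining the two measures can separate patients that either measure alone would pool together.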