P11-06
Survival Analysis of Chronic Kidney Disease Using Multi-Regional Data from the LIFE Study
Hiromu MATSUMOTO*1, Tomohiro RYU2, Koichiro KATO1,3, Haruhisa FUKUDA4
1Department of Applied Chemistry, Graduate School of Engineering, Kyushu University
2Department of Chemistry, Graduate School of Science, Kyushu University
3Center for Molecular Systems, Kyushu University
4Department of Health Care Administration and Management, Graduate School of Medical Sciences, Kyushu University
( * E-mail: matsumoto.hiromu.238@s.kyushu-u.ac.jp )
By late 2022, the number of dialysis patients in Japan had reached approximately 350,000, highlighting the severity and prevalence of chronic kidney disease (CKD). Given the irreversible nature of renal function decline, early detection and prevention are crucial. However, CKD is often asymptomatic in its early stages, making timely identification challenging. Consequently, there is a critical need to develop predictive models that can identify early indicators of CKD and more effectively target at-risk populations.
Accurate prediction of kidney function trajectories, in particular the glomerular filtration rate (GFR) and its estimate (eGFR), is essential for the timely diagnosis and classification of CKD, which is diagnosed when eGFR falls below 60 mL/min/1.73 m². Existing machine learning models show potential in predicting eGFR, but they depend heavily on prior eGFR data. Models that can predict eGFR decline independently of historical measurements are needed to improve predictive accuracy and to clarify the factors driving renal function decline.
To address these challenges, we utilized data from the Longevity Improvement & Fair Evidence (LIFE) Study, a large-scale, multi-regional cohort study that integrates health-related data. This comprehensive database allowed for survival analysis across 14 municipalities, enabling an examination of the impact of regional factors on survival outcomes alongside various health metrics.
Diabetic patients with fasting blood glucose levels of 126 mg/dL or higher and eGFR between 60 and 70 mL/min/1.73 m² were classified as high-risk and tracked in this study. The survival analysis dataset was split into training and test sets at an 8:2 ratio, with a balanced distribution of patients who crossed the eGFR threshold of 60 mL/min/1.73 m² during follow-up and those who did not. The training set contained 8,563 records from 2,354 patients, and the test set contained 2,192 records from 389 patients. Explanatory variables included biometric measurements (e.g., height, weight, blood pressure), self-reported lifestyle factors (e.g., exercise, sleep habits), and regional residence, encoded as a dummy variable. The primary endpoint was eGFR decline, with the event defined as eGFR falling below 60 mL/min/1.73 m². Models were constructed using the Cox proportional hazards model and Random Survival Forest.
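The cohort selection, event definition, and stratified split described above can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the record layout `(year, fbg, egfr)`, the use of the first check-up as baseline, the inclusive 60 to 70 window, censoring at the last check-up, the patient-level split, and the function names `build_survival_rows` and `stratified_split` are all assumptions made for the example, and the data are toy values.

```python
import random
from collections import defaultdict

FBG_MIN = 126.0     # mg/dL, fasting blood glucose inclusion threshold (from the text)
EGFR_EVENT = 60.0   # mL/min/1.73 m^2, event threshold (from the text)
EGFR_MAX = 70.0     # mL/min/1.73 m^2, upper bound of the baseline window (from the text)


def build_survival_rows(checkups):
    """Turn longitudinal check-ups into (duration_years, event_observed) per patient.

    checkups: dict of patient_id -> list of (year, fbg, egfr), sorted by year.
    Assumption: the first check-up is the baseline visit.
    """
    rows = {}
    for pid, visits in checkups.items():
        year0, fbg0, egfr0 = visits[0]
        # Inclusion: FBG >= 126 mg/dL and baseline eGFR within [60, 70]
        if fbg0 < FBG_MIN or not (EGFR_EVENT <= egfr0 <= EGFR_MAX):
            continue
        if len(visits) < 2:
            continue  # no follow-up, nothing to analyse
        # Event: first follow-up visit with eGFR below 60
        event_year = next((y for y, _, e in visits[1:] if e < EGFR_EVENT), None)
        if event_year is not None:
            rows[pid] = (event_year - year0, True)
        else:
            # Assumption: censor at the last observed check-up
            rows[pid] = (visits[-1][0] - year0, False)
    return rows


def stratified_split(rows, test_frac=0.2, seed=0):
    """Patient-level 8:2 split that keeps the event/censored ratio balanced."""
    rng = random.Random(seed)
    by_event = defaultdict(list)
    for pid, (_, event) in rows.items():
        by_event[event].append(pid)
    train, test = [], []
    for pids in by_event.values():
        pids.sort()          # deterministic order before shuffling
        rng.shuffle(pids)
        n_test = round(len(pids) * test_frac)
        test += pids[:n_test]
        train += pids[n_test:]
    return sorted(train), sorted(test)


# Toy data (hypothetical values, for illustration only)
toy = {
    "A": [(2015, 130, 65), (2016, 128, 62), (2017, 135, 58)],  # event after 2 years
    "B": [(2015, 140, 68), (2016, 138, 66)],                   # censored after 1 year
    "C": [(2015, 110, 65), (2016, 112, 55)],                   # excluded: FBG < 126
    "D": [(2015, 150, 75), (2016, 150, 59)],                   # excluded: eGFR > 70
}
rows = build_survival_rows(toy)
```

The resulting duration and event columns are the inputs that a Cox proportional hazards model or a Random Survival Forest would take alongside the explanatory variables. Splitting by patient rather than by record avoids the same patient appearing in both sets; whether the study split at the patient or record level is not stated in the abstract.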
The survival analysis using the Cox proportional hazards model identified risk factors consistent with those previously reported. Furthermore, the results suggested that regional residence may be associated with the rate of CKD progression. Ongoing work includes refining these findings through further analyses, including the use of SurvSHAP, a tool that provides interpretability for survival models, to gain deeper insight into the explanatory variables.
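For reference, a Cox proportional hazards model with regional dummy variables is conventionally written in the form below. This is the textbook formulation, given here only to show how a regional effect would be read off as a hazard ratio; the symbols $\boldsymbol{\beta}$, $\gamma_r$, and $d_{ir}$, and the choice of one municipality as the reference category, are notational assumptions and are not taken from the abstract.

```latex
% Hazard for patient i with covariates x_i (biometric and lifestyle variables)
% and dummy variables d_{ir} = 1 if patient i lives in municipality r, else 0.
% Municipality 1 is taken as the reference category (an assumption).
h(t \mid \mathbf{x}_i, \mathbf{d}_i)
  = h_0(t)\,\exp\!\Bigl(\boldsymbol{\beta}^{\top}\mathbf{x}_i
      + \sum_{r=2}^{14} \gamma_r\, d_{ir}\Bigr)

% Hazard ratio of municipality r relative to the reference, other covariates equal:
\mathrm{HR}_r = \exp(\gamma_r)
```

Under this form, a value of $\mathrm{HR}_r$ above 1 would indicate a faster rate of reaching eGFR below 60 mL/min/1.73 m² in municipality $r$ than in the reference municipality, with all other covariates held equal.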