Anglia Ruskin Research Online (ARRO)

Prediction Model for Cardiovascular Disease in Patients with Diabetes Using Machine Learning Derived and Validated in Two Independent Korean Cohorts

Journal contribution
Posted on 2024-07-23, 12:35. Authored by Hyunji Sang, Hojae Lee, Myeongcheol Lee, Jaeyu Park, Sunyoung Kim, Ho Geol Woo, Masoud Rahmati, Ai Koyanagi, Lee Smith, Sihoon Lee, You-Cheol Hwang, Tae Sun Park, Hyunjung Lim, Dong Keon Yon, Sang Youl Rhee

This study aimed to develop and validate a machine learning (ML) model tailored to the Korean population with type 2 diabetes mellitus (T2DM) to provide a superior method for predicting the development of cardiovascular disease (CVD), a major chronic complication in these patients. We used data from two cohorts, namely the discovery (one hospital; n = 12,809) and validation (two hospitals; n = 2,019) cohorts, recruited between 2008 and 2022. The outcome of interest was the presence or absence of CVD at 3 years. We selected various ML-based models with hyperparameter tuning in the discovery cohort and performed area under the receiver operating characteristic curve (AUROC) analysis in the validation cohort. CVD was observed in 1,238 (10.2%) patients in the discovery cohort. The random forest (RF) model exhibited the best overall performance among the models, with an AUROC of 0.830 (95% confidence interval [CI] 0.818–0.842) in the discovery dataset and 0.722 (95% CI 0.660–0.783) in the validation dataset. Creatinine and glycated hemoglobin levels were the most influential factors in the RF model. This study introduces a pioneering ML-based model for predicting CVD in Korean patients with T2DM, outperforming existing prediction tools and providing a groundbreaking approach for early personalized preventive medicine.
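
For readers unfamiliar with the metric reported above, the following is a minimal Python sketch of how an AUROC and a 95% confidence interval can be computed for a binary outcome. It is not the authors' code. The abstract does not say how its intervals were obtained, so the percentile bootstrap used here is an assumption, and the data are synthetic (the function names `auroc`, `bootstrap_ci`, and `demo`, the sample size, and the score distributions are all illustrative).

```python
import random


def auroc(labels, scores):
    """AUROC via the Mann-Whitney U statistic; tied scores get half credit."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative")
    pairs = sorted(zip(scores, labels))
    rank_sum_pos = 0.0
    i, n = 0, len(pairs)
    while i < n:
        # Find the run of tied scores starting at position i.
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of 1-based ranks i+1 .. j
        rank_sum_pos += avg_rank * sum(lab for _, lab in pairs[i:j])
        i = j
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUROC (an assumed method)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[k] for k in idx]
        if 0 < sum(ys) < n:  # skip resamples containing a single class
            stats.append(auroc(ys, [scores[k] for k in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[max(int((1 - alpha / 2) * n_boot) - 1, 0)]
    return lo, hi


def demo(n=500, seed=1):
    """Synthetic example: roughly 10% event rate, higher scores for events."""
    rng = random.Random(seed)
    labels = [1 if rng.random() < 0.10 else 0 for _ in range(n)]
    scores = [rng.gauss(0.8 if y else 0.0, 1.0) for y in labels]
    point = auroc(labels, scores)
    lo, hi = bootstrap_ci(labels, scores, n_boot=200)
    return point, lo, hi


if __name__ == "__main__":
    point, lo, hi = demo()
    print(f"AUROC {point:.3f} (95% CI {lo:.3f} to {hi:.3f})")
```

In practice the `scores` would be the predicted probabilities from a fitted model, evaluated on a cohort the model was not trained on, as with the validation cohort described in the abstract.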

History

Refereed

  • Yes

Volume

14

Publication title

Scientific Reports

ISSN

2045-2322

Publisher

Nature Portfolio

File version

  • Published version

Item sub-type

Article

Affiliated with

  • School of Psychology and Sport Science Outputs
