An Optimized Machine Learning Model for Heart Disease Prediction Using R Shiny

Yadhurani Dewi Amritha; Ni Luh Putu Ika Candrawengi; Md Wira Putra Dananjaya; Made Ari Riska Dayanti

doi:10.53863/kst.v8i01.1994

Authors

Yadhurani Dewi Amritha Universitas Pendidikan Nasional, Denpasar, Indonesia
Ni Luh Putu Ika Candrawengi Universitas Pendidikan Nasional, Denpasar, Indonesia
Md Wira Putra Dananjaya Universitas Pendidikan Nasional, Denpasar, Indonesia
Made Ari Riska Dayanti Universitas Pendidikan Nasional, Denpasar, Indonesia

DOI:

https://doi.org/10.53863/kst.v8i01.1994

Keywords:

machine learning, random forest, biostatistics, R Shiny, e-health

Abstract

Heart disease remains a leading cause of death worldwide, which makes early detection critical to improving patient outcomes. The growing availability of structured clinical datasets has enabled the use of intelligent systems for risk prediction and diagnostic support. This paper evaluates three supervised learning algorithms, Random Forest (RF), Support Vector Machine (SVM), and Decision Tree (DT), for heart disease prediction, using the Heart Failure Prediction dataset from the Kaggle platform. Each model was trained with 10-fold cross-validation, and its hyperparameters were then tuned by grid search. Model performance was measured with standard classification metrics: accuracy, sensitivity, specificity, and the area under the ROC curve (AUC). Random Forest performed best, with an AUC of 0.9517, a sensitivity of 81.18%, and a specificity of 90.44%. To facilitate clinical use, this model was integrated into a web tool built with the R Shiny framework. The interface lets users enter patient-level clinical data and obtain real-time predictions, along with visualizations of feature importance and risk probability. This implementation bridges the gap between algorithm development and practical application, offering a user-friendly decision support tool for early heart disease screening. The findings indicate that machine learning models, when properly tuned and validated, can serve as effective and interpretable tools in clinical decision-making. This work contributes to the advancement of e-health and the integration of AI-driven models into medical workflows.
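The evaluation metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' code (their work was done in R), and the labels, scores, 0.5 threshold, and function names below are hypothetical, chosen only to show how sensitivity, specificity, and AUC are defined for a binary classifier where 1 denotes heart disease.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels, 1 = disease."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: true labels and predicted risk probabilities.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.7, 0.3, 0.2, 0.1]

# Assumed decision threshold of 0.5 for turning probabilities into classes.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} AUC={auc:.3f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the abstract reports all three.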
