Comparative Analysis of XGBoost, KNN, and SVM Algorithms for Heart Disease Prediction Using SMOTE-Tomek Balancing
DOI: 10.33395/sinkron.v10i1.15469
Keywords: Heart Disease; Machine Learning; K-Nearest Neighbors; Support Vector Machine; XGBoost
Abstract
Heart disease remains one of the leading causes of death worldwide, making early detection crucial for improving patient outcomes. This study evaluates and compares the performance of three machine learning algorithms in detecting heart disease using the 2015 BRFSS dataset, which includes responses from 253,680 individuals: Extreme Gradient Boosting (XGBoost), K-Nearest Neighbors (KNN), and Support Vector Machine (SVM). Preprocessing consisted of feature encoding and class imbalance handling using the Synthetic Minority Over-sampling Technique combined with Tomek Links (SMOTE-Tomek); hyperparameters were tuned with RandomizedSearchCV. The models were assessed on a hold-out validation set using accuracy, Receiver Operating Characteristic-Area Under the Curve (ROC-AUC), F1-score, precision, and recall. XGBoost achieved the highest performance, with an accuracy of 94%, a ROC-AUC score of 0.98, and an F1-score of 0.94. In comparison, KNN achieved an accuracy of 87% (ROC-AUC 0.95), while SVM attained an accuracy of 79% (ROC-AUC 0.86). These findings suggest that XGBoost is a robust model for large-scale heart disease classification and holds potential for implementation in clinical decision support systems.
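The SMOTE-Tomek balancing step named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' code: the function names (`smote`, `remove_tomek_links`, `smote_tomek`), the neighbor count `k=3`, the random seed, and the choice to oversample until both classes are the same size are all assumptions made for illustration. SMOTE creates synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbors; a Tomek link is a pair of opposite-class points that are each other's nearest neighbor, and here both members of each link are removed.

```python
import math
import random


def smote(minority, n_new, k=3, rng=None):
    """Create n_new synthetic points by interpolating between minority samples.

    Assumes at least two minority samples; k and the seed are illustrative.
    """
    if rng is None:
        rng = random.Random(0)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        # k nearest minority neighbors of sample i (excluding itself)
        neighbors = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(minority[i], minority[j]),
        )[:k]
        j = rng.choice(neighbors)
        gap = rng.random()  # position along the segment from i to j
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(minority[i], minority[j]))
        )
    return synthetic


def remove_tomek_links(X, y):
    """Drop both members of every Tomek link (mutual nearest neighbors
    that carry different labels)."""
    n = len(X)
    nearest = [
        min((j for j in range(n) if j != i), key=lambda j: math.dist(X[i], X[j]))
        for i in range(n)
    ]
    drop = {i for i in range(n) if nearest[nearest[i]] == i and y[i] != y[nearest[i]]}
    keep = [i for i in range(n) if i not in drop]
    return [X[i] for i in keep], [y[i] for i in keep]


def smote_tomek(X, y, minority_label=1, k=3, seed=0):
    """Oversample the minority class to parity, then clean Tomek links."""
    rng = random.Random(seed)
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_new = max(len(X) - 2 * len(minority), 0)  # majority count minus minority count
    synthetic = smote(minority, n_new, k, rng)
    X_balanced = list(X) + synthetic
    y_balanced = list(y) + [minority_label] * len(synthetic)
    return remove_tomek_links(X_balanced, y_balanced)
```

This brute-force version is quadratic in the number of samples and is only meant to show the mechanics; for a dataset of the size described here, a library implementation such as `SMOTETomek` from imbalanced-learn would be the practical choice. Resampling of this kind is normally fitted on the training split only, so that the hold-out set keeps its original class distribution; the abstract does not state how the split and the resampling were ordered.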
License
Copyright (c) 2026 Yuliana, Robet, Leony Hoki

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.