Comparative Analysis of Homogeneous and Heterogeneous Ensembles for Diabetes Classification Optimization
DOI: 10.33395/sinkron.v9i1.14439
Keywords: Diabetes, Boosting, Bagging, Stacking, Blending
Abstract
Diabetes mellitus is a chronic disease whose prevalence is increasing worldwide, including in Indonesia, where it reached 11.7% by 2023. Early prediction of the disease is essential for more effective management. This study develops a diabetes mellitus prediction model using ensemble learning, comparing homogeneous techniques (boosting and bagging) with heterogeneous techniques (stacking and blending). Boosting, implemented as AdaBoost with Random Forest as the base estimator, achieved the highest accuracy of 98%, with balanced precision and recall. Bagging, which also used Random Forest as the base estimator, achieved a slightly lower accuracy of 97%. Stacking, which combined XGBoost, Gradient Boosting, and Random Forest as base learners with Random Forest as the meta-model, matched boosting at 98% accuracy but with lower prediction error, demonstrating its ability to cope with more complex data. Blending, which used a similar approach but with training on the entire dataset, also reached 98% accuracy, with shorter processing time and more efficient memory usage than stacking.
License
Copyright (c) 2025 Muhammad Naufal Maulana, Muljono, Eka Putra Agus Meindiawan

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.