Addressing Class Imbalance in Stunting Classification Using SMOTE Enhanced Random Forest
DOI:
10.33395/sinkron.v9i4.15349
Keywords:
Stunting, Nutritional Status, Random Forest, Imbalance Data, SMOTE
Abstract
Stunting is a chronic nutritional problem with serious long-term effects on children’s health, including impaired physical growth, delayed cognitive development, and reduced productivity in adulthood. Early and accurate detection of stunting is therefore essential to support effective public health interventions and targeted policy implementation. However, a central challenge in developing machine learning models for this purpose is class imbalance in health-related datasets. Such imbalance frequently leads to biased classifiers that perform well on majority classes but fail to identify minority categories, which reduces the overall reliability of the system. To address this issue, the present study used the Synthetic Minority Oversampling Technique (SMOTE) to balance the class distribution in a dataset containing 110,000 records. A Random Forest algorithm was then employed as the base classifier, with hyperparameter optimization carried out using the Optuna framework to support robustness and generalizability. The experimental results show that the combined application of SMOTE and Optuna significantly improved classification performance, producing the highest Macro Area Under the Curve (AUC) of 0.9972. This score indicates that the model distinguishes nutritional status categories well across both majority and minority classes. The study concludes that addressing data imbalance through oversampling is a fundamental methodological step in constructing fair and effective machine learning systems for stunting detection, ultimately contributing to improved health outcomes and evidence-based policy design.
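The oversampling step described in the abstract can be illustrated with a minimal pure-Python sketch of the SMOTE idea: each synthetic record is placed at a random point on the line segment between a minority-class record and one of its k nearest neighbours from the same class. This is not the authors' code. The function names `smote` and `balance`, the two toy features (age in months, height in cm), and the class labels are assumptions made for illustration only, and the Random Forest and Optuna steps are left out. In practice a library implementation would normally be used, and oversampling is usually applied to the training split only.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    point and one of its k nearest minority-class neighbours."""
    if n_new > 0 and len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest neighbours of the base point (itself excluded).
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment from base to other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


def balance(X, y, k=5, seed=0):
    """Oversample every class up to the size of the largest class."""
    by_class = {}
    for features, label in zip(X, y):
        by_class.setdefault(label, []).append(tuple(features))
    target = max(len(rows) for rows in by_class.values())
    X_out, y_out = [], []
    for label, rows in by_class.items():
        extra = smote(rows, target - len(rows), k=k, seed=seed)
        X_out.extend(rows + extra)
        y_out.extend([label] * target)
    return X_out, y_out


# Toy data (hypothetical): (age in months, height in cm) with an imbalanced label.
X = [(12, 74.0), (24, 86.5), (36, 95.0), (48, 102.0), (18, 80.0), (30, 90.5),
     (20, 72.0), (40, 88.0)]
y = ["normal"] * 6 + ["stunted"] * 2

X_bal, y_bal = balance(X, y, k=5)
print({label: y_bal.count(label) for label in sorted(set(y_bal))})
```

Because the synthetic records are interpolations, they always fall between existing minority records rather than duplicating them, which is the main difference from plain random oversampling.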
License
Copyright (c) 2025 Ronald Belferik, Frans Mikael Sinaga, Ferawaty, Mangasa A.S. Manullang, Tetti Sinaga

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.