Classification of the Human Development Index in Indonesia Using the Bootstrap Aggregating Method
DOI: 10.33395/sinkron.v6i1.11173
Abstract
The success of human development in a region is measured by the Human Development Index (HDI). The HDI captures human development performance along three dimensions: a long and healthy life, knowledge, and a decent standard of living. HDI values are usually grouped into categories to make it easier to classify the HDI level of each region. This study aimed to determine how well the bootstrap aggregating (bagging) method classifies the HDI by district/city. Bagging is a machine learning ensemble approach that reduces the variance of a classifier by training it on bootstrap resamples of the data and aggregating the resulting predictions to obtain better accuracy. The dependent variable was the HDI by district/city in 2020; the independent variables were life expectancy at birth, expected years of schooling, mean years of schooling, and adjusted real expenditure per capita. Bagging was applied to HDI data grouped into high and low categories. The method made only eight classification errors, all of them observations that belong in the high category but were classified as low. With 25 bootstrap replications, the bagging method achieved an accuracy of 92.3%, a sensitivity of 100%, a specificity of 83.33%, and a balanced accuracy of 91.67%. On this basis, the bagging method is considered very good for classifying the HDI by district/city in Indonesia in 2020.
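The procedure described in the abstract (bootstrap resampling, one classifier per replicate, majority vote, then confusion-matrix metrics) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the base learner (a one-feature decision stump), the function names, and the synthetic toy data are all assumptions made for the example, and the abstract does not state which base classifier was used or which category was treated as the positive class. Balanced accuracy is computed as the mean of sensitivity and specificity, which agrees with the reported figures: (100% + 83.33%) / 2 is approximately 91.67%.

```python
import random
from collections import Counter


def fit_stump(X, y):
    """Fit a one-feature threshold rule (hypothetical base learner)."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for hi_label in (0, 1):
                pred = [hi_label if row[j] >= t else 1 - hi_label for row in X]
                err = sum(p != yi for p, yi in zip(pred, y))
                if best is None or err < best[0]:
                    best = (err, j, t, hi_label)
    _, j, t, hi_label = best
    return lambda row: hi_label if row[j] >= t else 1 - hi_label


def bagging_fit(X, y, n_replications=25, seed=0):
    """Train one stump per bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_replications):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return models


def bagging_predict(models, row):
    """Aggregate the replicate predictions by majority vote."""
    votes = Counter(model(row) for model in models)
    return votes.most_common(1)[0][0]


def confusion_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity, specificity and balanced accuracy.

    Assumes both classes occur in y_true, so no denominator is zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


# Synthetic toy data (NOT the study's HDI data): label 1 when feature 0 >= 10.
X = [[float(i), float((i * 7) % 10)] for i in range(20)]
y = [1 if i >= 10 else 0 for i in range(20)]

models = bagging_fit(X, y, n_replications=25, seed=0)
y_pred = [bagging_predict(models, row) for row in X]
print(confusion_metrics(y, y_pred))
```

An odd number of replications, such as the 25 used in the study, avoids tied votes in a two-class problem.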
License
Copyright (c) 2021 Noor Ell Goldameir, Anne Mudya Yolanda, Arisman Adnan, Lusi Febrianti

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.