Machine Learning to Identify Monkey Pox Disease
DOI: 10.33395/sinkron.v8i3.12524
Keywords: Monkey Pox, Machine Learning, Kaggle, Classification, SVM
Abstract
In May 2022, WHO received reports of monkey pox cases from non-endemic countries. Monkey pox is a rare zoonotic disease caused by infection with the monkeypox virus, which belongs to the genus Orthopoxvirus of the family Poxviridae, a genus that also includes the variola virus. This study aims to classify patients who have contracted the monkey pox virus. We modeled an analysis of monkey pox disease and compared classifiers on a dataset from Kaggle consisting of a CSV file with records for 25,000 patients. The dataset was analyzed using the correlation coefficient and the counts of the target variable. Four machine learning (ML) algorithms were used for classification: K-Nearest Neighbor (KNN), Support Vector Machine (SVM), Random Forest (RF), and Gradient Boosting (GB). Gradient Boosting achieved the highest accuracy at 71%, followed by SVM at 69%, RF at 68%, and KNN at 63%. This ML approach is expected to support the analysis of monkey pox disease and thereby help countries and governments, especially the health sector, assess, identify, and take appropriate action against the disease.
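The two analysis steps named in the abstract, a correlation check between a feature and the target followed by classification scored by accuracy, can be sketched in plain Python. This is an illustrative sketch only, not the authors' code: the six training rows, the four test rows, and the feature names (`fever`, `rash`, `swollen_lymph_nodes`) are invented for the example and are not taken from the Kaggle dataset. Only KNN is shown, written from scratch with the standard library; the abstract does not state which implementation or parameters the study used.

```python
import math
from collections import Counter


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def knn_predict(train_X, train_y, query, k=3):
    """Majority vote among the k training rows closest to query (Euclidean)."""
    order = sorted(range(len(train_X)),
                   key=lambda i: math.dist(train_X[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy data: columns are [fever, rash, swollen_lymph_nodes],
# label 1 = positive, 0 = negative. Invented for illustration only.
train_X = [[1, 1, 1], [1, 1, 0], [0, 1, 1], [0, 0, 0], [1, 0, 0], [0, 0, 1]]
train_y = [1, 1, 1, 0, 0, 0]
test_X = [[1, 1, 1], [0, 0, 0], [0, 1, 0], [1, 0, 1]]
test_y = [1, 0, 1, 1]

# Step 1: correlation of each feature column with the target.
fever_r = pearson([row[0] for row in train_X], train_y)
rash_r = pearson([row[1] for row in train_X], train_y)

# Step 2: classify the held-out rows and score them.
predictions = [knn_predict(train_X, train_y, q, k=3) for q in test_X]
knn_accuracy = accuracy(test_y, predictions)

print("fever vs target:", round(fever_r, 3))
print("rash vs target:", round(rash_r, 3))
print("KNN predictions:", predictions)
print("KNN accuracy:", knn_accuracy)
```

On this toy data the classifier gets three of the four held-out rows right. A comparison like the one reported in the abstract would repeat step 2 with each of the four algorithms on the same train/test split and rank them by the resulting accuracy.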
License
Copyright (c) 2023 Febri Aldi, Irohito Nozomi, Rio Bayu Sentosa, Ahmad Junaidi
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.