Machine Learning to Identify Monkey Pox Disease


  • Febri Aldi Universitas Putra Indonesia YPTK Padang, Indonesia
  • Irohito Nozomi Universitas Putra Indonesia YPTK Padang, Indonesia
  • Rio Bayu Sentosa Universitas Putra Indonesia YPTK Padang, Indonesia
  • Ahmad Junaidi Universitas Putra Indonesia YPTK Padang, Indonesia




Monkey Pox, Machine Learning, Kaggle, Classification, SVM


In May 2022, the WHO received reports of monkey pox cases from non-endemic countries. Monkey pox is a rare zoonotic disease caused by infection with the monkeypox virus, which belongs to the genus Orthopoxvirus in the family Poxviridae, the same genus as the variola virus. This study aims to classify patients who have contracted the monkey pox virus. We modeled an analysis of monkey pox disease and compared classifiers on a dataset from Kaggle consisting of a CSV file with records for 25,000 patients. The monkey pox dataset was analyzed using the correlation coefficient and the number of target variables. Machine learning (ML) methods were used for classification, namely the K-Nearest Neighbor (KNN), Support Vector Machine (SVM), Random Forest (RF), and Gradient Boosting (GB) algorithms. Gradient Boosting (GB) achieved the highest accuracy at 71%, followed by Support Vector Machine (SVM) at 69%, Random Forest (RF) at 68%, and K-Nearest Neighbor (KNN) at 63%. This ML approach is expected to support the analysis of monkey pox disease, helping governments, and the health sector in particular, to assess and identify the disease and take appropriate action against it.
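The abstract names a correlation analysis and an accuracy comparison but gives no implementation details. As a rough illustration only, the sketch below computes a Pearson correlation between each feature and the target, then scores a simple K-Nearest Neighbor classifier by accuracy on a held-out split. Everything in it is an assumption made for illustration: the synthetic data, the three binary symptom flags, the 80/20 split, k = 5, and all function names. It does not reproduce the Kaggle dataset or the study's results.

```python
import math
import random
from collections import Counter


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def knn_predict(train_X, train_y, row, k=5):
    """Majority vote among the k training rows closest to `row` (Euclidean)."""
    nearest = sorted(
        zip(train_X, train_y), key=lambda pair: math.dist(pair[0], row)
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def make_synthetic(n=300, seed=0):
    """Hypothetical stand-in data: three binary symptom flags, noisy label."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        row = [rng.randint(0, 1) for _ in range(3)]
        # Label depends on the first two flags, with 20% label noise.
        label = int(row[0] + row[1] >= 1)
        if rng.random() < 0.2:
            label = 1 - label
        X.append(row)
        y.append(label)
    return X, y


X, y = make_synthetic()
split = int(0.8 * len(X))  # assumed 80/20 train/test split
X_train, y_train = X[:split], y[:split]
X_test, y_test = X[split:], y[split:]

correlations = [pearson([row[j] for row in X], y) for j in range(3)]
predictions = [knn_predict(X_train, y_train, row) for row in X_test]
knn_accuracy = accuracy(y_test, predictions)

print("feature/target correlations:", [round(c, 2) for c in correlations])
print("KNN accuracy:", round(knn_accuracy, 2))
```

The same `accuracy` function could score any of the four classifiers on one shared test split, which is what makes percentages such as those reported in the abstract comparable with each other.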









How to Cite

Aldi, F., Nozomi, I., Sentosa, R. B., & Junaidi, A. (2023). Machine Learning to Identify Monkey Pox Disease. Sinkron: Jurnal Dan Penelitian Teknik Informatika, 8(3), 1335-1347.