Comparison Of The C.45 And Naive Bayes Algorithms To Predict Diabetes

Authors

  • Alam, School of Business and Information Technology, STMIK LIKMI Bandung – Indonesia
  • Divi Adiffia Freza Alana, School of Business and Information Technology, STMIK LIKMI Bandung – Indonesia
  • Christina Juliane, School of Business and Information Technology, STMIK LIKMI Bandung – Indonesia

DOI:

10.33395/sinkron.v8i4.12998

Keywords:

C4.5 Decision Tree Algorithm, Data Mining, Diabetes, Naïve Bayes

Abstract

Diabetes mellitus is an urgent global health problem that affects people around the world. The disease is characterized by high levels of sugar (glucose) in the blood caused by disturbances in the body's production or use of the hormone insulin. This study aims to detect diabetes early and accurately, so that patients can be treated as soon as possible and the risk of death reduced, and to determine which of two algorithms achieves the better accuracy. The algorithms used are the C4.5 Decision Tree and Naïve Bayes algorithms. The experiments show that both the C4.5 Decision Tree and Naïve Bayes can be used to model the early detection of diabetes. The highest average accuracy, 90.835%, was obtained with the C4.5 Decision Tree algorithm; the Naïve Bayes algorithm obtained an average accuracy of 90.745%. After pruning was applied to the C4.5 Decision Tree, its accuracy increased to 91.30%. The resulting model yielded 18 patterns, or rules, for the early detection of diabetes. The choice of attributes, the number of attribute dimensions, and the number of samples greatly affect the performance of the model.
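For readers unfamiliar with the two algorithms named in the abstract, the following is a minimal sketch of the core idea behind each: C4.5 ranks candidate attributes by gain ratio, and Naïve Bayes picks the class with the highest product of prior and per-attribute likelihoods. It is not the authors' implementation. The toy records, the attribute names (glucose, bmi), and the function names are hypothetical, and the Laplace smoothing is an assumption made here for illustration.

```python
from collections import Counter
from math import log2

# Hypothetical toy records, NOT the study's data: (glucose, bmi, label)
ROWS = [
    ("high", "high", "pos"),
    ("high", "normal", "pos"),
    ("high", "high", "pos"),
    ("normal", "high", "neg"),
    ("normal", "normal", "neg"),
    ("normal", "normal", "neg"),
    ("high", "high", "neg"),
    ("normal", "normal", "pos"),
]


def entropy(values):
    """Shannon entropy (in bits) of a list of categorical values."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())


def gain_ratio(rows, attr_index):
    """C4.5 split criterion: information gain divided by split information."""
    labels = [r[-1] for r in rows]
    # Group the class labels by the value of the chosen attribute.
    groups = {}
    for r in rows:
        groups.setdefault(r[attr_index], []).append(r[-1])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    split_info = entropy([r[attr_index] for r in rows])
    return gain / split_info if split_info > 0 else 0.0


def naive_bayes_predict(rows, features):
    """Categorical Naive Bayes with Laplace (add-one) smoothing (an assumption)."""
    label_counts = Counter(r[-1] for r in rows)
    best_label, best_score = None, -1.0
    for label, count in label_counts.items():
        score = count / len(rows)  # prior P(label)
        for i, value in enumerate(features):
            matches = sum(1 for r in rows if r[-1] == label and r[i] == value)
            n_values = len({r[i] for r in rows})  # distinct values of attribute i
            score *= (matches + 1) / (count + n_values)  # smoothed P(value | label)
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

On this toy data, `gain_ratio(ROWS, 0)` exceeds `gain_ratio(ROWS, 1)`, so a C4.5-style tree would split on the first attribute, and `naive_bayes_predict(ROWS, ("high", "high"))` returns the positive class. A full comparison like the one reported would additionally need tree construction, pruning, and repeated train/test evaluation, none of which is shown here.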




How to Cite

Alam, A., Alana, D. A. F., & Juliane, C. (2023). Comparison Of The C.45 And Naive Bayes Algorithms To Predict Diabetes. Sinkron: Jurnal Dan Penelitian Teknik Informatika, 7(4), 2641–2650. https://doi.org/10.33395/sinkron.v8i4.12998