Data Mining Model For Designing Diagnostic Applications Inflammatory Liver Disease


  • Omar Pahlevi, Universitas Bina Sarana Informatika, Indonesia
  • Amrin Amrin, Universitas Bina Sarana Informatika, Indonesia




C4.5, Naïve Bayes, k-Nearest Neighbor, confusion matrix, ROC curve


Hepatitis is an infectious disease and a public health problem that affects morbidity, mortality, public health status, life expectancy, and socio-economic conditions. Early diagnosis is very important so that the disease can be treated quickly. In this study, the authors apply three data mining classification methods, the C4.5 algorithm, Naïve Bayes, and k-Nearest Neighbor, to the diagnosis of hepatitis and compare which of the three is the most accurate. Measured with cross-validation, the confusion matrix, and the ROC curve, C4.5 is the best of the three models, with an accuracy of 70.99% and an area under the curve (AUC) of 0.950, followed by k-Nearest Neighbor with an accuracy of 67.19% and an AUC of 0.873, and then Naïve Bayes with an accuracy of 66.14% and an AUC of 0.742.
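
The comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes scikit-learn, uses synthetic placeholder data instead of the hepatitis dataset, approximates C4.5 with an entropy-based decision tree (scikit-learn implements CART, not C4.5), and picks 10 folds and k = 5 arbitrarily, since the abstract states neither. The numbers it prints are therefore unrelated to the results reported above.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, confusion_matrix, roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic binary data standing in for the hepatitis records.
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, random_state=0
)

# Assumption: stand-ins for the three methods named in the abstract.
models = {
    "decision tree (entropy, C4.5-like)": DecisionTreeClassifier(
        criterion="entropy", random_state=0
    ),
    "naive Bayes (Gaussian)": GaussianNB(),
    "k-NN (k=5)": KNeighborsClassifier(n_neighbors=5),
}

# Assumption: 10 stratified folds; the abstract does not give the fold count.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)


def evaluate(model, X, y, cv):
    """Return accuracy, AUC and confusion matrix from out-of-fold predictions."""
    # Each sample is scored by a model that never saw it during training.
    proba = cross_val_predict(model, X, y, cv=cv, method="predict_proba")
    pred = proba.argmax(axis=1)  # class labels are 0 and 1, matching column order
    return {
        "accuracy": accuracy_score(y, pred),
        "auc": roc_auc_score(y, proba[:, 1]),
        "confusion": confusion_matrix(y, pred),
    }


results = {name: evaluate(model, X, y, cv) for name, model in models.items()}

for name, r in results.items():
    print(f"{name}: accuracy={r['accuracy']:.4f}, AUC={r['auc']:.3f}")
    print(r["confusion"])
```

Accuracy and AUC can rank models differently, because accuracy depends on a single decision threshold while AUC summarizes ranking quality over all thresholds; reporting both alongside the confusion matrix, as the study does, makes such differences visible.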



Publication History:

Submitted Aug 23, 2020
Published Oct 6, 2020
Last Modified Oct 16, 2020

