Analysis of Dimensional Reduction Effect on K-Nearest Neighbor Classification Method

Authors

  • Taufiqurrahman Taufiqurrahman, Universitas Sumatera Utara, Medan, Indonesia
  • Erna Budhiarti Nababan, Universitas Sumatera Utara
  • Syahril Efendi, Universitas Sumatera Utara

DOI:

10.33395/sinkron.v6i1.11234

Keywords:

Dimension Reduction, LDA, PCA, Confusion Matrix, K-NN

Abstract

Classification algorithms often perform poorly on high-dimensional data, which lowers classification accuracy. Dimension reduction is one way to make a classifier run faster and more effectively and to improve its accuracy. When data are classified with the K-Nearest Neighbor (K-NN) algorithm, some features may contribute little to distinguishing the classes, so dimension reduction is needed. This study applies two dimension reduction methods, Linear Discriminant Analysis (LDA) and Principal Component Analysis (PCA), classifies the reduced data with K-NN, and evaluates performance using a confusion matrix. The datasets are Arrhythmia, ISOLET, and CNAE-9, obtained from the UCI Machine Learning Repository. The results show that classifiers combined with LDA outperform those combined with PCA on datasets with more than 100 attributes. On the Arrhythmia dataset, dimension reduction improves K-NN performance at K=3 and K=5; the best performance comes from LDA+K-NN with K=3, which reaches an accuracy of 98.53%, and the lowest from K-NN without reduction at K=3. On the ISOLET dataset, the best performance is also obtained with LDA-reduced data, in this case with K-NN at K=5, while the lowest is PCA+K-NN with K=3. On the CNAE-9 dataset, the best performance is again achieved by LDA+K-NN, and the lowest by PCA+K-NN with K=3.
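
As a rough illustration of the classification and evaluation steps described in the abstract, the following Python sketch implements K-NN with Euclidean distance and scores it with a confusion matrix at K=3 and K=5. It is not the authors' implementation: the function names (knn_predict, confusion_matrix, accuracy) and the toy data are assumptions made for this example, and the LDA or PCA projection is omitted; in the study's pipeline the feature vectors would first be projected to fewer dimensions before being passed to the classifier.

```python
from collections import Counter
from math import dist


def knn_predict(train_X, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest training points."""
    # Sort training points by Euclidean distance to the query and keep the k closest.
    neighbors = sorted(zip(train_X, train_y), key=lambda pair: dist(pair[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


def confusion_matrix(y_true, y_pred, labels):
    """Return a matrix whose rows are true labels and whose columns are predicted labels."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Accuracy is the sum of the diagonal (correct predictions) over all predictions."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


# Toy data (hypothetical, not from the paper): two well-separated classes in 2-D.
train_X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_y = ["a", "a", "a", "b", "b", "b"]
test_X = [(1.1, 0.9), (5.1, 5.0), (0.8, 1.2), (4.9, 4.8)]
test_y = ["a", "b", "a", "b"]

# Compare the two neighborhood sizes used in the study.
for k in (3, 5):
    preds = [knn_predict(train_X, train_y, x, k=k) for x in test_X]
    cm = confusion_matrix(test_y, preds, labels=["a", "b"])
    print(f"K={k}", cm, accuracy(cm))
```

Comparing reduction methods, as the study does, would amount to running the same loop on the original features and on each projected version, then comparing the resulting accuracies.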

How to Cite

Taufiqurrahman, T., Nababan, E. B., & Efendi, S. (2021). Analysis of Dimensional Reduction Effect on K-Nearest Neighbor Classification Method. Sinkron: Jurnal dan Penelitian Teknik Informatika, 5(2B), 222–230. https://doi.org/10.33395/sinkron.v6i1.11234