Implementation Of K-Nearest Neighbor Algorithm With SMOTE For Hotel Reviews Sentiment Analysis

Authors

  • Firman Gazali Mahmud Sekolah Tinggi Teknologi Wastukancana https://orcid.org/0000-0001-5442-2151
  • Teguh Iman Hermanto Sekolah Tinggi Teknologi Wastukancana
  • Imam Maruf Nugroho Sekolah Tinggi Teknologi Wastukancana

DOI:

10.33395/sinkron.v8i2.12214

Keywords:

Hotel Review, K-Nearest Neighbor, Sentiment Analysis, SMOTE, Traveloka

Abstract

Indonesia has considerable potential for tourism development, as reflected in the 2,397,181 foreign tourist visits recorded by Badan Pusat Statistik from January to September 2022. This research focuses on the super-priority destination of Labuan Bajo, East Nusa Tenggara, where the government's development plan centers on hotel construction to meet the need for an additional 2,000 hotel rooms; available hotel rooms are therefore still limited. As for how travelers choose a hotel, a November 2021 survey by the Populix website found that 76% of 1,012 respondents book hotels online, the majority of them through the Traveloka website. However, relying on the reviews feature of the Traveloka website to choose a hotel still raises various problems, such as biased information and rating values that do not match the accompanying reviews. To help users understand what lies behind positive and negative ratings, in-depth research on satisfaction factors is needed to identify the positive and negative sentiments of hotel visitors. This study applies the k-nearest neighbor algorithm with SMOTE to reviews of the three most popular hotels in Labuan Bajo. Testing with k = 3 yields an accuracy of 87.71% to 93.47%, corresponding to a maximum error of 12.29%. The accuracy results are further supported by the AUC values, which fall in the good classification category.
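The combination of SMOTE and k-nearest neighbor with k = 3 can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not state how reviews were turned into feature vectors, so the 2-D vectors below are hypothetical stand-ins for review features, and the function names (smote, knn_predict, nearest) are chosen only for this illustration. The sketch oversamples the minority (negative) class by interpolating between neighboring minority samples, then classifies by majority vote among the three nearest training samples. AUC is not computed here.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(point, candidates, k):
    """Return up to k candidates closest to point, excluding point itself."""
    others = [c for c in candidates if c is not point]
    return sorted(others, key=lambda c: euclidean(point, c))[:k]


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority vectors.

    Each synthetic vector lies on the line segment between a randomly
    chosen minority sample and one of its k nearest minority neighbors.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbor = rng.choice(nearest(base, minority, k))
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


def knn_predict(train_x, train_y, query, k=3):
    """Label query by majority vote among its k nearest training samples."""
    ranked = sorted(zip(train_x, train_y),
                    key=lambda pair: euclidean(pair[0], query))
    votes = [label for _, label in ranked[:k]]
    return max(set(votes), key=votes.count)


# Hypothetical, imbalanced toy data: 6 positive vs. 3 negative reviews.
positive = [[1.0, 1.2], [0.8, 1.0], [1.1, 0.9],
            [0.9, 1.1], [1.2, 1.0], [1.0, 0.8]]
negative = [[4.0, 4.1], [3.8, 4.0], [4.2, 3.9]]

# Balance the classes by adding synthetic negative samples.
synthetic = smote(negative, n_new=len(positive) - len(negative))
train_x = positive + negative + synthetic
train_y = (["positive"] * len(positive)
           + ["negative"] * (len(negative) + len(synthetic)))

# Classify two held-out vectors with k = 3 and measure accuracy.
test_x = [[1.0, 1.0], [4.0, 4.0]]
test_y = ["positive", "negative"]
predictions = [knn_predict(train_x, train_y, q, k=3) for q in test_x]
accuracy = sum(p == t for p, t in zip(predictions, test_y)) / len(test_y)
print(predictions, accuracy)
```

In this sketch, oversampling is applied to the training data only, so the held-out vectors stay untouched by the synthetic samples. An odd k such as 3 also avoids tied votes when there are exactly two classes.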

How to Cite

Gazali Mahmud, F., Iman Hermanto, T., & Maruf Nugroho, I. (2023). Implementation Of K-Nearest Neighbor Algorithm With SMOTE For Hotel Reviews Sentiment Analysis. Sinkron : Jurnal Dan Penelitian Teknik Informatika, 7(2), 595-602. https://doi.org/10.33395/sinkron.v8i2.12214