Comparison of Feature Extraction Methods on Sentiment Analysis in Hotel Reviews

Authors

A. S. Dharma, Y. G. R. Saragih

DOI: 10.33395/sinkron.v7i4.11706

Abstract

Technological development means that activities that once required meeting in person or visiting a place can now be done by viewing information on gadgets or websites. Today, information about a place that offers accommodation for a vacation or a business visit, for example a hotel, can be found on social media by reading reviews from people who have stayed there. Reviews written by hotel visitors are seen as more credible than information from advertisements, but so many reviews circulate on social media that analyzing them takes considerable time. This study analyzes hotel reviews using sentiment analysis with a Support Vector Machine (SVM) approach. Sentiment analysis can be used to analyze the opinions of a large number of hotel visitors, and it usually focuses on whether opinions are positive, negative, or neutral. Before the reviews are classified with the SVM algorithm, three feature extraction methods are used to obtain a weight for each word: Bag of Words, TF-IDF, and improved TF-IDF. These three methods were selected to examine the influence of the same word feature being present in each review. In this comparison, TF-IDF was found to be the best feature extraction method, with 71.75% accuracy, 78.66% precision, 71.91% recall, and 70.08% F1-score. The results indicate that the word features present in the hotel review data influence the analysis.
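
The difference between the first two weighting schemes named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the three toy reviews are hypothetical, the function names `bag_of_words` and `tf_idf` are illustrative, and the TF-IDF variant shown (relative term frequency times log(N / df)) is one common textbook form that may differ from the one used in the study. The improved TF-IDF method is not defined in the abstract, so it is not reproduced here.

```python
import math
from collections import Counter

# Hypothetical toy reviews (assumption: not taken from the study's dataset).
reviews = [
    "clean room friendly staff",
    "dirty room rude staff",
    "friendly staff great location",
]


def bag_of_words(docs):
    """Return the sorted vocabulary and one raw term-count vector per document."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({term for tokens in tokenized for term in tokens})
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tf_idf(docs):
    """Return the vocabulary and TF-IDF vectors, using tf * log(N / df)."""
    vocab, count_vectors = bag_of_words(docs)
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = [sum(1 for vec in count_vectors if vec[j] > 0) for j in range(len(vocab))]
    vectors = []
    for vec in count_vectors:
        total = sum(vec)  # document length in tokens
        vectors.append([
            (vec[j] / total) * math.log(n_docs / df[j]) if total else 0.0
            for j in range(len(vocab))
        ])
    return vocab, vectors


vocab, bow = bag_of_words(reviews)
_, weights = tf_idf(reviews)

# Compare the two weightings for the first review.
for term, count, weight in zip(vocab, bow[0], weights[0]):
    print(f"{term:10s} count={count} tfidf={weight:.3f}")
```

In this toy corpus the word "staff" occurs in every review, so Bag of Words gives it a count of 1 while this TF-IDF variant gives it a weight of zero. That is the kind of effect of a shared word feature that the comparison in the abstract is concerned with. In a full pipeline, vectors like these would be passed to an SVM classifier, which is omitted here.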

How to Cite

Dharma, A. S., & Saragih, Y. G. R. (2022). Comparison of Feature Extraction Methods on Sentiment Analysis in Hotel Reviews. Sinkron: Jurnal Dan Penelitian Teknik Informatika, 7(4), 2349-2354. https://doi.org/10.33395/sinkron.v7i4.11706