Analysis of TF-IDF and TF-RF Feature Extraction on Product Review Sentiment

Authors

  • Keisha Priya Harmandini, School of Computing, Telkom University, Indonesia
  • Kemas Muslim L, School of Computing, Telkom University, Indonesia

DOI:

10.33395/sinkron.v8i2.13376

Keywords:

sentiment analysis, shopee, svm, tf-idf, tf-rf

Abstract

Sentiment analysis of product reviews is critical for understanding customer views and satisfaction, especially in e-commerce applications. A marketplace provides channels where users can submit reviews of the products they purchase. However, because of the large number of reviews in a marketplace, analyzing them manually is no longer feasible. This research proposes a machine learning implementation to perform sentiment analysis on product reviews. A product review dataset from the Shopee marketplace is used to compare TF-IDF and TF-RF feature extraction with the SVM algorithm, through the stages of dataset preparation, labeling, feature extraction, and accuracy evaluation. The comparison matters because it determines which feature extraction method is more effective in increasing the accuracy of sentiment analysis. TF-IDF and TF-RF are both commonly used in text analysis, and comparing their performance gives insight into the effectiveness of each for product sentiment analysis. Through this comparison, the research aims to identify the approach that yields the highest accuracy, so that the results can guide further research. In the evaluation, the highest accuracy, 92.87%, was achieved with TF-IDF and the SVM classifier, which outperformed previous research.
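The two weighting schemes compared in the abstract can be illustrated with a minimal Python sketch. The abstract does not state the formulas used, so this assumes the common textbook definitions: idf(t) = log(N / df(t)), and the relevance frequency rf(t) = log2(2 + a / max(1, c)), where a and c count the positive-class and other-class documents containing term t. The function names and the four toy reviews are hypothetical and not taken from the paper's Shopee dataset, and the SVM classification step is omitted.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; real preprocessing is assumed to be richer.
    return text.lower().split()


def idf_weights(docs):
    # Unsupervised weight: idf(t) = log(N / df(t)), ignoring class labels.
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {t: math.log(n / df[t]) for t in df}


def rf_weights(docs, labels, positive=1):
    # Supervised weight: rf(t) = log2(2 + a / max(1, c)), where
    # a = positive-class documents containing t,
    # c = other-class documents containing t.
    a, c = Counter(), Counter()
    for doc, label in zip(docs, labels):
        target = a if label == positive else c
        for term in set(doc):
            target[term] += 1
    return {t: math.log2(2 + a[t] / max(1, c[t])) for t in set(a) | set(c)}


def weight_document(doc, weights):
    # Multiply raw term frequency by the chosen global weight (IDF or RF).
    tf = Counter(doc)
    return {t: tf[t] * weights.get(t, 0.0) for t in tf}


# Hypothetical toy reviews (1 = positive, 0 = negative).
reviews = [
    "barang bagus pengiriman cepat",
    "barang bagus sekali",
    "barang rusak pengiriman lambat",
    "barang jelek",
]
labels = [1, 1, 0, 0]
docs = [tokenize(r) for r in reviews]

idf = idf_weights(docs)
rf = rf_weights(docs, labels)
tfidf_vector = weight_document(docs[0], idf)
tfrf_vector = weight_document(docs[0], rf)
```

In this toy data, a term that appears in every review ("barang") gets an IDF of zero, while RF weights a term by how strongly it concentrates in the positive class. Either set of weighted vectors could then be passed to a classifier such as an SVM.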

How to Cite

Harmandini, K. P., & L, K. M. (2024). Analysis of TF-IDF and TF-RF Feature Extraction on Product Review Sentiment. Sinkron : Jurnal Dan Penelitian Teknik Informatika, 8(2), 929-937. https://doi.org/10.33395/sinkron.v8i2.13376