Detect Fake Reviews Using Random Forest and Support Vector Machine

Authors

  • Zulpan Hadi Universitas AMIKOM Yogyakarta, Indonesia
  • Ema Utami Universitas AMIKOM Yogyakarta, Indonesia
  • Dhani Ariatmanto Universitas AMIKOM Yogyakarta, Indonesia

DOI:

10.33395/sinkron.v8i2.12090

Keywords:

Tokopedia, Fake Review, POS Tagging, Support Vector Machine, Random Forest.

Abstract

With the rapid development of e-commerce, which makes it possible to
buy and sell products and services online, customers increasingly rely on
online shops to meet their needs. After a purchase, customers write reviews
describing their personal experiences, feelings, and emotions. Product reviews
are a main source of information for customers deciding whether to buy a
product. However, reviews that customers should be able to trust can be
manipulated by sellers, who may post spam reviews to raise their own product
ratings or to bring down their competitors. This study therefore addresses the
detection of fake reviews among product reviews on Tokopedia. Detection is
based on part-of-speech (POS) tagging distribution features. Using these
features, 856 reviews were identified as fake and 4478 as genuine. Of the fake
reviews, 628 were written to promote the store owner's product sales or brand,
while 228 were aimed at discrediting competitors. Classification was then
carried out with the random forest and support vector machine algorithms,
using 80% of the dataset for training and 20% for testing. The support vector
machine achieved much higher accuracy than the random forest: 98% versus 60%.
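
As a rough illustration of the feature and split described in the abstract, the sketch below turns the POS tags of one review into a tag-frequency vector and divides a labelled dataset 80/20. It is a minimal sketch under stated assumptions, not the authors' implementation: the tag inventory `TAGS`, the helper names `pos_distribution` and `train_test_split_80_20`, the sample review, and the shuffling seed are all hypothetical, and the tags are assumed to come from an external POS tagger. Only the label counts (856 fake, 4478 genuine) are taken from the abstract.

```python
import random
from collections import Counter

# Assumed tag inventory; the abstract does not state which tagset was used.
TAGS = ["NOUN", "VERB", "ADJ", "ADV", "PRON"]


def pos_distribution(tags, tagset=TAGS):
    """Return, for each tag in `tagset`, its share of all tokens in one review."""
    counts = Counter(tags)
    total = len(tags)
    if total == 0:
        # An empty review yields an all-zero vector.
        return [0.0] * len(tagset)
    return [counts[t] / total for t in tagset]


def train_test_split_80_20(rows, seed=0):
    """Shuffle a copy of `rows` and split it 80% train / 20% test."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * 0.8)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical pre-tagged review (tags would come from an external POS tagger).
review_tags = ["NOUN", "ADJ", "ADJ", "VERB"]
features = pos_distribution(review_tags)

# Label counts reported in the abstract: 856 fake and 4478 genuine reviews.
labels = ["fake"] * 856 + ["genuine"] * 4478
train, test = train_test_split_80_20(labels)

print(features)
print(len(train), len(test))
```

Vectors of this kind could then be passed to any random forest or support vector machine implementation. With 5334 reviews in total, this particular rounding gives 4267 training and 1067 test examples.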




How to Cite

Hadi, Z., Utami, E., & Ariatmanto, D. (2023). Detect Fake Reviews Using Random Forest and Support Vector Machine. Sinkron: Jurnal Dan Penelitian Teknik Informatika, 7(2), 623–630. https://doi.org/10.33395/sinkron.v8i2.12090