Combination of Lexical Resources and Support Vector Machine for Film Sentiment Analysis

Authors

  • Putri Agustina, Sains dan Teknologi, Ilmu Komputer, Universitas Islam Negeri Sumatera Utara, Medan, Indonesia
  • Raissa Amanda Putri, Sains dan Teknologi, Universitas Islam Negeri Sumatera Utara, Jl. Lap. Golf No. 120, Medan, Indonesia

DOI:

10.33395/sinkron.v8i3.13733

Keywords:

Text Mining; Support Vector Machine; Lexical Resources; Sentiment; Combination

Abstract

Text data generated by internet users holds potentially valuable information that can be mined for new insights. One strategy for extracting information from a text dataset is to classify each text into predetermined categories based on existing data. Text classification is a branch of Text Mining, and one popular approach uses the Support Vector Machine (SVM) algorithm to separate data into different classes. In some cases, however, an SVM classifier struggles to capture the context of a text because of unclear wording, varying sentence structures, or ambiguous interpretation. Combining SVM classification with lexical resources can be an effective way to address this problem. In this research framework, the first step is to obtain the data, here a film review dataset taken from kaggle.com. The data is then preprocessed, and the preprocessed result is split 80:20 into training and test sets. The 80% training portion is used to determine sentiment polarity, and the resulting lexicon-labeled training data is used to train the SVM model. Based on the confusion matrix, the overall accuracy of the model is around 85%. Precision, the proportion of predictions for a class that are correct, reached 88% for the positive class, 80% for the negative class, and 0% for the neutral class. These results show that the Lexicon+SVM model performs well overall, although the neutral class is not predicted correctly.
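
The labeling, splitting, and evaluation steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tiny word lists, the function names, and the toy labels are all hypothetical, and the SVM training step is left out so that the example runs with the Python standard library alone. It shows how a lexicon can assign a polarity label, how an 80:20 split is taken, and how accuracy and per-class precision follow from the confusion matrix counts.

```python
import re
from collections import Counter

# Hypothetical toy lexicon; the abstract does not name the lexical resource used.
POSITIVE_WORDS = {"good", "great", "excellent", "love"}
NEGATIVE_WORDS = {"bad", "boring", "terrible", "hate"}
LABELS = ["positive", "negative", "neutral"]


def lexicon_label(text):
    """Label a review by counting positive and negative lexicon hits."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = sum(t in POSITIVE_WORDS for t in tokens) - sum(
        t in NEGATIVE_WORDS for t in tokens
    )
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def split_80_20(items):
    """Return the first 80% of items as training data and the rest as test data."""
    cut = int(len(items) * 0.8)
    return items[:cut], items[cut:]


def accuracy(y_true, y_pred):
    """Share of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def precision_per_class(y_true, y_pred, labels=LABELS):
    """Per-class precision computed from (true, predicted) confusion counts."""
    confusion = Counter(zip(y_true, y_pred))
    result = {}
    for label in labels:
        predicted = sum(n for (_, p), n in confusion.items() if p == label)
        correct = confusion[(label, label)]
        # A class that is never predicted gets precision 0.0 by convention here.
        result[label] = correct / predicted if predicted else 0.0
    return result


if __name__ == "__main__":
    # Toy labels standing in for SVM predictions on the 20% test split.
    y_true = ["positive", "positive", "negative", "neutral"]
    y_pred = ["positive", "negative", "negative", "positive"]
    print(accuracy(y_true, y_pred))
    print(precision_per_class(y_true, y_pred))
```

In this toy run the neutral class is never predicted, so its precision is reported as 0.0. That is one way a 0% neutral precision can arise; the abstract does not say whether the model made no neutral predictions or made only incorrect ones.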

How to Cite

Agustina, P., & Putri, R. A. (2024). Combination of Lexical Resources and Support Vector Machine for Film Sentiment Analysis. Sinkron: Jurnal Dan Penelitian Teknik Informatika, 8(3), 1526–1538. https://doi.org/10.33395/sinkron.v8i3.13733