Comparison of NB and SVM in Sentiment Analysis of Cyberbullying using Feature Selection

Authors

  • Selamet Riadi, Universitas AMIKOM Yogyakarta, Sleman
  • Ema Utami, Universitas AMIKOM Yogyakarta, Sleman
  • Ainul Yaqin, Universitas AMIKOM Yogyakarta, Sleman

DOI:

10.33395/sinkron.v8i4.12629

Keywords:

Naive Bayes; Support Vector Machine; Cyberbullying; feature selection; Chi-square.

Abstract

Over the past few decades, the internet has become an inseparable part of human life, offering easy access and reaching almost every aspect of daily activity. Social media is among the most widely used internet platforms worldwide, and its convenience and efficiency in supporting daily life have made it popular with a broad audience. Used well, it brings benefits; used poorly, it has negative consequences, one of which is the prevalence of cyberbullying. Cyberbullying has become a major concern for the public and for social media users, prompting researchers to develop technologies that can identify elements of cyberbullying, particularly on social media platforms. Sentiment analysis, which applies natural language processing and text analysis techniques to identify and extract subjective information from text, has been used for this purpose. This study compares the Naive Bayes and Support Vector Machine algorithms, using chi-square feature selection to improve the accuracy of both in classifying Instagram comments. The experimental results indicate that Multinomial Naive Bayes (MNB) outperforms the Support Vector Machine (SVM): MNB achieves an accuracy of 83.85% without feature selection and 90.77% with it, while SVM achieves 82.31% without feature selection and 90% with it. Evaluation using the confusion matrix and classification report shows that MNB has better precision and recall than SVM in identifying the bullying and non-bullying classes. Feature selection improves the performance of both algorithms in classifying Instagram comments related to cyberbullying.
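
To illustrate how chi-square feature selection ranks terms, the following is a minimal sketch in plain Python, not the authors' implementation. It scores each term with the textbook 2x2 document-frequency form of the chi-square statistic, N(AD - BC)^2 / ((A+C)(B+D)(A+B)(C+D)); library implementations may compute the statistic differently. The function names `chi_square_scores` and `top_k_terms`, the whitespace tokenizer, and the six toy comments are all hypothetical and are not taken from the study's Instagram dataset.

```python
def chi_square_scores(docs, labels, target):
    """Score every term by the chi-square statistic of its 2x2 term/class table.

    A = target-class docs containing the term, B = other docs containing it,
    C = target-class docs without it,          D = other docs without it.
    """
    n = len(docs)
    # Assumption: simple lowercase whitespace tokenization, presence/absence only.
    token_sets = [set(doc.lower().split()) for doc in docs]
    vocab = set().union(*token_sets)
    n_target = sum(1 for y in labels if y == target)

    scores = {}
    for term in vocab:
        a = sum(1 for toks, y in zip(token_sets, labels) if term in toks and y == target)
        b = sum(1 for toks, y in zip(token_sets, labels) if term in toks and y != target)
        c = n_target - a
        d = (n - n_target) - b
        denom = (a + c) * (b + d) * (a + b) * (c + d)
        # A zero denominator means the term (or class) is constant; score it 0.
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def top_k_terms(scores, k):
    """Return the k highest-scoring terms, breaking ties alphabetically."""
    return sorted(scores, key=lambda term: (-scores[term], term))[:k]


# Hypothetical toy comments, for illustration only.
docs = [
    "you are so ugly and stupid",
    "stupid loser go away",
    "what a lovely photo",
    "you look great today",
    "ugly stupid face",
    "great photo love it",
]
labels = ["bullying", "bullying", "non-bullying",
          "non-bullying", "bullying", "non-bullying"]

scores = chi_square_scores(docs, labels, target="bullying")
print(top_k_terms(scores, 1))
```

In this toy data, a term that appears in every bullying comment and in no other comment receives the highest score, while a term spread evenly across both classes scores zero. Keeping only the top-ranked terms before training a classifier is the general idea behind the feature selection step described in the abstract.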

How to Cite

Riadi, S., Utami, E., & Yaqin, A. (2023). Comparison of NB and SVM in Sentiment Analysis of Cyberbullying using Feature Selection. Sinkron : Jurnal Dan Penelitian Teknik Informatika, 7(4), 2414-2424. https://doi.org/10.33395/sinkron.v8i4.12629
