A Comparative Study of Alternative Automatic Labeling Using AI Assistant

Authors

  • Indri Tri Julianto, Institut Teknologi Garut
  • Dede Kurniadi, Department of Computer Science, Institut Teknologi Garut, Indonesia
  • Benedicto B. Balilo Jr, CS/IT, Bicol University, Legazpi City, Philippines
  • Fauza Rohman, Department of Computer Science, Institut Teknologi Garut, Indonesia

DOI: 10.33395/sinkron.v8i4.13950

Abstract

AI assistants have become increasingly sophisticated and are now widely used to help people with a variety of tasks. One promising application is sentiment analysis, where an assistant can automate the labeling of text data. Traditionally, this labeling has been done manually by humans or with tools such as the VADER lexicon. This study evaluates how well AI assistants perform sentiment labeling compared with human labeling and with the VADER sentiment analysis algorithm. The labeling results of Gemini and You AI are compared with those of human labeling and VADER. Performance on each set of labels is evaluated with the Naive Bayes and k-Nearest Neighbour (k-NN) algorithms under K-Fold Cross Validation. The results indicate that both AI assistants come close to the performance of human labeling. Gemini achieves its best accuracy, 53.71%, with the k-NN algorithm, while You AI achieves its best accuracy, 48.30%, with the Naive Bayes algorithm. For comparison, human labeling reaches 61.12% with k-NN and VADER reaches 54.29% with Naive Bayes. This suggests that AI assistants can serve as an alternative for text data labeling, as the differences in performance are not statistically significant.
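
The evaluation described in the abstract (train a classifier on each set of labels and estimate accuracy with K-Fold Cross Validation) can be illustrated with a minimal pure-Python sketch. This page does not state the tooling, preprocessing, or fold count the authors used, so everything below is an assumption for illustration: the six toy sentences, the `human` and `assistant` label lists, the whitespace tokenizer, and the choice of k = 3 are all hypothetical, and only a Naive Bayes classifier is shown (k-NN is omitted for brevity).

```python
import math
from collections import Counter, defaultdict


def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def train_nb(texts, labels):
    """Fit a multinomial Naive Bayes model on whitespace tokens."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(texts, labels):
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict_nb(model, text):
    """Return the class with the highest log-posterior (Laplace smoothing)."""
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for token in text.lower().split():
            if token in vocab:  # tokens unseen in training are skipped
                score += math.log((word_counts[label][token] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def cross_val_accuracy(texts, labels, k=5):
    """Mean accuracy over k folds; each fold is held out exactly once."""
    accuracies = []
    for held_out in k_fold_indices(len(texts), k):
        held = set(held_out)
        train_x = [t for i, t in enumerate(texts) if i not in held]
        train_y = [y for i, y in enumerate(labels) if i not in held]
        model = train_nb(train_x, train_y)
        hits = sum(predict_nb(model, texts[i]) == labels[i] for i in held_out)
        accuracies.append(hits / len(held_out))
    return sum(accuracies) / len(accuracies)


# Hypothetical toy data: the same texts labeled by two different sources.
texts = [
    "profit rose strongly", "loss widened sharply",
    "profit grew again", "loss deepened further",
    "strong profit growth", "sharp loss reported",
]
label_sets = {
    "human": ["positive", "negative"] * 3,
    "assistant": ["positive", "negative", "positive",
                  "negative", "positive", "positive"],
}

for source, labels in label_sets.items():
    accuracy = cross_val_accuracy(texts, labels, k=3)
    print(f"{source}: mean accuracy {accuracy:.2f}")
```

Each label source gets its own cross-validated accuracy, which is the kind of figure the abstract compares across Gemini, You AI, human labeling, and VADER. The toy numbers printed here say nothing about the study's actual results.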

How to Cite

Julianto, I. T., Kurniadi, D., Balilo Jr, B. B., & Rohman, F. (2024). A Comparative Study of Alternative Automatic Labeling Using AI Assistant. Sinkron: Jurnal Dan Penelitian Teknik Informatika, 8(4), 2125–2133. https://doi.org/10.33395/sinkron.v8i4.13950