Comparison of IndoBERT and SVM Performance in Sentiment Analysis of Digital Education Platforms

Authors

  • Aldina Bonaria Siva Br Sembiring, STMIK TIME
  • Robet
  • Leony Hoki

DOI:

10.33395/sinkron.v10i1.15472

Keywords:

IndoBERT; Ruangguru; Sentiment Analysis; SVM; TF-IDF

Abstract

Sentiment analysis of user-generated reviews helps assess the quality and effectiveness of digital education platforms. This study compares Support Vector Machine (SVM) and IndoBERT on sentiment classification of Ruangguru user reviews. The original dataset contains 111,838 reviews, from which a stratified sample of 10,000 was drawn for experimentation so that class proportions are preserved. Preprocessing applied light normalization (case folding, light cleaning, and handling of URLs, user mentions, hashtags, and repetition) without stopword removal, in order to preserve polarity cues. The automatically assigned labels were validated against 139 manually annotated samples (accuracy 0.763, Cohen’s κ 0.644), indicating reliable yet imperfect agreement. For a fair, leakage-safe comparison, all models are evaluated on the same fixed 20% test split; 10% of the remaining data is held out for validation, and IndoBERT checkpoints are selected by validation macro-F1 with early stopping. The SVM baseline combines word- and character-level TF-IDF features with a class-balanced LinearSVC tuned by grid search; it achieves accuracy 0.888 and macro-F1 0.543, performing strongly on the positive class but poorly on the neutral class. IndoBERT is more balanced: the class-weighted variant attains the best macro-F1 (0.601, with accuracy 0.857), while the baseline IndoBERT reaches the highest IndoBERT accuracy (0.867) with macro-F1 0.596. These results indicate that Transformer models offer a more balanced trade-off under severe class imbalance, whereas SVM remains a competitive accuracy-oriented baseline. In practice, platforms should prioritize macro-F1 as the evaluation metric, use a tuned IndoBERT model when minority opinions matter, and invest in expanded manual labeling and more advanced imbalance handling to further improve neutral-class detection.
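The gap between accuracy and macro-F1 reported above can be illustrated with a minimal sketch. The toy labels below are invented for illustration and are not the study's data; the class names (positive, negative, neutral) and the helper functions accuracy, macro_f1, and cohen_kappa are assumptions introduced here. The sketch uses only the Python standard library and the textbook definitions of each metric.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of items whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so every class counts equally."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def cohen_kappa(a, b):
    """Agreement between two labelings, corrected for chance agreement."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[l] * count_b[l] for l in set(a) | set(b)) / (n * n)
    return (observed - expected) / (1 - expected) if expected != 1 else 1.0


# Hypothetical, heavily imbalanced toy set: 85 positive, 10 negative, 5 neutral.
y_true = ["positive"] * 85 + ["negative"] * 10 + ["neutral"] * 5
# Hypothetical classifier: perfect on positives, partly right on negatives,
# and never predicts the neutral class.
y_pred = (
    ["positive"] * 85
    + ["negative"] * 6 + ["positive"] * 4
    + ["positive"] * 5
)

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"macro-F1 = {macro_f1(y_true, y_pred):.3f}")
print(f"kappa    = {cohen_kappa(y_true, y_pred):.3f}")
```

On this toy set the classifier reaches accuracy 0.91 while macro-F1 is only about 0.57, because the neutral class contributes an F1 of zero. This is the same pattern, in exaggerated form, as a model that is strong on positives but limited on the neutral class.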

How to Cite

Br Sembiring, A. B. S., Robet, & Hoki, L. (2026). Comparison of IndoBERT and SVM Performance in Sentiment Analysis of Digital Education Platforms. Sinkron: Jurnal dan Penelitian Teknik Informatika, 10(1), 64–74. https://doi.org/10.33395/sinkron.v10i1.15472