Comparison of IndoBERT and SVM Performance in Sentiment Analysis of Digital Education Platforms
DOI: 10.33395/sinkron.v10i1.15472
Keywords: IndoBERT; Ruangguru; Sentiment Analysis; SVM; TF-IDF
Abstract
Sentiment analysis of user-generated reviews is essential for understanding the quality and effectiveness of digital education platforms. This study compares the performance of Support Vector Machine (SVM) and IndoBERT in classifying the sentiment of Ruangguru user reviews. The original dataset contains 111,838 reviews, from which a stratified sample of 10,000 entries was drawn for experimentation so that class proportions are preserved. Text preprocessing applied light normalization (case folding, light cleaning, and handling of URLs, users, hashtags, and repetition) without stopword removal, to preserve polarity cues. The automatically assigned labels were validated against 139 manually annotated samples (accuracy 0.763, Cohen’s κ 0.644), indicating reliable yet imperfect alignment. To ensure a fair, leakage-safe comparison, we use a fixed 20% test split for all models; of the remaining data, 10% is used for validation, and IndoBERT checkpoints are selected by validation macro-F1 (early stopping). The SVM baseline combines word- and character-level TF-IDF features with a class-balanced LinearSVC tuned by grid search, achieving accuracy 0.888 and macro-F1 0.543; it is strong on the positive class but weak on the neutral class. IndoBERT yields more balanced performance: the class-weighted variant attains the best macro-F1 of 0.601 (accuracy 0.857), while the baseline variant reaches the highest IndoBERT accuracy (0.867) with macro-F1 0.596. These results show that Transformer models provide a more balanced trade-off under severe class imbalance, whereas SVM remains a competitive accuracy-oriented baseline. In practice, platforms should prioritize macro-F1, use an optimized IndoBERT when minority opinions matter, and invest in expanded manual labeling and advanced imbalance handling to further improve neutral-class detection.
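The gap between the SVM's accuracy (0.888) and its macro-F1 (0.543) follows from how the two metrics weight classes: accuracy is dominated by the majority class, while macro-F1 averages the per-class F1 scores equally. The sketch below illustrates this with a small toy example. The labels, class sizes, and predictions are invented for illustration and are not the study's data; the helper names are likewise hypothetical.

```python
def per_class_f1(y_true, y_pred, label):
    """F1 score for one class, treating it as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    if tp == 0:
        # No correct predictions for this class: F1 is defined as 0 here.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over the classes present in y_true."""
    labels = sorted(set(y_true))
    return sum(per_class_f1(y_true, y_pred, lab) for lab in labels) / len(labels)


# Toy, heavily imbalanced data (assumed for illustration only):
# 80 positive, 15 negative, 5 neutral reviews.
y_true = ["pos"] * 80 + ["neg"] * 15 + ["neu"] * 5

# A classifier that gets every positive right, most negatives right,
# and never predicts the neutral class.
y_pred = ["pos"] * 80 + ["neg"] * 12 + ["pos"] * 3 + ["pos"] * 5

if __name__ == "__main__":
    print("accuracy:", round(accuracy(y_true, y_pred), 3))
    print("macro-F1:", round(macro_f1(y_true, y_pred), 3))
    for lab in sorted(set(y_true)):
        print(lab, "F1:", round(per_class_f1(y_true, y_pred, lab), 3))
```

In this toy setting the classifier is correct on 92 of 100 samples, yet the neutral class contributes an F1 of zero and pulls the macro average down sharply. That is the same pattern the abstract reports for the SVM baseline, and it is why macro-F1 is the more informative criterion when minority classes matter.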
License
Copyright (c) 2025 Aldina Bonaria Siva Br Sembiring, Robet, Leony Hoki

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

