Improving Multi-Class Public Complaint Classification with LSTM, Word2Vec, and Random Oversampling

Authors

  • Azza Nimasari, Universitas Dian Nuswantoro
  • Galuh Wilujeng Saraswati, Universitas Dian Nuswantoro
  • Erba Lutfina, Universitas Dian Nuswantoro

DOI:

10.33395/sinkron.v10i2.15975

Keywords:

LSTM, Public Complaint, Random Oversampling, Text Classification, Word2Vec

Abstract

Digital transformation in the public sector encourages local governments to improve service quality through online complaint management systems. However, the high volume of incoming complaints and the severe class imbalance across 31 Organisasi Perangkat Daerah (OPD, regional government agencies) make manual classification inefficient, often resulting in delays and misclassification. This study proposes an automated text classification model that combines Long Short-Term Memory (LSTM), Word2Vec, and Random Oversampling (ROS), trained with the Adam optimizer. The novelty of this research lies in integrating sequential modeling with imbalance handling to address an extreme multi-class classification problem: 31 OPD categories in a highly imbalanced dataset. The research stages are text preprocessing, word embedding construction with Word2Vec, data balancing with ROS, and model training with LSTM. Experimental results show that the proposed model achieves an accuracy of 0.72, with macro-average precision, recall, and F1-score of 0.67, 0.67, and 0.66, respectively. Given the difficulty of classifying 31 classes under severe imbalance, the macro F1-score of 0.66 indicates that the model is reasonably effective at capturing classification patterns, although performance is not yet evenly distributed across all classes. Overall, the combination of LSTM, Word2Vec, and ROS shows potential as a baseline approach for automating public complaint classification in complex multi-class scenarios. The proposed model has the potential to improve the accuracy and speed of complaint distribution to the appropriate OPD, and thereby the efficiency and responsiveness of public service delivery, compared with conventional manual methods.
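
The abstract names two ideas that are easy to illustrate in isolation: random oversampling, which duplicates minority-class samples until the classes are balanced, and macro-averaged F1, which weights every class equally and therefore falls below accuracy when small classes are predicted poorly. The sketch below is not the authors' implementation. It uses only the Python standard library, and the toy complaint texts, the agency labels, and the function names `random_oversample` and `macro_f1` are assumptions made for illustration. As a general practice, and not something the abstract states, oversampling is usually applied to the training split only so that duplicated samples do not leak into evaluation.

```python
import random
from collections import Counter, defaultdict


def random_oversample(texts, labels, seed=0):
    """Duplicate randomly chosen samples of each smaller class until every
    class has as many samples as the largest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for text, label in zip(texts, labels):
        by_class[label].append(text)
    target = max(len(items) for items in by_class.values())
    out_texts, out_labels = [], []
    for label, items in by_class.items():
        # Draw with replacement from the class's own samples.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for text in items + extra:
            out_texts.append(text)
            out_labels.append(label)
    return out_texts, out_labels


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over the classes present in y_true."""
    scores = []
    for c in sorted(set(y_true)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: 6, 2, and 1 complaints for three made-up agencies.
texts = [f"road complaint {i}" for i in range(6)]
texts += ["clinic queue too long", "no medicine in stock", "bus route cancelled"]
labels = ["public_works"] * 6 + ["health"] * 2 + ["transport"]

balanced_texts, balanced_labels = random_oversample(texts, labels)
print(Counter(balanced_labels))

# A classifier that always predicts the majority class looks acceptable on
# accuracy but poor on macro F1.
y_true = ["a", "a", "a", "a", "b"]
y_pred = ["a", "a", "a", "a", "a"]
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(accuracy, macro_f1(y_true, y_pred))
```

In the toy evaluation the majority-class predictor reaches an accuracy of 0.8 while its macro F1 is 4/9, which mirrors, on a much smaller scale, why the abstract reports macro-averaged scores alongside accuracy.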

How to Cite

Nimasari, A., Saraswati, G. W., & Lutfina, E. (2026). Improving Multi-Class Public Complaint Classification with LSTM, Word2Vec, and Random Oversampling. Sinkron : Jurnal Dan Penelitian Teknik Informatika, 10(2), 1094-1103. https://doi.org/10.33395/sinkron.v10i2.15975
