Graph Regularized Probabilistic Latent Semantic Analysis for Topic Analysis Using Social Media Data

Authors

  • Muhammad Panji Muslim, Universitas Pembangunan Nasional Veteran Jakarta, Indonesia
  • Novi Trisman Hadi, Universitas Pembangunan Nasional Veteran Jakarta, Indonesia
  • Muhammad Adrezo, Universitas Pembangunan Nasional Veteran Jakarta, Indonesia

DOI:

10.33395/sinkron.v9i1.14348

Abstract

In today's digital era, social media data provides valuable insight into public opinion. This study applies Graph Regularized Probabilistic Latent Semantic Analysis (GPLSA) to analyze topics in social media data about the 2024 Indonesian Presidential Election (Pemilu), and evaluates the efficiency of the Probabilistic Latent Semantic Analysis (PLSA) algorithm. The research stages are collecting social media data on the presidential debates and the election, text pre-processing, and applying GPLSA to identify the main topics. PLSA without graph regularization achieved a topic coherence score of 0.653, indicating good consistency, while GPLSA scored lower at 0.5, so adding graph regularization did not improve coherence. PLSA without graph regularization also achieved a perplexity of 12.138, indicating good predictive capability, while the perplexity of GPLSA rose to 12.511 (lower is better), so graph regularization did not improve the prediction of new words. PLSA without graph regularization produced topics relevant to election issues; the topics produced by GPLSA were shaped by the graph regularization but showed no significant improvement in quality. Sentiment analysis of the social media posts provides insight into public responses to the debates and election issues, and validation of the GPLSA model checks that the topics it represents are relevant. This research contributes to the development of text analysis methods and offers useful information on elections and democratic participation. Stakeholders can use these results to make more strategic and better informed decisions.
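To make the two models and the perplexity metric in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It fits PLSA with EM on a made-up toy count matrix and computes training perplexity. The `neighbors` and `lam` parameters are hypothetical: they add a simple neighbor-averaging step on P(z|d) as a stand-in for graph regularization, which is an assumption for illustration and not the GPLSA algorithm evaluated in the paper.

```python
import math
import random


def normalize(row):
    """Scale a list of positive numbers so that it sums to 1."""
    total = sum(row)
    return [v / total for v in row]


def plsa(counts, n_topics, n_iter=50, neighbors=None, lam=0.0, seed=0):
    """Fit PLSA by EM on a document-term count matrix (list of lists).

    neighbors (dict: doc index -> list of doc indices) and lam are an
    illustrative smoothing step, NOT the paper's method: after each M-step,
    P(z|d) is mixed with the mean P(z|d') over the neighbors of d.
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])
    # Random initialisation of P(w|z) and P(z|d).
    p_w_z = [normalize([rng.random() + 1e-3 for _ in range(n_words)])
             for _ in range(n_topics)]
    p_z_d = [normalize([rng.random() + 1e-3 for _ in range(n_topics)])
             for _ in range(n_docs)]

    for _ in range(n_iter):
        # Tiny pseudo-counts keep every probability strictly positive.
        new_w_z = [[1e-12] * n_words for _ in range(n_topics)]
        new_z_d = [[1e-12] * n_topics for _ in range(n_docs)]
        for d in range(n_docs):
            for w in range(n_words):
                c = counts[d][w]
                if c == 0:
                    continue
                # E-step: posterior P(z|d,w) is proportional to P(w|z) P(z|d).
                post = normalize([p_w_z[z][w] * p_z_d[d][z]
                                  for z in range(n_topics)])
                # Accumulate expected counts for the M-step.
                for z in range(n_topics):
                    new_w_z[z][w] += c * post[z]
                    new_z_d[d][z] += c * post[z]
        # M-step: renormalise the expected counts.
        p_w_z = [normalize(r) for r in new_w_z]
        p_z_d = [normalize(r) for r in new_z_d]

        # Hypothetical graph smoothing of P(z|d); skipped when lam == 0.
        if neighbors and lam > 0:
            smoothed = []
            for d in range(n_docs):
                nbrs = neighbors.get(d, [])
                if not nbrs:
                    smoothed.append(p_z_d[d])
                    continue
                mean = [sum(p_z_d[n][z] for n in nbrs) / len(nbrs)
                        for z in range(n_topics)]
                smoothed.append(normalize(
                    [(1 - lam) * p_z_d[d][z] + lam * mean[z]
                     for z in range(n_topics)]))
            p_z_d = smoothed
    return p_w_z, p_z_d


def perplexity(counts, p_w_z, p_z_d):
    """exp of the negative mean log-likelihood per token; lower is better."""
    log_lik, n_tokens = 0.0, 0
    for d, row in enumerate(counts):
        for w, c in enumerate(row):
            if c:
                p = sum(p_w_z[z][w] * p_z_d[d][z] for z in range(len(p_w_z)))
                log_lik += c * math.log(p)
                n_tokens += c
    return math.exp(-log_lik / n_tokens)


# Toy corpus (made up): 4 documents over a 6-word vocabulary.
toy_counts = [[4, 3, 2, 0, 0, 0],
              [3, 4, 1, 0, 0, 0],
              [0, 0, 0, 3, 4, 2],
              [0, 0, 1, 4, 3, 3]]
toy_graph = {0: [1], 1: [0], 2: [3], 3: [2]}  # hypothetical document links

plain = plsa(toy_counts, n_topics=2)
smooth = plsa(toy_counts, n_topics=2, neighbors=toy_graph, lam=0.3)
print("PLSA perplexity:", perplexity(toy_counts, *plain))
print("smoothed PLSA perplexity:", perplexity(toy_counts, *smooth))
```

Because the smoothing step pulls each document's topic mixture away from its maximum-likelihood value, it can raise training perplexity. That is one plausible reading of the pattern reported in the abstract, though this toy example cannot confirm it for the paper's data.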

How to Cite

Muslim, M. P., Hadi, N. T., & Adrezo, M. (2025). Graph Regularized Probabilistic Latent Semantic Analysis for Topic Analysis Using Social Media Data. Sinkron: Jurnal dan Penelitian Teknik Informatika, 9(1), 329-337. https://doi.org/10.33395/sinkron.v9i1.14348