Graph Regularized Probabilistic Latent Semantic Analysis for Topic Analysis Using Social Media Data


  • Muhammad Panji Muslim Universitas Pembangunan Nasional Veteran Jakarta
  • Novi Trisman Hadi Universitas Pembangunan Nasional Veteran Jakarta, Indonesia
  • Muhammad Adrezo Universitas Pembangunan Nasional Veteran Jakarta, Indonesia




Social media data provides valuable insight into public opinion. This study applies the Graph Regularized Probabilistic Latent Semantic Analysis (GPLSA) method to analyze topics in social media data surrounding the 2024 Indonesian Presidential Election (Pemilu), and evaluates the performance of the Probabilistic Latent Semantic Analysis (PLSA) algorithm. The research stages are: collecting social media data on the presidential debates and the election, text pre-processing, and applying the GPLSA method to identify the main topics. PLSA without graph regularization achieved a topic coherence score of 0.653, indicating good consistency, whereas the score fell to 0.5 under GPLSA, showing that adding graph regularization did not improve coherence. Similarly, PLSA without graph regularization achieved a perplexity of 12.138, indicating good predictive capability, whereas perplexity rose to 12.511 under GPLSA (lower is better), showing that graph regularization did not improve the prediction of new words. PLSA without graph regularization also produced topics relevant to election issues, while the topics generated by GPLSA were shaped by the graph regularization without a significant improvement in topic quality. Sentiment analysis of social media posts provides insight into public responses to the debates and election issues, and validation of the GPLSA model ensures that the topic representation is relevant. This research contributes to the development of text analysis methods and offers useful information on elections and democratic participation. Stakeholders can use these results to make more strategic and better-informed decisions.
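To make the reported perplexity figures concrete, the following is a minimal sketch of plain PLSA fitted by expectation-maximization, together with a perplexity computation. It is not the authors' implementation: the graph regularization term of GPLSA is omitted, the toy count matrix is invented for illustration, and the names `fit_plsa`, `perplexity`, and `normalize` are hypothetical. Perplexity is computed here on the training counts as exp(-log-likelihood / number of tokens), so lower values mean the model assigns higher probability to the observed words.

```python
import math
import random


def normalize(row):
    """Scale a list of positive numbers so that it sums to 1."""
    total = sum(row)
    return [x / total for x in row]


def fit_plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit plain PLSA (no graph term) by EM on a document-term count matrix.

    Returns P(z|d) as a docs x topics list and P(w|z) as a topics x words list.
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])
    # Random positive initialisation; every row is a probability distribution.
    p_z_d = [normalize([rng.random() + 1e-3 for _ in range(n_topics)])
             for _ in range(n_docs)]
    p_w_z = [normalize([rng.random() + 1e-3 for _ in range(n_words)])
             for _ in range(n_topics)]
    eps = 1e-12  # tiny smoothing so no probability becomes exactly zero
    for _ in range(n_iter):
        new_z_d = [[0.0] * n_topics for _ in range(n_docs)]
        new_w_z = [[0.0] * n_words for _ in range(n_topics)]
        for d in range(n_docs):
            for w in range(n_words):
                n = counts[d][w]
                if n == 0:
                    continue
                # E-step: P(z|d,w) is proportional to P(z|d) * P(w|z).
                joint = [p_z_d[d][z] * p_w_z[z][w] for z in range(n_topics)]
                total = sum(joint)
                for z in range(n_topics):
                    expected = n * joint[z] / total
                    new_z_d[d][z] += expected
                    new_w_z[z][w] += expected
        # M-step: renormalise the expected counts into distributions.
        p_z_d = [normalize([x + eps for x in row]) for row in new_z_d]
        p_w_z = [normalize([x + eps for x in row]) for row in new_w_z]
    return p_z_d, p_w_z


def perplexity(counts, p_z_d, p_w_z):
    """exp(-log-likelihood / token count) of the counts under the model."""
    log_lik, n_tokens = 0.0, 0
    for d, row in enumerate(counts):
        for w, n in enumerate(row):
            if n:
                p_w_d = sum(p_z_d[d][z] * p_w_z[z][w]
                            for z in range(len(p_w_z)))
                log_lik += n * math.log(p_w_d)
                n_tokens += n
    return math.exp(-log_lik / n_tokens)


# Invented toy data: 4 documents over a 4-word vocabulary.
counts = [[4, 3, 0, 0], [5, 2, 0, 1], [0, 0, 3, 4], [0, 1, 4, 3]]
p_z_d, p_w_z = fit_plsa(counts, n_topics=2)
print("perplexity:", perplexity(counts, p_z_d, p_w_z))
```

A graph-regularized variant would add a penalty to the M-step that pulls the topic distributions of neighbouring documents in the graph toward each other; how the study constructs that graph is not stated in the abstract, so it is left out of this sketch.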

