The News Classification Using Bidirectional Long Short Term Memory and GloVe

Authors

  • Elisabet Margaret Sirait, Information System, Institut Teknologi Del, Sitoluama, Laguboti, Sumatera Utara, Indonesia
  • Raynaldo Silalahi, Information System, Institut Teknologi Del, Sitoluama, Laguboti, Sumatera Utara, Indonesia
  • Annessa Aprilly Tambunan, Information System, Institut Teknologi Del, Sitoluama, Laguboti, Sumatera Utara, Indonesia
  • Junita Amalia, Information System, Institut Teknologi Del, Sitoluama, Laguboti, Sumatera Utara, Indonesia

DOI:

10.33395/sinkron.v9i1.13005

Keywords:

Text classification, Word embedding, Hyperparameter tuning, GloVe, Bi-LSTM

Abstract

Information and news now spread online not only through established news platforms but also through contributions from internet users, which lack oversight. News is a fact-based account of ongoing events. This research aims to optimize news classification by tuning the hyperparameters of GloVe and classifying the text with Bidirectional Long Short-Term Memory (Bi-LSTM). GloVe transformed words into vector matrices, which served as the input representation for the Bi-LSTM classifier. The experiments compared an untuned approach with hyperparameter-tuned approaches, in which GloVe's hyperparameters were tuned using grid search and manual methods. Tuning GloVe's hyperparameters shows potential for improving word vector representations. Surprisingly, however, news classification without hyperparameter tuning yielded better evaluation results than the tuned approaches. The untuned experiment achieved an accuracy of 0.98, the grid search method achieved an accuracy of 0.85, and hyperparameter tuning produced a precision of 0.88 in the -11 model. These findings underscore the nuanced effect of hyperparameters when optimizing text classification models built on GloVe.
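
The grid search mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not list which GloVe hyperparameters were tuned or over which values, so the names and ranges in `param_grid` are assumptions chosen for illustration, and `score_fn` is a hypothetical callback that would train GloVe, fit the Bi-LSTM classifier, and return a validation metric. The `glove_weight` function follows the weighting function of the original GloVe formulation, with its commonly cited defaults `x_max = 100` and `alpha = 0.75`.

```python
from itertools import product


def glove_weight(x, x_max=100.0, alpha=0.75):
    """GloVe co-occurrence weighting: (x / x_max) ** alpha below x_max, else 1."""
    return (x / x_max) ** alpha if x < x_max else 1.0


def grid_search(param_grid, score_fn):
    """Evaluate every combination in param_grid; return (best_params, best_score).

    score_fn is a hypothetical callback: it receives one configuration as a
    dict and returns a number where higher is better (e.g. validation accuracy).
    """
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:  # ties keep the first configuration seen
            best_params, best_score = params, score
    return best_params, best_score


# Illustrative search space only; these are not the values used in the paper.
param_grid = {
    "vector_size": [50, 100, 200],
    "window_size": [5, 10],
    "x_max": [10, 100],
    "alpha": [0.5, 0.75],
}
```

Calling `grid_search(param_grid, score_fn)` with a real training routine as `score_fn` would evaluate all 24 combinations. An exhaustive grid is simple to reason about, but its cost grows multiplicatively with each added hyperparameter, which is one reason manual tuning is often used alongside it.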



How to Cite

Sirait, E. M., Silalahi, R., Tambunan, A. A., & Amalia, J. (2024). The News Classification Using Bidirectional Long Short Term Memory and GloVe. Sinkron: Jurnal Dan Penelitian Teknik Informatika, 9(1), 112-124. https://doi.org/10.33395/sinkron.v9i1.13005