The News Classification Using Bidirectional Long Short Term Memory and GloVe
DOI: 10.33395/sinkron.v9i1.13005
Keywords: Classification text, Word embedding, Hyperparameter Tuning, GloVe, Bi-LSTM
Abstract
Information and news spread through online media come not only from established news platforms but also from internet users, whose contributions lack oversight. News is fact-based reporting on current events. This research applies Bidirectional Long Short-Term Memory (Bi-LSTM) with hyperparameter tuning of GloVe to news classification, with the aim of optimizing news categorization. GloVe converts words into vector representations, and the study examines how effective these are for news classification when GloVe is tuned and Bi-LSTM is used for text analysis. Experiments covered both untuned and hyperparameter-tuned approaches, with GloVe's hyperparameters tuned by grid search and by manual methods. Tuning GloVe's hyperparameters shows potential for improving word vector representations. Surprisingly, the untuned news classification yielded better evaluation results than the tuned approaches. The untuned experiment achieved an accuracy of 0.98, the grid search method reached an accuracy of 0.85, and hyperparameter tuning produced a precision of 0.88 in the -11 model. These findings underscore the nuanced role of hyperparameters in optimizing text classification models built on GloVe embeddings.
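The two mechanical steps the abstract mentions, turning words into a vector matrix with GloVe and searching a hyperparameter grid, might be sketched as follows. This is a minimal illustration and not the authors' implementation. The function names, the toy vectors, the `word_index` mapping, and the `evaluate` callback are all hypothetical, and the abstract does not state which GloVe hyperparameters were tuned.

```python
import io
import itertools


def load_glove(lines):
    """Parse GloVe text format: a token followed by its space-separated vector."""
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def build_embedding_matrix(word_index, vectors, dim):
    """Row i holds the vector of the word with index i; row 0 is left for padding.

    Words missing from the GloVe vocabulary keep an all-zero row.
    """
    matrix = [[0.0] * dim for _ in range(len(word_index) + 1)]
    for word, idx in word_index.items():
        vec = vectors.get(word)
        if vec is not None and len(vec) == dim:
            matrix[idx] = vec
    return matrix


def grid_search(grid, evaluate):
    """Score every combination in `grid` with `evaluate`; return the best one."""
    names = sorted(grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(grid[n] for n in names)):
        params = dict(zip(names, values))
        score = evaluate(params)  # assumed to return a number, higher is better
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Toy three-dimensional vectors standing in for a real GloVe vector file.
toy_glove = io.StringIO("news 0.1 0.2 0.3\nmarket 0.4 0.5 0.6\n")
vectors = load_glove(toy_glove)

# Hypothetical tokenizer output; "hoax" is absent from the toy vectors.
word_index = {"news": 1, "market": 2, "hoax": 3}
embedding_matrix = build_embedding_matrix(word_index, vectors, dim=3)
```

In a full pipeline, a matrix like `embedding_matrix` would typically initialize the embedding layer that feeds the Bi-LSTM, and `evaluate` would train and score a model for each hyperparameter combination. Both of those steps are left out here because the abstract does not describe them.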
License
Copyright (c) 2023 Annessa Aprilly Tambunan, Raynaldo Silalahi, Elisabet Margaret Sirait, Junita Amalia
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.