The News Classification Using Bidirectional Long Short Term Memory and GloVe

Elisabet Margaret Sirait; Raynaldo Silalahi; Annessa Aprilly Tambunan; Junita Amalia

doi:10.33395/sinkron.v9i1.13005

Authors

Elisabet Margaret Sirait, Information System, Institut Teknologi Del, Sitoluama, Laguboti, Sumatera Utara, Indonesia
Raynaldo Silalahi, Information System, Institut Teknologi Del, Sitoluama, Laguboti, Sumatera Utara, Indonesia
Annessa Aprilly Tambunan, Information System, Institut Teknologi Del, Sitoluama, Laguboti, Sumatera Utara, Indonesia
Junita Amalia, Information System, Institut Teknologi Del, Sitoluama, Laguboti, Sumatera Utara, Indonesia

Keywords:

Text classification, Word embedding, Hyperparameter tuning, GloVe, Bi-LSTM

Abstract

News and information spread through online media come not only from established news platforms but also from internet users, whose contributions are not subject to oversight. News is meant to be a fact-based account of current events. This research uses Bidirectional Long Short-Term Memory (Bi-LSTM) with GloVe word embeddings for news classification and aims to improve classification by tuning GloVe's hyperparameters. GloVe converts words into vectors, and the study examines how effective these vectors are for news classification when GloVe is tuned and the text is modeled with Bi-LSTM. Experiments covered an untuned approach and hyperparameter-tuned approaches, with GloVe's hyperparameters set by grid search and by manual methods. Tuning GloVe's hyperparameters shows potential for improving word vector representations. However, the untuned news classification yielded better evaluation results than the tuned approaches: the untuned experiment achieved an accuracy of 0.98, the grid search method yielded an accuracy of 0.85, and hyperparameter tuning generated a precision of 0.88 in the -11 model. These findings underscore the nuanced interplay of hyperparameters in optimizing text classification models that rely on GloVe embeddings.
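
To make the "words into vectors" step concrete, the following is a minimal sketch of how GloVe-style vectors might be loaded and arranged into an embedding matrix. It is not the authors' pipeline. The function names (`load_glove_vectors`, `build_embedding_matrix`), the toy three-dimensional vectors, and the small `word_index` are all assumptions made for illustration. The only detail taken as given is that GloVe vectors are commonly distributed as text, one token per line followed by its space-separated values.

```python
import io


def load_glove_vectors(lines):
    """Parse GloVe-style text lines: a token followed by space-separated floats."""
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(value) for value in parts[1:]]
    return vectors


def build_embedding_matrix(word_index, vectors, dim):
    """Return a list of rows where row i is the vector of the word with index i.

    Row 0 is reserved for padding, and words without a known vector
    keep an all-zero row.
    """
    matrix = [[0.0] * dim for _ in range(len(word_index) + 1)]
    for word, idx in word_index.items():
        vec = vectors.get(word)
        if vec is not None and len(vec) == dim:
            matrix[idx] = vec
    return matrix


# Toy stand-in for a GloVe file (hypothetical values, not real embeddings).
toy_glove = io.StringIO("news 0.1 0.2 0.3\nfake 0.4 0.5 0.6\n")

# Hypothetical vocabulary produced by a tokenizer; "hoax" has no toy vector.
word_index = {"news": 1, "fake": 2, "hoax": 3}

vectors = load_glove_vectors(toy_glove)
embedding_matrix = build_embedding_matrix(word_index, vectors, dim=3)
```

In a typical deep learning setup, a matrix like this would initialize the embedding layer that feeds the Bi-LSTM. Whether that layer is frozen or fine-tuned is a design choice the abstract does not specify.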
