Deep Learning and Imbalance Handling on Movie Review Sentiment Analysis

Authors

  • Sri Utami, School of Computing, Study Program of Informatics, Telkom University, Bandung, Indonesia
  • Kemas Muslim Lhaksmana, School of Computing, Study Program of Informatics, Telkom University, Bandung, Indonesia
  • Yuliant Sibaroni, School of Computing, Study Program of Informatics, Telkom University, Bandung, Indonesia

DOI:

10.33395/sinkron.v8i3.12770

Keywords:

CNN, LSTM, movie review, sentiment analysis, SMOTEN

Abstract

Before watching a movie, people often read reviews written by critics or regular audiences to gauge the movie's quality and to discover recommended films. Analyzing movie reviews is challenging for two reasons. First, a popular movie can receive hundreds of reviews, each several paragraphs long, so reading them all takes considerable time and effort. Second, different reviews may express conflicting opinions about the same movie, which makes it difficult to draw a definitive conclusion. To address these challenges, sentiment analysis was performed on movie reviews using CNN and LSTM models, both known for their effectiveness in classifying text across various datasets. TF-IDF, Word2Vec, and data balancing with SMOTEN were also applied to improve the analysis. The CNN achieved a sentiment analysis accuracy of 98.56%, and the LSTM a close 98.53%. In terms of F1-score, the CNN obtained 77.87% and the LSTM 78.92%. These results indicate that the approach is effective at extracting useful insights from movie reviews and can help people make informed decisions about which movies to watch.
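
The abstract reports accuracy near 98.5% alongside F1-scores near 78%. A gap of this kind is common on imbalanced data, because accuracy is dominated by the majority class while F1 depends on precision and recall for the class being scored. The sketch below illustrates the effect with made-up counts; it does not use the paper's data, and the helper `binary_metrics` and all label counts are hypothetical. The abstract does not state how the F1-score was averaged, so the example simply scores the minority class.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall, f1) with `positive` as the scored class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return accuracy, precision, recall, f1


# Hypothetical, heavily imbalanced test set: 20 minority reviews (label 1)
# and 980 majority reviews (label 0). These counts are illustrative only.
y_true = [1] * 20 + [0] * 980

# Hypothetical predictions: 12 of the 20 minority reviews are found,
# 8 are missed, and 4 majority reviews are wrongly flagged.
y_pred = [1] * 12 + [0] * 8 + [1] * 4 + [0] * 976

accuracy, precision, recall, f1 = binary_metrics(y_true, y_pred)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
```

With these illustrative counts the accuracy is 0.988 while the minority-class F1 is about 0.667, which is why both metrics are worth reporting when classes are imbalanced.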




How to Cite

Utami, S., Lhaksmana, K. M., & Sibaroni, Y. (2023). Deep Learning and Imbalance Handling on Movie Review Sentiment Analysis. Sinkron : Jurnal Dan Penelitian Teknik Informatika, 7(3), 1894-1907. https://doi.org/10.33395/sinkron.v8i3.12770
