Stock Price Correlation Analysis with Twitter Sentiment Analysis Using The CNN-LSTM Method
DOI:
10.33395/sinkron.v8i4.12855
Keywords:
BBCA, CNN-LSTM, Sentiment Analysis, Stocks, Twitter
Abstract
Stock prices reflect a company's intrinsic value and respond to many factors, including economic conditions, corporate performance, and market sentiment; how these interact is an important research area. Our study uses sentiment analysis to extract public opinion from large volumes of text, applying Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) models. We focus on Bank Central Asia (BBCA), a prominent Indonesian banking institution, and aim to forecast stock price fluctuations by analyzing sentiment trends extracted from social media, especially Twitter. Experiments covering data segmentation, feature extraction, augmentation, and model refinement improve prediction performance: the CNN-LSTM model's accuracy rises from 73.41% to 77.75%, and its F1-score rises from 73.00% to 75.42%. Spearman correlation coefficients also show strong correlations between sentiment predictions and actual stock price movements: 0.745 for positive sentiment and 0.691 for negative sentiment. In summary, our study advances sentiment-driven stock price prediction and shows that deep learning can effectively extract sentiment from social media text. The results may help in understanding market dynamics and in integrating sentiment-aware strategies into financial decision-making. Future research could explore model transferability across financial contexts, real-time sentiment data integration, and interpretability techniques that make sentiment-driven predictions more practical.
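The abstract reports Spearman correlation coefficients between sentiment and stock price movements. As a rough illustration of how such a coefficient is computed, the sketch below ranks two series (averaging the ranks of tied values) and takes the Pearson correlation of the ranks. It is not the authors' code: the function names `average_ranks` and `spearman` and the sample numbers are hypothetical, chosen only to make the example runnable.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with the one at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in ry))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("rho is undefined when a series is constant")
    return cov / (sd_x * sd_y)


# Hypothetical data, NOT taken from the study: daily counts of tweets
# classified as positive, and the same days' change in closing price.
positive_tweets = [12, 30, 25, 41, 18]
price_change = [-0.5, 0.3, 1.2, 2.0, 0.1]

rho = spearman(positive_tweets, price_change)
print(f"Spearman rho = {rho:.3f}")
```

Because the coefficient depends only on ranks, it measures how monotonic the relationship is rather than how linear, which suits noisy series such as tweet counts and price changes.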
License
Copyright (c) 2023 Muhammad Noer Ibnu Sina, Erwin Budi Setiawan
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.