Predictions of Indonesia Economic Phenomena Based on Online News Using Random Forest

Authors

  • Fitri Khairani IPB University
  • Anang Kurnia IPB University, Indonesia
  • Muhammad Nur Aidi IPB University, Indonesia
  • Setia Pramana Politeknik Statistika STIS, Jakarta, Indonesia

DOI:

10.33395/sinkron.v7i2.11401

Keywords:

Classification, Economy, GDP, Online News, Prediction, Random Forest

Abstract

Indonesia's year-on-year (YoY) economic growth in the first quarter of 2021 was around -0.74%. With this figure, the Indonesian economy was in recession, having contracted four times since the second quarter of 2020. Because GDP in each business-sector category grows positively or negatively each quarter, future economic growth can be modelled. The prediction results can serve as an early warning for the government about which factors can be maximized and which must be improved. This study aims to predict the state of economic growth in the next quarter using Random Forest classification. Random Forest combines classification trees with bagging, which resamples the data; this reduces the variance of the final model and thereby limits overfitting. The data used in this study were scraped from January 2021 to March 2021 from five Indonesian online news portals, namely Kompas, Antara, Okezone, Detik, and Bisnis. The independent variable is online news grouped by GDP category. The dependent variable is the label, up or down, assigned to each news item by the Directorate of Balance Sheet of BPS. With 10-fold cross-validation, the model achieved 96.51% accuracy, 97% precision, and 97% recall. The Random Forest method performs well in predicting economic growth in the next quarter, namely the second quarter of 2021. Only three GDP categories were predicted incorrectly: construction, transportation and warehousing, and company services.
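The evaluation described in the abstract (10-fold cross-validation scored by accuracy, precision, and recall) can be illustrated with a minimal pure-Python sketch. It is not the authors' code: the helper names `kfold_indices` and `classification_metrics`, the choice of "up" as the positive class, and the sample labels are all assumptions made for illustration, and the abstract does not say how precision and recall were averaged across classes.

```python
from collections import Counter


def kfold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs for k contiguous, near-equal folds."""
    # The first (n_samples % k) folds receive one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def classification_metrics(y_true, y_pred, positive="up"):
    """Return accuracy, precision, and recall, treating `positive` as the positive class."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    tp, fp, fn, tn = counts["tp"], counts["fp"], counts["fn"], counts["tn"]
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical labels for eight news items (not data from the study).
y_true = ["up", "up", "down", "down", "up", "down", "up", "down"]
y_pred = ["up", "down", "down", "down", "up", "up", "up", "down"]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In a full pipeline, each `(train_idx, test_idx)` pair from `kfold_indices` would select the news items used to fit and to score the classifier, and the per-fold metrics would then be averaged.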

How to Cite

Khairani, F., Kurnia, A., Aidi, M. N., & Pramana, S. (2022). Predictions of Indonesia Economic Phenomena Based on Online News Using Random Forest. Sinkron : Jurnal Dan Penelitian Teknik Informatika, 6(2), 532-540. https://doi.org/10.33395/sinkron.v7i2.11401