Prediction of Netizen Tweets Using Random Forest, Decision Tree, Naïve Bayes, and Ensemble Algorithm

Authors

  • Yan Rianto STMIK Nusa Mandiri, Indonesia
  • Antonius Yadi Kuntoro STMIK Nusa Mandiri, Indonesia

DOI:

10.33395/sinkron.v5i1.10565

Keywords:

Decision Tree, Naïve Bayes, Random Forest, Set, Twitter

Abstract

Although the current Governor of DKI Jakarta was elected in 2017, he remains a frequent subject of discussion and comment. These comments appear in the media directly or through social media. Twitter is one of the social media platforms most often used to comment on the elected governor, and such comments can even become a trending topic there. The Netizens who comment vary: some consistently Tweet criticism, some comment Positively, and some only re-Tweet. This research predicts whether active Netizens tend toward Positive or Negative comments. The algorithms used are Decision Tree, Naïve Bayes, Random Forest, and an Ensemble. The Twitter data is preprocessed before being processed further in RapidMiner. Four trials were conducted in RapidMiner, each dividing the data into testing data and training data. The ratios compared are 10% testing data to 90% training data, 20% testing data to 80% training data, 30% testing data to 70% training data, and 35% testing data to 65% training data. The average Accuracy is 93.15% for the Decision Tree algorithm, 91.55% for the Naïve Bayes algorithm, 93.41% for the Random Forest algorithm, and 93.42% for the Ensemble algorithm.
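The evaluation protocol described in the abstract (four hold-out splits, with Accuracy averaged across them) can be illustrated with a minimal sketch. This is not the authors' RapidMiner process: the helper names (`split_holdout`, `accuracy`, `majority_label`, `mean_accuracy_over_splits`), the majority-class stand-in model, the random shuffling, and the toy data are all assumptions made for illustration only.

```python
import random
from collections import Counter

# Test fractions named in the abstract: 10%, 20%, 30%, and 35% testing data.
TEST_FRACTIONS = (0.10, 0.20, 0.30, 0.35)


def split_holdout(records, test_fraction, seed=0):
    """Shuffle a copy of the records and return (train, test)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def majority_label(train):
    """Hypothetical stand-in model: the most common label in the training data."""
    return Counter(label for _, label in train).most_common(1)[0][0]


def mean_accuracy_over_splits(records, fractions=TEST_FRACTIONS, seed=0):
    """Evaluate the stand-in model on each split and average the accuracies."""
    scores = []
    for fraction in fractions:
        train, test = split_holdout(records, fraction, seed)
        guess = majority_label(train)
        true_labels = [label for _, label in test]
        scores.append(accuracy(true_labels, [guess] * len(test)))
    return sum(scores) / len(scores)


# Toy, made-up (tweet text, label) pairs; not the study's data set.
toy = [("tweet %d" % i, "Positive" if i % 4 else "Negative") for i in range(200)]
print(mean_accuracy_over_splits(toy))
```

In practice the stand-in model would be replaced by a trained Decision Tree, Naïve Bayes, Random Forest, or Ensemble classifier; the splitting and averaging steps stay the same.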




How to Cite

Rianto, Y., & Kuntoro, A. Y. (2020). Prediction of Netizen Tweets Using Random Forest, Decision Tree, Naïve Bayes, and Ensemble Algorithm. Sinkron : Jurnal Dan Penelitian Teknik Informatika, 5(1), 58-71. https://doi.org/10.33395/sinkron.v5i1.10565