Prediction of Netizen Tweets Using Random Forest, Decision Tree, Naïve Bayes, and Ensemble Algorithm
DOI: 10.33395/sinkron.v5i1.10565
Keywords: Decision Tree, Naïve Bayes, Random Forest, Set, Twitter
Abstract
Although the current Governor of DKI Jakarta was elected in 2017, he remains a frequent subject of discussion and comment. Comments come directly from the media or through social media. Twitter is one of the social media platforms often used to comment on the elected governor, and such comments can even become trending topics. The Netizens who comment vary: some consistently Tweet criticism, some comment Positively, and some only re-Tweet. This study predicts whether active Netizens tend toward Positive or Negative comments. The algorithms used are Decision Tree, Naïve Bayes, Random Forest, and an Ensemble. The Twitter data are preprocessed before being processed in RapidMiner. Four trials were conducted in RapidMiner, each dividing the data into testing data and training data: 10% testing and 90% training, 20% testing and 80% training, 30% testing and 70% training, and 35% testing and 65% training. The average Accuracy is 93.15% for the Decision Tree algorithm, 91.55% for Naïve Bayes, 93.41% for Random Forest, and 93.42% for the Ensemble algorithm.
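The evaluation protocol described in the abstract (four hold-out splits, with accuracy averaged per algorithm) can be sketched in plain Python. This is an illustration only, not the authors' RapidMiner process: all function names are hypothetical, the majority-vote rule is an assumption about how the Ensemble combines its base models (the abstract does not say), and the majority-class baseline is a stand-in for a real classifier.

```python
import random
from collections import Counter
from statistics import mean

# Test-set fractions taken from the abstract: 10%, 20%, 30%, 35%.
TEST_FRACTIONS = [0.10, 0.20, 0.30, 0.35]


def holdout_split(rows, test_fraction, seed=0):
    """Shuffle a copy of rows and return (train, test) for one hold-out split."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def majority_vote(predictions):
    """Combine one label per base model into the most frequent label.

    Assumption: simple (unweighted) voting; the paper may use another rule.
    """
    return Counter(predictions).most_common(1)[0][0]


def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


def average_accuracy(rows, fit, test_fractions=TEST_FRACTIONS):
    """Average hold-out accuracy over all test fractions.

    rows is a list of (features, label) pairs; fit(train) must return a
    function that maps features to a predicted label.
    """
    scores = []
    for fraction in test_fractions:
        train, test = holdout_split(rows, fraction)
        predict = fit(train)
        predicted = [predict(features) for features, _ in test]
        scores.append(accuracy(predicted, [label for _, label in test]))
    return mean(scores)


def fit_majority_class(train):
    """Hypothetical baseline: always predict the most common training label."""
    label = Counter(y for _, y in train).most_common(1)[0][0]
    return lambda features: label
```

Any model exposing the same `fit(train)` shape could be passed to `average_accuracy`, and an ensemble could wrap several fitted models by calling `majority_vote` on their individual predictions.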