K-NN Based Air Classification as Indicator of the Index of Air Quality in Palembang
DOI: 10.33395/sinkron.v7i3.11469
Keywords: Air quality, classification, environment, K-NN, Palembang, pollution
Abstract
Good air quality is wanted by everyone who lives in a big city, and clean, unpolluted air is one of the requirements of a proper environment. One of the most severe causes of air pollution is large-scale forest fires, which arise from the long dry season or are set deliberately by irresponsible parties, who commonly describe the practice as an easy and inexpensive way of clearing land and use the dry season as a pretext. The purpose of this study is to classify air quality in Palembang using a data mining approach and then to use the classification results as an indicator of the level of air quality in the city. The data mining approach used is the K-Nearest Neighbor (K-NN) algorithm. Measured with a confusion matrix, the K-NN test results show an accuracy of 80 percent, a precision of 82.3 percent, and a recall of 93.3 percent. These measurements show that the K-NN algorithm can be used as an indicator of air quality: of the 20 data points that were trained and tested, only 4 were classified inaccurately. This inaccuracy occurs because the source data has unbalanced classes; for example, the unhealthy and very unhealthy classes have 1 sample each. The results therefore suggest that a classifier based on the K-NN algorithm is relevant as an indicator of air quality levels in the city of Palembang.
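The approach described in the abstract, K-NN classification scored with a confusion matrix, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the two numeric features, the class labels "good" and "moderate", the sample values, the choice of k = 3, and the helper names `knn_predict` and `binary_metrics` are hypothetical and are not taken from the study. Only the final arithmetic check (4 errors out of 20 samples) comes from the reported figures.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Label `query` by majority vote among its k nearest training samples."""
    # Rank training samples by Euclidean distance to the query point.
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


def binary_metrics(actual, predicted, positive):
    """Accuracy, precision, and recall from confusion-matrix counts."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    tn = len(pairs) - tp - fp - fn
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall


# Hypothetical two-feature pollutant readings; NOT the study's data.
train = [
    ((20.0, 5.0), "good"),
    ((25.0, 8.0), "good"),
    ((30.0, 6.0), "good"),
    ((80.0, 30.0), "moderate"),
    ((90.0, 35.0), "moderate"),
    ((85.0, 28.0), "moderate"),
]
test = [
    ((22.0, 6.0), "good"),
    ((88.0, 32.0), "moderate"),
    ((28.0, 7.0), "good"),
]

actual = [label for _, label in test]
predicted = [knn_predict(train, features, k=3) for features, _ in test]
accuracy, precision, recall = binary_metrics(actual, predicted, positive="good")

# Arithmetic check on the reported result: 4 errors out of 20 samples.
reported_accuracy = (20 - 4) / 20
print(predicted, accuracy, precision, recall, reported_accuracy)
```

The last line reproduces the 80 percent accuracy stated in the abstract. With classes as unbalanced as those described (some with a single sample), a majority vote among neighbors will tend to favor the larger classes, which is consistent with the explanation the abstract gives for the misclassified samples.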
License
Copyright (c) 2022 Ahmad Sanmorino, Juhaini Alie, Nining Ariati, Sanza Vittria Wulanda
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.