Machine Learning Analysis of Jakarta Bay Water Quality: Comparing Models
DOI: 10.33395/sinkron.v10i1.15540
Keywords: Jakarta Bay, Water Quality Classification, LightGBM, CatBoost, Explainable AI (SHAP)
Abstract
Jakarta Bay experiences persistent anthropogenic pressures that produce spatially heterogeneous water-quality conditions. This study develops a regulation-aligned, explainable classification framework using a 2024 in-situ dataset collected at 53 stations across two sampling periods (March and August). After preprocessing—including unit harmonization, outlier screening, missing-value imputation, and treatment of below-detection-limit measurements—the dataset yielded 104 complete samples classified into Good (n=46), Lightly Polluted (n=28), and Moderately Polluted (n=34) categories based on KEPMEN LH No. 51/2004. Three ensemble algorithms (LightGBM, CatBoost, and Random Forest) were evaluated using stratified cross-validation to maintain class balance and prevent spatial leakage. CatBoost achieved the best overall performance (Accuracy = 0.8338; F1 = 0.8257), followed by Random Forest, while LightGBM showed the highest variability across folds. Class-level metrics indicate that CatBoost produced the most balanced predictions, particularly for the borderline Lightly Polluted class. SHAP analysis identified turbidity/TSS, nutrients, dissolved oxygen, salinity, and spatial gradients as dominant predictors, enabling transparent interpretation of model decisions. The resulting framework provides a reproducible and operationally deployable approach for rapid screening, hotspot detection, and decision support in Jakarta Bay’s water-quality management.
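The stratified evaluation described in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' pipeline: the fold count (5), the random seed, the round-robin assignment, and the use of macro-averaged F1 are all assumptions, since the abstract does not state them. The helper names `stratified_folds`, `accuracy`, and `macro_f1` are hypothetical. The sketch only shows how stratification keeps the three class proportions roughly equal in every fold, using the per-class counts quoted in the abstract.

```python
import random
from collections import Counter, defaultdict


def stratified_folds(labels, n_splits=5, seed=0):
    """Assign sample indices to folds so each fold keeps roughly the
    same class proportions as the full dataset (assumed scheme)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(n_splits)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal the shuffled indices of this class round-robin across folds.
        for position, idx in enumerate(indices):
            folds[position % n_splits].append(idx)
    return folds


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (averaging mode is an assumption)."""
    scores = []
    for cls in sorted(set(y_true)):
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Per-class counts exactly as quoted in the abstract (they sum to 108).
labels = (
    ["Good"] * 46
    + ["Lightly Polluted"] * 28
    + ["Moderately Polluted"] * 34
)
folds = stratified_folds(labels, n_splits=5)
for k, fold in enumerate(folds):
    print(k, dict(Counter(labels[i] for i in fold)))
```

In a real evaluation, each fold would serve once as the held-out set while a classifier is trained on the remaining folds, and the two metric helpers would be applied to the held-out predictions. Stratifying by class label alone does not group samples by station, so guarding against spatial leakage would need an additional grouping step that this sketch does not include.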
License
Copyright (c) 2025 Aura Savira, Andrianingsih

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

