Determining The Optimal Number of K-Means Clusters Using The Calinski Harabasz Index and Krzanowski and Lai Index Methods for Grouping Flood Prone Areas In North Sumatra

Ziana Syahputri; Sutarman; Machrani Adi Putri Siregar

doi:10.33395/sinkron.v9i1.13246

Authors

Ziana Syahputri, State Islamic University of North Sumatra
Sutarman, University of North Sumatra
Machrani Adi Putri Siregar, State Islamic University of North Sumatra

Keywords:

Cluster, K-Means, CH Index, KL Index, Cluster Tightness Measure (CTM), Flood

Abstract

K-means is a partitional clustering algorithm. Its advantages include ease of implementation, good convergence behavior, and compact clusters; its main drawback is that the optimal number of clusters is difficult to determine. In this study, K-means is applied to group flood-prone areas in North Sumatra using eleven variables. The optimal number of clusters is selected with the Calinski-Harabasz (CH) index and with the Krzanowski and Lai (KL) index, and the two choices are compared using the Cluster Tightness Measure (CTM). The CH index selects k = 6 with a CTM of 0.376, while the KL index selects k = 2 with a CTM of 0.7843. Since the smaller CTM is taken to indicate tighter clusters, the study concludes that the number of clusters chosen by the CH index (k = 6) is better than the one chosen by the KL index (k = 2).
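As a rough illustration of how the CH index can be used to pick k, the sketch below runs a plain K-means on synthetic two-dimensional data and evaluates CH(k) = [B(k) / (k - 1)] / [W(k) / (n - k)], where B(k) and W(k) are the between-cluster and within-cluster sums of squares and n is the number of observations. Everything in it is an assumption made for illustration: the data are three artificial blobs rather than the eleven flood variables, the function names are hypothetical, and the deterministic farthest-point start is a convenience rather than the procedure used in the study. The KL index and the CTM are not reproduced here.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(col) / len(points) for col in zip(*points))

def kmeans(points, k, max_iter=100):
    """Lloyd iterations with a deterministic farthest-point start (an assumption)."""
    centers = [points[0]]
    while len(centers) < k:
        # Next start: the point farthest from all centers chosen so far.
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the centers are final
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = centroid(members)
    return labels, centers

def calinski_harabasz(points, labels, centers):
    """CH = [B / (k - 1)] / [W / (n - k)]; larger values indicate better separation."""
    n, k = len(points), len(centers)
    overall = centroid(points)
    within = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    between = sum(labels.count(j) * sq_dist(centers[j], overall) for j in range(k))
    return (between / (k - 1)) / (within / (n - k))

# Synthetic stand-in data: three well-separated blobs of 20 points each.
rng = random.Random(0)
blob_centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
data = [(cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
        for cx, cy in blob_centers for _ in range(20)]

# Evaluate CH for each candidate k and keep the k with the largest value.
scores = {}
for k in range(2, 7):
    labels, centers = kmeans(data, k)
    scores[k] = calinski_harabasz(data, labels, centers)

best_k = max(scores, key=scores.get)
print(scores)
print("k with the largest CH index:", best_k)
```

On real data the candidate range for k, the scaling of the variables, and the K-means initialization would all need to be chosen deliberately, since each can change which k the index favors.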
