Performance Single Linkage and K-Medoids on Data with Outliers

Authors

  • Caecilia Bintang Girik Allo, Universitas Cenderawasih
  • Winda Ade Fitriya B, Universitas Cenderawasih, Indonesia
  • Nicea Roona Paranoan, Universitas Cenderawasih, Indonesia

DOI:

10.33395/sinkron.v8i4.14072

Abstract

One way to assess the economic growth of a province is to examine its Gross Regional Domestic Product (GRDP). GRDP calculated through the production approach reflects the total value added of goods and services produced by the various sectors of a region over a specified period, and 17 business sectors are considered in its calculation. In 2023, the GRDP growth rate of Papua decreased to 3.44%, down from 4.11% in the previous year. An analysis is needed to help the government improve Papua’s GRDP. Clustering methods can group regencies and cities with similar characteristics. Boxplots are used to identify outliers, and because the data contain outliers, K-Medoids is one method that can be used. Before distances are computed, the data are standardized with z-score normalization so that all variables are on the same scale; the distance matrix is then computed with the Euclidean distance. This article aims to identify the more effective method for clustering regencies and cities in Papua using GRDP at constant prices. Both Single Linkage and K-Medoids are applied in this study. The Davies-Bouldin Index (DBI) is used for evaluation, with lower DBI values indicating a better clustering. According to the DBI results, Single Linkage outperforms K-Medoids for clustering regencies and cities in Papua, with three as the optimal number of clusters.
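The workflow described in the abstract (z-score standardization, Euclidean distances, Single Linkage and K-Medoids clustering, DBI comparison) can be illustrated with a minimal sketch. This is not the authors' implementation: the software they used is not stated here, the toy data are hypothetical, and the function names are made up for illustration. The K-Medoids shown is a simple alternating variant with farthest-first initialization rather than full PAM, and the z-score uses the population standard deviation; both are assumptions.

```python
import math
from statistics import mean, pstdev

def zscore(rows):
    """Standardize each column to mean 0 and (population) standard deviation 1."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(r, mus, sds)] for r in rows]

def single_linkage(pts, k):
    """Merge the two clusters with the smallest closest-pair distance until k remain."""
    clusters = [[i] for i in range(len(pts))]
    while len(clusters) > k:
        pairs = [(a, b) for a in range(len(clusters)) for b in range(a + 1, len(clusters))]
        a, b = min(pairs, key=lambda ab: min(
            math.dist(pts[i], pts[j]) for i in clusters[ab[0]] for j in clusters[ab[1]]))
        clusters[a].extend(clusters.pop(b))  # a < b, so index a is unaffected by the pop
    labels = [0] * len(pts)
    for c, members in enumerate(clusters):
        for i in members:
            labels[i] = c
    return labels

def k_medoids(pts, k, max_iter=100):
    """Alternating k-medoids (assumption: not full PAM); assumes distinct points."""
    medoids = [0]  # farthest-first initialization, starting from the first point
    while len(medoids) < k:
        medoids.append(max(range(len(pts)),
                           key=lambda i: min(math.dist(pts[i], pts[m]) for m in medoids)))
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest medoid.
        labels = [min(range(k), key=lambda c: math.dist(p, pts[medoids[c]])) for p in pts]
        # Update step: the new medoid minimizes total distance to its cluster members.
        new = []
        for c in range(k):
            members = [i for i, l in enumerate(labels) if l == c]
            new.append(min(members,
                           key=lambda i: sum(math.dist(pts[i], pts[j]) for j in members)))
        if new == medoids:
            break
        medoids = new
    return labels

def davies_bouldin(pts, labels):
    """Mean over clusters of the worst (S_i + S_j) / M_ij ratio; lower is better."""
    groups = [[p for p, l in zip(pts, labels) if l == c] for c in sorted(set(labels))]
    cents = [[mean(col) for col in zip(*g)] for g in groups]
    scatter = [mean([math.dist(p, c) for p in g]) for g, c in zip(groups, cents)]
    n = len(groups)
    return mean([max((scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
                     for j in range(n) if j != i) for i in range(n)])

# Hypothetical toy data: rows are regions, columns are two sector values.
# The last row plays the role of an outlier.
rows = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.1],
        [5.0, 5.2], [5.1, 4.9], [4.8, 5.0],
        [20.0, 21.0]]
z = zscore(rows)
sl_labels = single_linkage(z, 3)
km_labels = k_medoids(z, 3)
dbi_sl = davies_bouldin(z, sl_labels)
dbi_km = davies_bouldin(z, km_labels)
print("Single Linkage:", sl_labels, "DBI =", round(dbi_sl, 3))
print("K-Medoids:     ", km_labels, "DBI =", round(dbi_km, 3))
```

On this toy data both methods isolate the outlier in its own cluster, so their DBI values coincide; on real GRDP data the partitions, and therefore the DBI values, can differ, which is the comparison the study reports.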

Keywords: Euclidean Distance; Davies-Bouldin Index (DBI); Gross Regional Domestic Product; K-Medoids; Single Linkage; z-score Normalization

How to Cite

Allo, C. B. G., B, W. A. F., & Paranoan, N. R. (2024). Performance Single Linkage and K-Medoids on Data with Outliers. Sinkron: Jurnal dan Penelitian Teknik Informatika, 8(4), 2185-2191. https://doi.org/10.33395/sinkron.v8i4.14072