Integration of PCA and K-Means Clustering for Staple Food Segmentation in Support of National Food Policy
DOI:
10.33395/sinkron.v9i4.15343
Keywords:
PCA, K-Means, regional segmentation, staple foods, Indonesia
Abstract
This study develops a cross-provincial staple-food segmentation by integrating Principal Component Analysis (PCA) and K-Means clustering to support policy formation. The dataset comprises 2023 staple-food consumption data for 34 Indonesian provinces across six indicators from BPS/SUSENAS. All indicators were standardized using z-scores and reduced via PCA, and the resulting component scores were used as inputs to K-Means. Three components (PC1–PC3) explained 73.86% of the variance and captured shifts between sweet/animal-based vs. plant foods, fatty or animal-based grains, and the energy contribution of fat. The optimal number of clusters was determined as k = 3, yielding a Silhouette coefficient of 0.466 and a Davies-Bouldin Index (DBI) of 0.733, indicating sufficiently compact and well-separated groups. The results reveal three segments: the first consists of 11 provinces that are predominantly plant-based with low sugar and low animal-based consumption; the second includes 13 provinces characterized by high animal-based and high-fat consumption; and the third comprises 10 provinces with low-fat diets and fresh plant-based consumption. Stability checks on initialization and a leave-one-feature-out procedure confirmed consistent cluster assignments. This fills an empirical gap: to our knowledge, no prior research integrates PCA with K-Means for cross-provincial staple-food segmentation in Indonesia while also reporting internal validation. Practically, the study provides an operational segmentation to support food-security interventions, moving beyond composite indices toward actionable targeting for production support, supply/price stabilization, and improved nutritional access, thereby reframing the Food Security Index (IKP) and the Food Security and Vulnerability Atlas (FSVA) from index ranking to evidence-based segmentation.
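The pipeline described in the abstract (z-score standardization, PCA, K-Means on the component scores, then Silhouette and DBI as internal validation) can be sketched as follows. This is a minimal illustration assuming scikit-learn and NumPy, not the authors' code: the function name `segment`, the random seeds, and the synthetic 34 x 6 matrix are all hypothetical stand-ins, and the BPS/SUSENAS data are not reproduced here, so the printed values will not match the figures reported above.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import davies_bouldin_score, silhouette_score
from sklearn.preprocessing import StandardScaler


def segment(X, n_components=3, k=3, seed=0):
    """Standardize, project onto principal components, then cluster.

    Hypothetical helper for illustration only.
    """
    Z = StandardScaler().fit_transform(X)      # z-score each indicator
    pca = PCA(n_components=n_components)
    scores = pca.fit_transform(Z)              # component scores (PC1..PCn)
    km = KMeans(n_clusters=k, n_init=10, random_state=seed)
    labels = km.fit_predict(scores)            # cluster on the scores, not on X
    return {
        "labels": labels,
        "explained": float(pca.explained_variance_ratio_.sum()),
        "silhouette": float(silhouette_score(scores, labels)),
        "dbi": float(davies_bouldin_score(scores, labels)),
    }


# Synthetic stand-in: 34 rows (provinces) x 6 columns (indicators).
# These numbers are random and are NOT the study's data.
rng = np.random.default_rng(42)
centers = rng.normal(scale=3.0, size=(3, 6))
X = np.vstack(
    [c + rng.normal(size=(n, 6)) for c, n in zip(centers, (11, 13, 10))]
)

result = segment(X)
print("explained variance:", round(result["explained"], 4))
print("silhouette:", round(result["silhouette"], 3))
print("DBI:", round(result["dbi"], 3))
```

Higher Silhouette values and lower DBI values both point to more compact, better-separated clusters, which is why the two are commonly read together when comparing candidate values of k.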
Copyright (c) 2025 Sardo Sipayung, Paska Marto Hasugian

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

