Analysis of Stroke Risk Patterns in Patient Data Using a RapidMiner-Based K-Means Clustering Model
Abstract
This study aims to identify stroke risk patterns among patients by applying K-Means clustering on the RapidMiner platform. The stroke dataset was loaded from an Excel file and preprocessed: numerical attributes were normalized and categorical attributes were converted to numerical form so that the distance calculations used by K-Means are meaningful. K-Means was then applied to group patients into clusters based on similarities in their clinical and demographic data. Clustering quality was evaluated with the Davies–Bouldin Index, which relates within-cluster scatter to between-cluster separation (lower values indicate more compact, better-separated clusters). The results indicate that K-Means yields a structured representation of stroke risk patterns that can support data-driven decision-making for stroke prevention and management.
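The pipeline described in the abstract (encode categorical attributes, normalize, cluster with K-Means, score with the Davies–Bouldin Index) can be sketched in plain Python. This is an illustrative sketch only, not the authors' RapidMiner process: the patient records, the attribute names (age, glucose, hypertension), the choice of k = 2, and all function names are assumptions introduced for the example.

```python
import math
import random

# Hypothetical records: (age, average glucose level, hypertension flag).
# These values are invented for illustration; they are not the study's data.
patients = [
    (25, 85.0, "no"),
    (31, 90.0, "no"),
    (28, 80.0, "no"),
    (67, 210.0, "yes"),
    (72, 195.0, "yes"),
    (80, 220.0, "yes"),
]

def encode(record):
    """Convert the categorical hypertension flag to 0/1; keep numeric fields."""
    age, glucose, hypertension = record
    return [float(age), float(glucose), 1.0 if hypertension == "yes" else 0.0]

def min_max_normalize(rows):
    """Scale each column to [0, 1] so no attribute dominates the distance."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

def k_means(points, k, iterations=100, seed=0):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iterations):
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # keep the old center if a cluster empties
                centers[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centers

def davies_bouldin(points, labels, centers):
    """Mean over clusters of the worst (scatter_i + scatter_j) / separation_ij.

    Assumes every cluster has at least one member and centers are distinct.
    """
    k = len(centers)
    scatter = []
    for j in range(k):
        members = [p for p, l in zip(points, labels) if l == j]
        scatter.append(sum(math.dist(p, centers[j]) for p in members) / len(members))
    worst = [
        max((scatter[i] + scatter[j]) / math.dist(centers[i], centers[j])
            for j in range(k) if j != i)
        for i in range(k)
    ]
    return sum(worst) / k

normalized = min_max_normalize([encode(r) for r in patients])
labels, centers = k_means(normalized, k=2)
dbi = davies_bouldin(normalized, labels, centers)
print("cluster labels:", labels)
print("Davies-Bouldin Index:", round(dbi, 3))
```

In practice the number of clusters would be chosen by repeating the last three lines for several values of k and comparing the resulting index values, with lower values indicating a better partition.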

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.