Application of the C4.5 Algorithm for Gender Classification Based on Physical Characteristics on the Gender Classification v7 Dataset
Abstract
Gender classification from physical attributes is a data mining application used to support identification systems and demographic analysis. This study applies the C4.5 decision tree algorithm to classify gender from physical characteristics in the Gender Classification v7 dataset. C4.5 was selected because it produces an interpretable decision tree and handles both numerical and categorical attributes. The research stages were dataset retrieval, data preprocessing, label attribute assignment, splitting the data into training and testing sets, model training with C4.5, and performance evaluation using accuracy, precision, and recall. The experiment was conducted in Altair AI Studio 2026.0.1 with 70% of the data used for training and 30% for testing. The C4.5 model achieved an accuracy of 97.20%. According to the confusion matrix, the Male class obtained a recall of 95.97% and a precision of 98.35%, while the Female class obtained a recall of 98.41% and a precision of 96.12%. These results indicate that the physical attributes in this dataset separate the two classes well and that C4.5 is an effective method for gender classification based on physical attributes.
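The two technical ingredients named in the abstract, the C4.5 split criterion and the per-class evaluation metrics, can be illustrated with a minimal Python sketch. This is not the study's workflow, which was built in Altair AI Studio; the function names, the toy attribute values, and the confusion-matrix counts below are all hypothetical and chosen only for illustration. The sketch assumes a categorical attribute and uses the gain ratio (information gain divided by split information), the criterion C4.5 uses to choose splits.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def gain_ratio(values, labels):
    """Gain ratio of a categorical attribute: information gain / split information."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted entropy of the class labels after splitting on the attribute.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    # Entropy of the split itself; penalizes attributes with many values.
    split_info = -sum(len(g) / total * math.log2(len(g) / total)
                      for g in groups.values())
    gain = entropy(labels) - remainder
    return gain / split_info if split_info > 0 else 0.0


def precision_recall(tp, fp, fn):
    """Precision and recall for one class from confusion-matrix counts."""
    return tp / (tp + fp), tp / (tp + fn)


def accuracy(tp, tn, fp, fn):
    """Share of all test examples that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical toy data, NOT taken from the Gender Classification v7 dataset.
labels = ["Male", "Male", "Male", "Female", "Female", "Female"]
informative = [1, 1, 1, 0, 0, 0]    # separates the classes perfectly
uninformative = [1, 0, 1, 0, 1, 0]  # mixes the classes in both branches

print(gain_ratio(informative, labels))
print(gain_ratio(uninformative, labels))

# Hypothetical confusion-matrix counts for one class.
precision, recall = precision_recall(tp=90, fp=5, fn=10)
print(precision, recall, accuracy(tp=90, tn=95, fp=5, fn=10))
```

An attribute that separates the classes cleanly scores a higher gain ratio than one that leaves both branches mixed, which is why C4.5 would place it nearer the root of the tree. Precision and recall are computed per class, which is how the abstract can report different values for Male and Female alongside a single overall accuracy.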
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.