Classification of Retail Supermarket Consumer Types Using a Decision Tree Based on Transaction Data

I Gusti Ngurah Adi Prayoga; I Wayan Sudiarsa; Ni Kadek Nila Agustini Dewi; Kadek Dio Angginantara Putra; Muhammad Iqbal

doi:10.32672/mister.v3i1.4106

Published: Jan 11, 2026

Keywords:

consumer classification; decision tree; retail analytics; machine learning; data mining

I Gusti Ngurah Adi Prayoga

Institut Bisnis dan Teknologi Indonesia Denpasar

I Wayan Sudiarsa

Institut Bisnis dan Teknologi Indonesia Denpasar

Ni Kadek Nila Agustini Dewi

Institut Bisnis dan Teknologi Indonesia Denpasar

Kadek Dio Angginantara Putra

Institut Bisnis dan Teknologi Indonesia Denpasar

Muhammad Iqbal

Institut Bisnis dan Teknologi Indonesia Denpasar

Abstract

This study uses a Decision Tree algorithm to classify consumer types in a retail supermarket setting. The dataset comes from a public Kaggle repository and contains 1,000 transaction records covering customer information, product types, transaction details, and time of purchase. The data were processed and analyzed in Google Colab with Python libraries such as pandas and scikit-learn. The method consists of data preprocessing, feature engineering, categorical encoding, data splitting, model training, and evaluation. Model performance was measured with accuracy, precision, recall, and F1-score. In the experiments, the Decision Tree model reached an accuracy of about 56.5% with all features and 61.5% with a reduced set of key features and a restricted tree depth. These results suggest that classifying consumers from transactional and demographic attributes is feasible, although the model's performance leaves room for improvement. The study contributes to the understanding of customer segmentation in retail analytics and provides a baseline for future work with more advanced machine learning methods.
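
The workflow named in the abstract (feature creation, categorical encoding, data splitting, training a depth-limited Decision Tree, and scoring with accuracy, precision, recall, and F1-score) might look roughly like the sketch below. It is illustrative only and is not the authors' code: the column names, the synthetic data generator, the 80/20 split, the macro averaging, and `max_depth=4` are all assumptions, since the abstract does not specify them.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def make_synthetic_transactions(n_rows=1000, seed=0):
    """Build a hypothetical stand-in table; NOT the Kaggle dataset itself."""
    rng = np.random.default_rng(seed)
    return pd.DataFrame({
        "gender": rng.choice(["Female", "Male"], size=n_rows),
        "product_line": rng.choice(["Food", "Fashion", "Electronics"], size=n_rows),
        "unit_price": rng.uniform(10, 100, size=n_rows).round(2),
        "quantity": rng.integers(1, 11, size=n_rows),
        "hour": rng.integers(10, 21, size=n_rows),
        # Assumed target column; labels are random, so there is no real signal.
        "customer_type": rng.choice(["Member", "Normal"], size=n_rows),
    })


def run_pipeline(df, max_depth=4, seed=0):
    """Run feature creation, encoding, splitting, training, and evaluation."""
    # Feature creation: derive a basket total from price and quantity.
    df = df.assign(total=df["unit_price"] * df["quantity"])

    # Encoding: one-hot encode the categorical predictors.
    X = pd.get_dummies(
        df.drop(columns="customer_type"),
        columns=["gender", "product_line"],
    )
    y = df["customer_type"]

    # Splitting: an assumed 80/20 stratified train/test split.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed, stratify=y
    )

    # Training: a Decision Tree with a restricted depth (value is assumed).
    model = DecisionTreeClassifier(max_depth=max_depth, random_state=seed)
    model.fit(X_train, y_train)

    # Evaluation: accuracy plus macro-averaged precision, recall, and F1.
    y_pred = model.predict(X_test)
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_test, y_pred, average="macro", zero_division=0
    )
    return {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "n_test": len(y_test),
    }


metrics = run_pipeline(make_synthetic_transactions())
print(metrics)
```

Because the synthetic labels are drawn at random, the printed scores should sit near chance level and say nothing about the accuracies reported in the abstract. To try the sketch on real data, replace `make_synthetic_transactions()` with a table loaded via `pd.read_csv` and adjust the column names accordingly.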

How to Cite

Adi Prayoga, I. G. N., Sudiarsa, I. W., Dewi, N. K. N. A., Putra, K. D. A., & Iqbal, M. (2026). Klasifikasi Tipe Konsumen Retail Supermarket Menggunakan Decision Tree Berdasarkan Data Transaksi. Journal of Multidisciplinary Inquiry in Science, Technology and Educational Research, 3(1), 1515–1525. https://doi.org/10.32672/mister.v3i1.4106

Issue

Vol. 3 No. 1 (2026): January 2026

Section

Articles

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
