Classification of Retail Supermarket Consumer Types Using a Decision Tree Based on Transaction Data


I Gusti Ngurah Adi Prayoga
I Wayan Sudiarsa
Ni Kadek Nila Agustini Dewi
Kadek Dio Angginantara Putra
Muhammad Iqbal

Abstract

This study applies a Decision Tree algorithm to classify consumer types in a retail supermarket setting. The dataset comes from a public Kaggle repository and contains 1,000 transaction records covering customer attributes, product types, transaction details, and the time of day of each transaction. Data processing and analysis were carried out in Google Colab with the Python libraries pandas and scikit-learn. The method consists of data preprocessing, feature engineering, encoding of categorical variables, data splitting, model training, and evaluation. Model performance was measured with accuracy, precision, recall, and F1-score. In the experiments, the Decision Tree model reached an accuracy of about 56.5% with all features and 61.5% with only a few key features and a fixed tree depth. These results indicate that consumer types can be classified from transactional and demographic information, although the model's performance leaves clear room for improvement. The study contributes to customer segmentation in retail analytics and provides a baseline for future work with more advanced machine learning methods.
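The workflow named in the abstract (preprocessing, feature engineering, encoding, splitting, training, evaluation) can be illustrated with a minimal sketch in pandas and scikit-learn. This is not the authors' code: the column names, the synthetic stand-in data, the 80/20 split, the macro averaging, and the helper names make_synthetic_transactions and run_pipeline are all assumptions made for illustration. Because the labels here are random, the scores it prints say nothing about the reported results.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def make_synthetic_transactions(n_rows=1000, seed=42):
    """Build a stand-in table. Column names are hypothetical, not the Kaggle schema."""
    rng = np.random.default_rng(seed)
    return pd.DataFrame({
        "gender": rng.choice(["Female", "Male"], size=n_rows),
        "product_line": rng.choice(["Food", "Fashion", "Electronics"], size=n_rows),
        "unit_price": rng.uniform(10, 100, size=n_rows).round(2),
        "quantity": rng.integers(1, 11, size=n_rows),
        "hour": rng.integers(10, 21, size=n_rows),
        "customer_type": rng.choice(["Member", "Normal"], size=n_rows),
    })


def run_pipeline(df, feature_cols=None, max_depth=None, seed=42):
    """Preprocess, engineer features, encode, split, train, and evaluate."""
    # Preprocessing: drop incomplete rows (the synthetic data has none).
    df = df.dropna().copy()

    # Feature engineering: two assumed derived features.
    df["total"] = df["unit_price"] * df["quantity"]
    df["is_evening"] = (df["hour"] >= 17).astype(int)

    # Separate target and features; optionally keep only selected features.
    y = df["customer_type"]
    X = df.drop(columns=["customer_type"])
    if feature_cols is not None:
        X = X[feature_cols]

    # Encoding: one-hot encode categorical columns as floats.
    X = pd.get_dummies(X, dtype=float)

    # Splitting: an 80/20 stratified split is an assumption of this sketch.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed, stratify=y
    )

    # Training: max_depth=None grows the tree fully; an integer caps its depth.
    model = DecisionTreeClassifier(max_depth=max_depth, random_state=seed)
    model.fit(X_train, y_train)

    # Evaluation: accuracy plus macro-averaged precision, recall, and F1.
    y_pred = model.predict(X_test)
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_test, y_pred, average="macro", zero_division=0
    )
    return {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "n_test": len(y_test),
    }


if __name__ == "__main__":
    data = make_synthetic_transactions()
    print("all features:", run_pipeline(data))
    print("selected features, depth 3:",
          run_pipeline(data, feature_cols=["total", "quantity", "gender"], max_depth=3))
```

Calling run_pipeline twice, once with every feature and once with a feature subset and a depth cap, mirrors the two configurations compared in the abstract. The subset and the depth of 3 are placeholders, since the abstract names neither.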


How to Cite
Adi Prayoga, I. G. N., Sudiarsa, I. W., Dewi, N. K. N. A., Putra, K. D. A., & Iqbal, M. (2026). Klasifikasi Tipe Konsumen Retail Supermarket Menggunakan Decision Tree Berdasarkan Data Transaksi. Journal of Multidisciplinary Inquiry in Science, Technology and Educational Research, 3(1), 1515–1525. https://doi.org/10.32672/mister.v3i1.4106

