Implementation of Logistic Regression for Soil Type Classification on the Paddy Dataset Using RapidMiner
Abstract
This study develops and evaluates a paddy yield classification model using the Logistic Regression algorithm. The dataset is the Paddy Dataset from the UCI Machine Learning Repository, used here as secondary data. Preprocessing consists of missing-value handling, One-Hot Encoding of categorical attributes, and Min-Max Scaling of numerical features. The dataset is then split into training and testing sets at an 80:20 ratio to assess model generalization. The Logistic Regression model is trained on the training set and evaluated on the testing set in RapidMiner using a confusion matrix and performance metrics such as accuracy, precision, and recall. The results indicate that Logistic Regression can be applied to paddy yield classification once the target variable is converted into binary classes; however, further improvements are required to enhance predictive performance.
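The workflow summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' RapidMiner process: the toy data are synthetic, the column names (fertilizer, soil, yield_kg) are hypothetical, and binarizing the target at the median yield is an assumption, since the abstract does not state how the binary classes were formed. The logistic regression here is fitted by plain batch gradient descent, which may differ from the solver RapidMiner uses.

```python
import math
import random
import statistics

def fill_missing(values):
    """Replace None with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]

def min_max_scale(values):
    """Rescale a numeric column to [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in values]

def one_hot(values):
    """Encode a categorical column as 0/1 indicators (sorted category order)."""
    cats = sorted(set(values))
    return [[1.0 if v == c else 0.0 for c in cats] for v in values]

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to keep math.exp from overflowing
    return 1.0 / (1.0 + math.exp(-z))

def score(x, w, b):
    """Predicted probability of class 1 for one feature vector."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def train_logreg(X, y, lr=0.5, epochs=500):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        errors = [score(x, w, b) - t for x, t in zip(X, y)]
        w = [wj - lr * sum(e * x[j] for e, x in zip(errors, X)) / n
             for j, wj in enumerate(w)]
        b -= lr * sum(errors) / n
    return w, b

def evaluate(y_true, y_pred):
    """Confusion-matrix counts plus accuracy, precision, and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "tp": tp, "tn": tn, "fp": fp, "fn": fn,
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

# Synthetic stand-in for the Paddy Dataset (hypothetical columns and values).
random.seed(0)
soil_bonus = {"clay": 500.0, "loam": 1500.0, "sandy": 0.0}
rows = []
for _ in range(50):
    fertilizer = random.uniform(50, 200)
    soil = random.choice(sorted(soil_bonus))
    yield_kg = 20 * fertilizer + soil_bonus[soil] + random.gauss(0, 300)
    rows.append((fertilizer, soil, yield_kg))
rows[3] = (None, rows[3][1], rows[3][2])  # simulate one missing value

# Preprocessing in the order the abstract lists it: impute, encode, scale.
fert = min_max_scale(fill_missing([r[0] for r in rows]))
soil = one_hot([r[1] for r in rows])
X = [[f] + s for f, s in zip(fert, soil)]

# Assumption: binary target = 1 when yield is above the median, else 0.
median_yield = statistics.median([r[2] for r in rows])
y = [1 if r[2] > median_yield else 0 for r in rows]

# Shuffled 80:20 train/test split.
idx = list(range(len(X)))
random.shuffle(idx)
cut = len(idx) * 8 // 10
X_train, y_train = [X[i] for i in idx[:cut]], [y[i] for i in idx[:cut]]
X_test, y_test = [X[i] for i in idx[cut:]], [y[i] for i in idx[cut:]]

w, b = train_logreg(X_train, y_train)
y_pred = [1 if score(x, w, b) >= 0.5 else 0 for x in X_test]
metrics = evaluate(y_test, y_pred)
print(metrics)
```

The sketch scales the features before splitting because that is the order the abstract describes; fitting the scaler on the training set alone would avoid leaking test-set ranges into training.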
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.