Predicting Mortality Risk in Chronic Kidney Failure Patients with a Voting Classifier and Random Forest on Imbalanced Data

Authors

  • Luthfatul Amaliana Universitas Brawijaya, Malang https://orcid.org/0000-0002-6624-891X
  • Ani Budi Astuti Universitas Brawijaya, Malang
  • Rossanda Sevia Gadis Universitas Brawijaya, Malang
  • Naurah Atikah Rabbani Universitas Brawijaya, Malang
  • Nabila Ayunda Sovia Universitas Brawijaya, Malang

DOI:

https://doi.org/10.25126/jtiik.124

Keywords:

Ensemble Machine Learning, Chronic Kidney Disease, Random Forest, Voting Classifier

Abstract

Chronic kidney disease (CKD) is a serious condition that can be fatal if it is not detected and treated early. This study aims to predict mortality risk in CKD patients using ensemble learning methods, namely random forest and a voting classifier (hard voting and soft voting). The voting classifier combines the predictions of several single classification models: hard voting decides by majority vote, while soft voting averages the predicted probabilities. The data are secondary data from RSUD Dr. Saiful Anwar, Malang City. The proportion of hospitalized patients who were discharged deceased is smaller than the proportion discharged alive, and this class imbalance tends to bias models toward the majority class. To mitigate this, the synthetic minority over-sampling technique (SMOTE) was applied to balance the class distribution. Random forest was also selected for its ability to handle data imbalance through weighting of its decision trees, which reduces bias toward the majority class. Model performance was evaluated using accuracy, precision, and recall. Random forest gave the best performance, with 77% accuracy, 36% precision, and 60% recall, outperforming both hard voting and soft voting. Using random forest together with SMOTE improved prediction of the minority class, which is essential for identifying patients at high risk of death. This approach can support early detection and better management of CKD patients, and thus has the potential to reduce mortality from the disease.
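The two voting rules and the three evaluation metrics named in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names `hard_vote`, `soft_vote`, and `evaluate` are hypothetical, the three base-model probabilities and the ten patient labels are invented, and the 0.5 threshold is a common default rather than a value reported by the study. The abstract does not name the base classifiers, so none are modeled here.

```python
from collections import Counter


def hard_vote(labels):
    """Return the label chosen by the most base models (majority vote)."""
    # With an odd number of models and binary labels there is no tie.
    return Counter(labels).most_common(1)[0][0]


def soft_vote(probabilities, threshold=0.5):
    """Average the positive-class probabilities, then threshold the mean."""
    mean_prob = sum(probabilities) / len(probabilities)
    return int(mean_prob >= threshold)


def evaluate(y_true, y_pred):
    """Return (accuracy, precision, recall) for the positive class, label 1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall


# Hypothetical death-class probabilities from three base models, one patient.
patient_probs = [0.45, 0.40, 0.95]
patient_labels = [int(p >= 0.5) for p in patient_probs]  # [0, 0, 1]

hard = hard_vote(patient_labels)  # two of three models predict 0
soft = soft_vote(patient_probs)   # the mean probability is about 0.6

# Hypothetical imbalanced sample: 2 deaths (1) among 10 patients.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
accuracy, precision, recall = evaluate(y_true, y_pred)

print(hard, soft, accuracy, precision, recall)
```

In this invented case the two rules disagree: hard voting follows the two models that predict survival, while soft voting is pulled toward the death class by the one highly confident model. The metric example shows why accuracy is reported alongside precision and recall on imbalanced data: accuracy stays high even though only half of the deaths are detected.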

Published

29-08-2025

Issue

Section

Computer Science

How to Cite

Prediksi Resiko Kematian Penderita Gagal Ginjal Kronis Dengan Voting Classifier Dan Random Forest Pada Data Tidak Seimbang. (2025). Jurnal Teknologi Informasi Dan Ilmu Komputer, 12(4), 859-866. https://doi.org/10.25126/jtiik.124