Implementation of the Random Forest Algorithm to Determine Raskin Aid Recipients

Authors

  • Ilham Kurniawan, Universitas Bina Sarana Informatika, Jakarta
  • Duwi Cahya Putri Buani, Universitas Nusa Mandiri, Jakarta Pusat
  • Abdussomad Abdussomad, Universitas Bina Sarana Informatika, Jakarta
  • Widya Apriliah, Universitas Bina Sarana Informatika, Jakarta
  • Rizal Amegia Saputra, Universitas Bina Sarana Informatika, Jakarta

DOI:

https://doi.org/10.25126/jtiik.20231026225

Abstract

Poverty is one of the fundamental concerns of every government. The Rice for Poor Families program (Beras Keluarga Miskin, Raskin) is one of the government's programs. The Raskin scheme aims to reduce the burden on poor households by providing assistance that improves food security through social protection. The purpose of this study is to identify the classification algorithm with the highest accuracy for predicting Raskin beneficiaries, using Python machine learning tools, and to implement the result through a website. Classification is a data mining method that assigns categories to groups of data to support more accurate prediction and analysis. Three machine learning classification algorithms, support vector machine (SVM), naive Bayes (NB), and random forest (RF), were used in this study to determine recipients of Raskin assistance. The experiments were carried out on the Raskin dataset of Gunungparang Village (Kelurahan Gunungparang), Sukabumi City, which was sourced from Gunungparang Village. The performance of the classification algorithms was evaluated with several metrics: precision, accuracy, F-measure, and recall. Accuracy is measured from the samples that are classified correctly and incorrectly. The results show that the random forest classifier achieved precision, recall, and F-measure values of 97%, an accuracy of 97.26%, and an ROC value of 0.970. This is better than the other classification algorithms, with a difference of 5.11% from the support vector machine and 8.87% from naive Bayes. Accuracy is a good reference for algorithm performance when the numbers of false negatives and false positives are close to each other. These results were confirmed accurately and systematically using the Receiver Operating Characteristic (ROC).
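The relationship between the evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the class name `ConfusionMatrix` and both sets of counts are hypothetical and are not taken from the paper or its dataset. The sketch only shows why accuracy is a more trustworthy summary when false positives and false negatives are of similar size.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Counts for a binary classifier (positive = household receives aid)."""

    tp: int  # recipients predicted as recipients
    fp: int  # non-recipients predicted as recipients
    fn: int  # recipients predicted as non-recipients
    tn: int  # non-recipients predicted as non-recipients

    def accuracy(self) -> float:
        total = self.tp + self.fp + self.fn + self.tn
        return (self.tp + self.tn) / total if total else 0.0

    def precision(self) -> float:
        predicted_pos = self.tp + self.fp
        return self.tp / predicted_pos if predicted_pos else 0.0

    def recall(self) -> float:
        actual_pos = self.tp + self.fn
        return self.tp / actual_pos if actual_pos else 0.0

    def f_measure(self) -> float:
        # Harmonic mean of precision and recall.
        p, r = self.precision(), self.recall()
        return 2 * p * r / (p + r) if (p + r) else 0.0


# Hypothetical counts for illustration only (NOT results from the paper).
balanced = ConfusionMatrix(tp=45, fp=5, fn=5, tn=45)  # FP and FN are equal
skewed = ConfusionMatrix(tp=5, fp=0, fn=10, tn=85)    # FP and FN differ

for name, cm in (("balanced", balanced), ("skewed", skewed)):
    print(
        f"{name}: accuracy={cm.accuracy():.2f} "
        f"precision={cm.precision():.2f} "
        f"recall={cm.recall():.2f} "
        f"f_measure={cm.f_measure():.2f}"
    )
```

Under these assumed counts both classifiers reach an accuracy of 0.90, but the F-measure drops from 0.90 to 0.50 in the skewed case, which is why accuracy alone is most informative when the two error counts are close.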

Published

14-04-2023

Issue

Section

Computer Science

How to Cite

Implementasi Algoritma Random Forest Untuk Menentukan Penerima Bantuan Raskin. (2023). Jurnal Teknologi Informasi Dan Ilmu Komputer, 10(2), 421-428. https://doi.org/10.25126/jtiik.20231026225