Improving the Performance of the SVM Method Using KNN Imputation and K-Means-SMOTE for Classifying Student Graduation at Bumigora University

Author

Hairani Hairani

Abstract

One of the main problems facing Bumigora University is the imbalance between the number of students admitted and the number who graduate on time, which may lower its accreditation assessment in the future, since the student graduation ratio is one of the indicators assessed in the accreditation process. Student graduation data are stored in the campus database but have not been fully utilized. Analyzing these data can reveal the patterns of students who do or do not graduate on time, which can help minimize the number of students who drop out. It also makes it easier for decision makers to set policies early to help students at risk of dropping out or of graduating late. The solution offered in this study is to use data mining techniques, specifically the SVM method. The purpose of this study is to improve the performance of the SVM method for classifying student graduation at Bumigora University using the KNN Imputation (KNNI) and K-Means-SMOTE methods. The study consists of several stages: collecting student graduation data; pre-processing, namely handling missing values with KNNI and handling class imbalance with K-Means-SMOTE; and classification with the SVM method. The final stage evaluates SVM performance in terms of accuracy, sensitivity, specificity, and F-measure. In the tests carried out, the integration of KNNI, K-Means-SMOTE, and SVM achieved an accuracy of 83.9%, a sensitivity of 81.3%, a specificity of 86.6%, and an F-measure of 83.5%. The use of KNNI and K-Means-SMOTE improved the performance of the SVM method on all four measures.
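The four evaluation measures named in the abstract can all be derived from a binary confusion matrix. The sketch below is illustrative only: the function name `binary_metrics` and the counts are hypothetical and are not taken from the study. The counts assume 1,000 samples per class (as might result after oversampling balances the classes) and were chosen so that the output is close to the percentages reported above. Which class the study treats as positive is also an assumption here.

```python
# Illustrative sketch: the counts below are hypothetical, not the study's data.
# Assumes a binary problem where "positive" is one of the two graduation classes.

def binary_metrics(tp: int, fn: int, tn: int, fp: int) -> dict[str, float]:
    """Compute accuracy, sensitivity, specificity, and F-measure
    from binary confusion-matrix counts (all denominators assumed nonzero)."""
    total = tp + fn + tn + fp
    precision = tp / (tp + fp)    # share of predicted positives that are correct
    sensitivity = tp / (tp + fn)  # recall on the positive class
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": sensitivity,
        "specificity": tn / (tn + fp),  # recall on the negative class
        # F-measure is the harmonic mean of precision and sensitivity
        "f_measure": 2 * precision * sensitivity / (precision + sensitivity),
    }


# Hypothetical balanced test set: 1,000 positives and 1,000 negatives.
metrics = binary_metrics(tp=813, fn=187, tn=866, fp=134)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With balanced classes, accuracy is simply the mean of sensitivity and specificity, which is consistent with the reported 83.9% lying midway between 81.3% and 86.6%.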

DOI: http://dx.doi.org/10.25126/jtiik.2021843428