Comparison of Naive Bayes and Neural Network Data Mining for Predicting the Study Duration of Undergraduate (S1) Students

Authors

Azahari Azahari, Yulindawati Yulindawati, Dewi Rosita, Syamsuddin Mallala

Abstract

University management needs graduation predictions to set preventive policies for the early prevention of drop-out cases. The length of each student's study period can be influenced by various factors. Using the Naive Bayes and neural network data mining algorithms, student graduation at STMIK Widya Cipta Dharma (WiCiDa) Samarinda can be predicted. The attributes used are age at admission, classification of the city of the student's senior high school, father's occupation, study program, class, number of siblings, and cumulative grade point average (GPA; Indeks Prestasi Kumulatif, IPK). Samples of students who graduated or dropped out between 2011 and 2019 were used as training and testing data, while the 2015 to 2018 cohorts were used as target data whose study period is to be predicted. The data cover 3229 students: 1769 as training data, 321 as testing data, and 1139 as target data. All data were taken from undergraduate (S1) students and exclude diploma (D3) and transfer students. On the testing data, Naive Bayes achieved an accuracy of only 57.63%. The results show many weaknesses in the Naive Bayes predictions because their accuracy is not very high. The neural network, by contrast, reached a prediction accuracy of 72.58%, so this alternative method is the better of the two. Evaluation and analysis were carried out to see where the study period predictions were correct and where they were wrong.
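As an illustration of the kind of Naive Bayes prediction the abstract describes, the following is a minimal sketch in Python, not the authors' implementation. It assumes categorical attributes and Laplace (add-one) smoothing. The class name `CategoricalNaiveBayes`, the helper `accuracy`, the attribute values, and the toy rows are all hypothetical and invented for illustration; they are not taken from the WiCiDa data.

```python
from collections import Counter, defaultdict
import math


class CategoricalNaiveBayes:
    """Naive Bayes for categorical attributes with Laplace (add-one) smoothing."""

    def fit(self, rows, labels):
        self.n = len(labels)
        self.class_counts = Counter(labels)
        # counts[(attribute_index, class_label)][value] -> number of occurrences
        self.counts = defaultdict(Counter)
        # values[attribute_index] -> set of values seen during training
        self.values = defaultdict(set)
        for row, label in zip(rows, labels):
            for i, value in enumerate(row):
                self.counts[(i, label)][value] += 1
                self.values[i].add(value)
        return self

    def predict(self, row):
        best_label, best_score = None, -math.inf
        for label, n_c in self.class_counts.items():
            # Log prior plus the sum of per-attribute log likelihoods;
            # attributes are assumed independent given the class.
            score = math.log(n_c / self.n)
            for i, value in enumerate(row):
                k = len(self.values[i])
                score += math.log((self.counts[(i, label)][value] + 1) / (n_c + k))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def accuracy(model, rows, labels):
    """Fraction of rows whose predicted label matches the true label."""
    correct = sum(model.predict(r) == y for r, y in zip(rows, labels))
    return correct / len(labels)


# Hypothetical toy data: (GPA band, study program, class type).
train_rows = [
    ("high", "program_a", "regular"),
    ("high", "program_b", "regular"),
    ("high", "program_a", "evening"),
    ("low", "program_b", "evening"),
    ("low", "program_a", "evening"),
    ("low", "program_b", "regular"),
]
train_labels = ["on_time", "on_time", "on_time", "late", "late", "late"]

test_rows = [("high", "program_b", "evening"), ("low", "program_a", "regular")]
test_labels = ["on_time", "late"]

model = CategoricalNaiveBayes().fit(train_rows, train_labels)
print(model.predict(("high", "program_a", "regular")))
print(accuracy(model, test_rows, test_labels))
```

Testing accuracy, as reported in the abstract, is the share of held-out records whose predicted class matches the known outcome, which is what the hypothetical `accuracy` helper computes. A numeric attribute such as age at admission would first need to be binned, or handled with a Gaussian likelihood, for this categorical sketch to apply.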


DOI: http://dx.doi.org/10.25126/jtiik.2020732093