Feature Selection with Particle Swarm Optimization for Parkinson's Disease Classification Using XGBoost

Authors

  • Deni Kurnia, Universitas Lambung Mangkurat, Banjarmasin
  • Muhammad Itqan Mazdadi, Universitas Lambung Mangkurat, Banjarmasin
  • Dwi Kartini, Universitas Lambung Mangkurat, Banjarmasin
  • Radityo Adi Nugroho, Universitas Lambung Mangkurat, Banjarmasin
  • Friska Abadi, Universitas Lambung Mangkurat, Banjarmasin

DOI:

https://doi.org/10.25126/jtiik.20231057252

Abstract

Parkinson's disease is a disorder of the central nervous system that affects the motor system. It is difficult to diagnose because its symptoms resemble those of other diseases. Machine learning models trained on patient voice recordings can now assist diagnosis. Extracting features from these recordings yields a relatively large number of features, so feature selection is needed to keep model performance from degrading. In this study, Particle Swarm Optimization (PSO) is used for feature selection and XGBoost is used as the classification model. SMOTE is also applied to address class imbalance, and the XGBoost hyperparameters are tuned to find optimal values. Test results show that, without SMOTE and hyperparameter tuning, the model with feature selection achieves an AUC of 0.9325, compared with 0.9250 for the model without feature selection. When SMOTE and hyperparameter tuning are used together, feature selection also improves performance: the model with feature selection achieves an AUC of 0.9483, compared with 0.9366 for the model without it.
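The abstract does not say which PSO variant was used, so the following is only a minimal sketch of one common choice, binary PSO with a sigmoid transfer function, in which each particle is a 0/1 mask over the features. The names `binary_pso_select` and `toy_fitness`, and the values of `w`, `c1`, and `c2`, are illustrative assumptions rather than the authors' settings. In a setting like the one described, the fitness function would presumably be the validation AUC of an XGBoost model trained on the masked features; a toy fitness stands in for that here so the example runs with the standard library alone.

```python
import math
import random


def binary_pso_select(fitness, n_features, n_particles=20, n_iters=50,
                      w=0.7, c1=1.5, c2=1.5, seed=0):
    """Binary PSO sketch: maximise `fitness` over 0/1 feature masks.

    Returns (best_mask, best_score). All defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Random initial masks (positions) and real-valued velocities.
    pos = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(n_particles)]
    vel = [[rng.uniform(-1.0, 1.0) for _ in range(n_features)]
           for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_score = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_score[i])
    gbest, gbest_score = pbest[g][:], pbest_score[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_features):
                r1, r2 = rng.random(), rng.random()
                # Standard velocity update: inertia + cognitive + social terms.
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-6.0, min(6.0, v))  # clamp to avoid saturation
                # Sigmoid transfer: velocity becomes the probability of a 1 bit.
                prob = 1.0 / (1.0 + math.exp(-vel[i][d]))
                pos[i][d] = 1 if rng.random() < prob else 0
            score = fitness(pos[i])
            if score > pbest_score[i]:
                pbest[i], pbest_score[i] = pos[i][:], score
                if score > gbest_score:
                    gbest, gbest_score = pos[i][:], score
    return gbest, gbest_score


# Hypothetical stand-in for a model-based fitness such as validation AUC:
# reward three "informative" features and lightly penalise every other one.
INFORMATIVE = {0, 3, 5}


def toy_fitness(mask):
    hits = sum(1 for j, bit in enumerate(mask) if bit and j in INFORMATIVE)
    extras = sum(1 for j, bit in enumerate(mask) if bit and j not in INFORMATIVE)
    return hits - 0.1 * extras


best_mask, best_score = binary_pso_select(toy_fitness, n_features=10)
print("selected features:", [j for j, bit in enumerate(best_mask) if bit])
print("fitness:", best_score)
```

Passing the fitness as a callable keeps the search independent of the classifier, so the same loop could wrap any model and metric, at the cost of one model evaluation per particle per iteration.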



Published

17-10-2023

Issue

Vol. 10 No. 5 (2023)

Section

Computer Science

How to Cite

Seleksi Fitur dengan Particle Swarm Optimization pada Klasifikasi Penyakit Parkinson Menggunakan XGBoost. (2023). Jurnal Teknologi Informasi Dan Ilmu Komputer, 10(5), 1083-1094. https://doi.org/10.25126/jtiik.20231057252