Performance Analysis of the Decision Tree, Naive Bayes, and K-Nearest Neighbor Algorithms for Classifying Covid-19 Risk Zones in Indonesia

Authors

  • Ainurohmah Ainurrohmah Universitas Negeri Semarang, Semarang
  • Dian Tri Wiyanti Universitas Negeri Semarang, Semarang

DOI:

https://doi.org/10.25126/jtiik.20231015935

Abstract (Indonesian version)

The Covid-19 pandemic has affected Indonesia. One of the government's responses has been to produce a Covid-19 risk map, which divides the country into zones by regency/city. These risk zones serve as the government's reference when setting policy for each region. The government determines the zones by weighting 15 indicators. On several occasions, changes to the Covid-19 risk zones were published late on the website. Classification offers an alternative way to determine the risk zones, so that zone changes can be made quickly and efficiently. Many classification algorithms exist, each with its own strengths and weaknesses. Decision Tree, Naïve Bayes, and K-Nearest Neighbor are classification algorithms that offer good accuracy with relatively short running times. This study aims to measure the performance of each algorithm, identify the best algorithm, and obtain the classification pattern produced by the best algorithm. The method uses 10-fold cross-validation to split the data and a confusion matrix to assess performance. The software used is RapidMiner and WEKA. The results show that all algorithms perform well, with performance values above 70%, and that none of them needs a long time to build a model. The best performance was obtained with the Decision Tree algorithm in WEKA, with a performance value of 88% and a time of 0.32 seconds. The classification pattern of the best algorithm yields 77 rules that divide the data into three zones: low, medium, and high. The attributes that influence the classification of Covid-19 risk zones are active, CR, CFR, incidence rate, positive, and death.
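The evaluation procedure described above (10-fold cross-validation, a confusion matrix, and model time) can be sketched in plain Python. This is only an illustration under stated assumptions, not the authors' RapidMiner/WEKA workflow: the data are synthetic, the two features are hypothetical stand-ins for the study's attributes, and only simplified K-Nearest Neighbor and Gaussian Naïve Bayes classifiers are included (a decision tree would plug into `cross_validate` in the same way).

```python
import math
import random
import time
from collections import Counter

LABELS = ["low", "medium", "high"]

def make_synthetic_data(n_per_class=40, seed=0):
    """Hypothetical stand-in for the regency/city data: two numeric features per row."""
    rng = random.Random(seed)
    centers = {"low": (1.0, 1.0), "medium": (4.0, 4.0), "high": (7.0, 7.0)}
    rows = []
    for label, (cx, cy) in centers.items():
        for _ in range(n_per_class):
            rows.append(((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)), label))
    rng.shuffle(rows)
    return rows

def knn_predict(train, x, k=5):
    """Majority vote among the k training rows closest to x (Euclidean distance)."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def gaussian_nb_predict(train, x):
    """Gaussian Naive Bayes: pick the class with the highest log posterior."""
    best_label, best_score = None, -math.inf
    for label in LABELS:
        members = [features for features, lab in train if lab == label]
        if not members:
            continue
        score = math.log(len(members) / len(train))  # log prior
        for j, value in enumerate(x):
            column = [m[j] for m in members]
            mean = sum(column) / len(column)
            var = sum((v - mean) ** 2 for v in column) / len(column) + 1e-9
            score += -0.5 * math.log(2 * math.pi * var) - (value - mean) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def cross_validate(rows, predict, n_folds=10):
    """Return (confusion matrix, accuracy, seconds) pooled over n_folds folds."""
    matrix = {actual: {pred: 0 for pred in LABELS} for actual in LABELS}
    start = time.perf_counter()
    for fold in range(n_folds):
        # Every n_folds-th row is held out for testing; the rest is used for training.
        test = rows[fold::n_folds]
        train = [row for i, row in enumerate(rows) if i % n_folds != fold]
        for x, actual in test:
            matrix[actual][predict(train, x)] += 1
    elapsed = time.perf_counter() - start
    correct = sum(matrix[label][label] for label in LABELS)
    return matrix, correct / len(rows), elapsed

data = make_synthetic_data()
results = {
    "k-nearest neighbor": cross_validate(data, knn_predict),
    "naive bayes": cross_validate(data, gaussian_nb_predict),
}
for name, (matrix, accuracy, seconds) in results.items():
    print(f"{name}: accuracy={accuracy:.2%}, time={seconds:.3f}s")
    for actual in LABELS:
        print(f"  actual {actual:>6}: {[matrix[actual][p] for p in LABELS]}")
```

Each row of the printed matrix is an actual zone and each column a predicted zone, so the diagonal counts the correct predictions. The accuracy here is the diagonal total divided by the number of rows; the abstract does not say which measure its "performance value" refers to.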

 

Abstract

The Covid-19 pandemic has affected Indonesia. One of the government's measures for handling Covid-19 is a Covid-19 risk map, which divides the country into zones by regency/city. The Covid-19 risk zones serve as the government's benchmark when setting policy for each region. The government uses a weighting of 15 indicators to determine the zones. On several occasions, changes to the Covid-19 risk zones on the website have been delayed. Classification can be an alternative way to determine the Covid-19 risk zones, so that zone changes can be made quickly and efficiently. Many algorithms can be used for classification. Decision Tree, K-Nearest Neighbor, and Naïve Bayes are classification algorithms that offer good accuracy with relatively short running times. The purpose of this study is to measure the performance of each algorithm, identify the best algorithm, and obtain the classification pattern from the best algorithm. The research method uses 10-fold cross-validation to split the data and a confusion matrix to assess performance. The software used is RapidMiner. The results show that all algorithms have good performance values, all above 70%, and that none of them requires a long time for modeling. The best performance value is obtained using the Decision Tree algorithm. The classification pattern of the best algorithm produces 20 rules that divide the data into three classification zones: low, medium, and high. The attributes that influence the classification of the Covid-19 risk zones are active, CR, CFR, incidence rate, positive, and death.
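The "classification pattern" of a decision tree can be read as a rule set: every root-to-leaf path is one IF ... THEN rule, so the number of rules equals the number of leaves. The sketch below shows that idea with a small Gini-based tree written in plain Python. It is an assumption-laden illustration, not the tree learner used in the study: the rows, the feature names `active` and `cfr`, and the values are hypothetical toy data.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())

def best_split(rows):
    """Find the (feature, threshold) with the lowest weighted Gini impurity, or None."""
    best, best_score = None, gini([label for _, label in rows])
    for name in rows[0][0]:
        for threshold in sorted({features[name] for features, _ in rows}):
            left = [label for features, label in rows if features[name] <= threshold]
            right = [label for features, label in rows if features[name] > threshold]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if score < best_score:  # only accept a strict improvement
                best, best_score = (name, threshold), score
    return best

def build_tree(rows, max_depth=3):
    """Return a leaf label (str) or a nested (feature, threshold, left, right) tuple."""
    split = best_split(rows) if max_depth > 0 else None
    if split is None:
        return Counter(label for _, label in rows).most_common(1)[0][0]
    name, threshold = split
    left = [row for row in rows if row[0][name] <= threshold]
    right = [row for row in rows if row[0][name] > threshold]
    return (name, threshold, build_tree(left, max_depth - 1), build_tree(right, max_depth - 1))

def extract_rules(tree, conditions=()):
    """Turn every root-to-leaf path into one IF ... THEN rule."""
    if isinstance(tree, str):
        return [f"IF {' AND '.join(conditions) or 'TRUE'} THEN zone = {tree}"]
    name, threshold, left, right = tree
    return (extract_rules(left, conditions + (f"{name} <= {threshold}",))
            + extract_rules(right, conditions + (f"{name} > {threshold}",)))

# Hypothetical toy rows, NOT the study's data: (features, zone label).
rows = [
    ({"active": 5, "cfr": 1.0}, "low"),
    ({"active": 8, "cfr": 1.5}, "low"),
    ({"active": 40, "cfr": 2.0}, "medium"),
    ({"active": 55, "cfr": 2.5}, "medium"),
    ({"active": 200, "cfr": 4.0}, "high"),
    ({"active": 350, "cfr": 5.5}, "high"),
]
tree = build_tree(rows)
rules = extract_rules(tree)
for rule in rules:
    print(rule)
```

On these toy rows the tree has three leaves, so three rules are printed, one per zone. A rule count such as the one reported in the abstract would come from counting the leaves of the fitted tree in the same way.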


 

 

Published

28-02-2023

Issue

Vol. 10 No. 1 (2023)

Section

Computer Science

How to Cite

Analisis Performa Algoritma Decision Tree, Naive Bayes, K-Nearest Neighbor untuk Klasifikasi Zona Daerah Risiko Covid-19 di Indonesia. (2023). Jurnal Teknologi Informasi Dan Ilmu Komputer, 10(1), 115-122. https://doi.org/10.25126/jtiik.20231015935