Performance Analysis of a Random Forest-Based Intrusion Detection System Using the Unbalanced Honeynet BSSN Dataset

Authors

DOI:

https://doi.org/10.25126/jtiik.1148911

Keywords:

Random Forest, Intrusion Detection System, CIC-ToN-IoT Dataset, Honeynet BSSN Dataset

Abstract

As technology and information systems continue to develop, cyber threats increase as well. In 2023, Indonesia ranked first among countries as a source of attacks. To address this problem, Intrusion Detection Systems (IDS) are deployed in various government systems, in collaboration with Honeynet BSSN. However, this IDS does not perform optimally at detecting new types of attacks that have never occurred before (zero-day). One way to improve IDS performance is to use machine learning. This research proposes an IDS design based on the random forest algorithm, using the CIC-ToN-IoT dataset as the whitelist dataset and the Honeynet BSSN dataset as the blacklist dataset. The model distinguishes 10 (ten) classes: Benign, Information Leak, Malware, Trojan Activity, Information Gathering, APT, Exploit, Web Application Attack, Denial of Service (DoS), and other attack types (other). The results show that the machine-learning-based IDS model achieves an average accuracy above 90%, with a precision of 91%, a recall of 90%, and an F1-score of 90%. Classes with a large support (number of samples) have much better precision than classes with a smaller support. The resulting model can therefore effectively analyze the various attacks on information systems in the government environment, particularly for classes with a large amount of data.
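The relationship between class support and the reported metrics can be illustrated with a small worked example. The sketch below is not the authors' code and uses made-up labels rather than results from the paper; the helper names `per_class_report`, `weighted_average`, and `macro_average` are hypothetical. It computes per-class precision, recall, F1-score, and support from true and predicted labels, then compares the support-weighted average with the unweighted (macro) average. The abstract does not state which averaging the reported figures use, so both are shown.

```python
from collections import Counter


def per_class_report(y_true, y_pred):
    """Return {label: (precision, recall, f1, support)} for every label seen."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    support = Counter(y_true)  # number of true samples per class
    report = {}
    for label in labels:
        predicted = tp[label] + fp[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / support[label] if support[label] else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = (precision, recall, f1, support[label])
    return report


def weighted_average(report, index):
    """Average one metric column, weighting each class by its support."""
    total = sum(row[3] for row in report.values())
    return sum(row[index] * row[3] for row in report.values()) / total


def macro_average(report, index):
    """Average one metric column, giving every class equal weight."""
    return sum(row[index] for row in report.values()) / len(report)


# Toy, made-up labels: 8 "Benign" samples and 2 "APT" samples (imbalanced).
y_true = ["Benign"] * 8 + ["APT"] * 2
y_pred = ["Benign"] * 7 + ["APT"] + ["APT", "Benign"]

report = per_class_report(y_true, y_pred)
for label, (precision, recall, f1, support) in report.items():
    print(f"{label:8s} precision={precision:.3f} recall={recall:.3f} "
          f"f1={f1:.3f} support={support}")
print("weighted precision:", weighted_average(report, 0))
print("macro precision:   ", macro_average(report, 0))
```

On this toy data the majority class dominates the weighted precision (0.8), while the macro precision (0.6875) exposes the weaker minority class. This is why per-class figures are worth reading alongside averages on an unbalanced dataset.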

Published

26-08-2024

Issue

Section

Computer Science

How to Cite

Analisis Kinerja Intrusion Detection System Berbasis Algoritma Random Forest Menggunakan Dataset Unbalanced Honeynet BSSN. (2024). Jurnal Teknologi Informasi Dan Ilmu Komputer, 11(4), 867-876. https://doi.org/10.25126/jtiik.1148911