Comparison of ANN, Random Forest, and XGBoost for Antibiotic Classification with Sampling Methods

Authors

  • Edy Saputra Rusdi Universitas Hasanuddin, Makasar https://orcid.org/0000-0001-5714-9111
  • A. Muh. Amil Siddik Universitas Hasanuddin, Makasar
  • Naimah Aris Universitas Hasanuddin, Makasar
  • Muhammad Ardiansyah Asrifah Universitas Hasanuddin, Makasar
  • Nur Hilal A. Syahrir Universitas Sulawesi Barat, Majene
  • Aidawayati Rangkuti Universitas Hasanuddin, Makasar
  • Wahyudi Rusdi IAIN Sultan Amai Gorontalo, Gorontalo

DOI:

https://doi.org/10.25126/jtiik.124

Abstract

Many potential drugs have been discovered from marine natural products (MNPs), which makes marine compounds an important source for drug discovery and development. Although many marine compounds show some biological activity, only a few have been recorded as antibacterial, so identifying antibacterial candidates from marine organisms remains a challenge. This study uses a computational approach to find antibacterial compounds with drug potential among marine natural products. It applies Artificial Neural Network (ANN), Random Forest (RF), and XGBoost models to classify compounds based on the chemical similarity between marine natural products from Indonesia and antibacterial compounds. To address class imbalance, two resampling techniques were used: SMOTE and undersampling (US). XGBoost + SMOTE achieved the highest accuracy, 98.89%, ahead of ANN + SMOTE (98.67%) and RF + SMOTE (98.59%), as well as the ANN (97.57%) and RF (97.06%) models. Undersampling, in contrast, reduced accuracy substantially: XGBoost + US, RF + US, and ANN + US reached only 91.12%, 91.59%, and 87.85%, respectively. Of the 73 marine biota compounds, only the compound with CID 101767277 was predicted to be a potential antibacterial.
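The two resampling strategies named in the abstract can be illustrated with a minimal pure-Python sketch. This is not the authors' pipeline: the toy 2-D points, the function names `smote` and `random_undersample`, and the parameters `k`, `n_new`, and `seed` are all assumptions made for illustration. The SMOTE-style function interpolates between a minority sample and one of its nearest minority neighbours, while undersampling randomly discards majority samples until the classes match in size.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def smote(minority, n_new, k=2, seed=0):
    """SMOTE-style oversampling (illustrative sketch).

    Each synthetic point lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    Assumes the minority class has at least two samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: squared_distance(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


def random_undersample(majority, minority, seed=0):
    """Randomly keep only as many majority samples as there are minority samples."""
    rng = random.Random(seed)
    kept = rng.sample(majority, k=len(minority))
    return kept, list(minority)


# Hypothetical toy data: 10 majority points and 3 minority points.
majority = [(float(i), float(i % 3)) for i in range(10)]
minority = [(20.0, 20.0), (21.0, 20.5), (20.5, 21.0)]

# Oversample the minority class up to the majority class size.
balanced_up = minority + smote(minority, n_new=len(majority) - len(minority))

# Undersample the majority class down to the minority class size.
kept_majority, kept_minority = random_undersample(majority, minority)

print(len(balanced_up), len(kept_majority))
```

The sketch also hints at why undersampling can cost accuracy, as the abstract reports: it discards most of the majority-class examples, whereas SMOTE keeps every original sample and only adds interpolated ones.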

Published

29-08-2025

Section

Computer Science

How to Cite

Perbandingan ANN, Random Forest, dan XGBoost dalam Klasifikasi Antibiotik dengan Penerapan metode Sampling. (2025). Jurnal Teknologi Informasi Dan Ilmu Komputer, 12(4), 943-950. https://doi.org/10.25126/jtiik.124