Optimization of the Extreme Gradient Boosting Model for Determining Risk Levels in Pregnant Women Based on Bayesian Optimization (BOXGB)

Authors

  • Edi Jaya Kusuma, Universitas Dian Nuswantoro, Semarang
  • Ririn Nurmandhani, Universitas Dian Nuswantoro, Semarang
  • Lenci Aryani, Universitas Dian Nuswantoro, Semarang
  • Ika Pantiawati, Universitas Dian Nuswantoro, Semarang
  • Guruh Fajar Shidik, Universitas Dian Nuswantoro, Semarang

DOI:

https://doi.org/10.25126/jtiik.20251219001

Keywords:

bayesian optimization, hyper-parameter, machine learning, maternal risk level

Abstract

Pregnancy carries a range of risks, such as preeclampsia, gestational diabetes, and gestational hypertension. With advances in technology and data science, machine learning has been widely applied in early diagnosis systems that predict pregnancy risk levels. A persistent challenge, however, is finding a parameter configuration that allows a machine learning model to deliver good prediction accuracy. This research proposes a Bayesian Optimization (BO) method to tune the hyper-parameters of Decision Tree (DT) and Extreme Gradient Boosting (XGB) models. The two optimized models were trained and tested on a maternal risk dataset obtained from clinical measurements of pregnant women. The evaluation shows that the number of iterations strongly influences BO performance. Applying BO to the DT model (BODT) yields a slight performance improvement over previous research. The highest performance is achieved by the combination of BO and XGB (BOXGB), which reaches an accuracy of 87%, with average precision, recall, and F1-score of 88%, 87%, and 88%, respectively. Overall, BO is able to find hyper-parameter settings that improve machine learning performance, particularly in predicting maternal risk levels from clinical measurement data.
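The abstract does not state which surrogate model, acquisition function, or search space was used. The following is therefore only a minimal sketch of the general idea behind Bayesian hyper-parameter tuning, assuming a Gaussian-process surrogate with an expected-improvement acquisition over a single normalized hyper-parameter. The function `toy_validation_accuracy` is a hypothetical stand-in for "train a DT or XGB model with this setting and measure validation accuracy"; its shape and values are arbitrary and are not results from the paper.

```python
import math

def rbf(a, b, length=0.2):
    """Squared-exponential kernel between two scalar inputs."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(max(A[i][i] - s, 1e-12))
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def solve_lower(L, b):
    """Forward substitution: solve L z = b."""
    z = [0.0] * len(b)
    for i in range(len(b)):
        z[i] = (b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]
    return z

def solve_upper(L, z):
    """Back substitution: solve L^T x = z."""
    n = len(z)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_posterior(xs, ys, x_new, jitter=1e-6):
    """Posterior mean and standard deviation of the surrogate at x_new."""
    mu = sum(ys) / len(ys)  # centre targets so a zero-mean prior is reasonable
    K = [[rbf(a, b) + (jitter if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    L = cholesky(K)
    alpha = solve_upper(L, solve_lower(L, [y - mu for y in ys]))
    k_star = [rbf(x_new, a) for a in xs]
    mean = mu + sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(L, k_star)
    var = max(1.0 - sum(t * t for t in v), 1e-12)  # rbf(x, x) == 1
    return mean, math.sqrt(var)

def expected_improvement(mean, std, best):
    """Expected improvement over the best score so far (maximization)."""
    z = (mean - best) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mean - best) * cdf + std * pdf

def bayes_opt(objective, candidates, n_init=3, n_iter=10):
    """Pick candidates one at a time by maximizing expected improvement."""
    step = max(1, (len(candidates) - 1) // (n_init - 1))
    xs = candidates[::step][:n_init]          # evenly spread initial design
    ys = [objective(x) for x in xs]
    for _ in range(n_iter):
        remaining = [c for c in candidates if c not in xs]
        if not remaining:
            break
        best = max(ys)
        scores = [expected_improvement(*gp_posterior(xs, ys, c), best)
                  for c in remaining]
        x_next = remaining[scores.index(max(scores))]
        xs.append(x_next)
        ys.append(objective(x_next))
    i = ys.index(max(ys))
    return xs[i], ys[i], xs

def toy_validation_accuracy(x):
    """Hypothetical objective; a real one would train and score a model."""
    return 0.8 - (x - 0.3) ** 2

grid = [i / 40 for i in range(41)]            # normalized hyper-parameter values
best_x, best_y, tried = bayes_opt(toy_validation_accuracy, grid)
print(f"best setting {best_x:.3f} with score {best_y:.4f} "
      f"after {len(tried)} evaluations")
```

The `n_iter` argument plays the role of the iteration count that the abstract reports as influential: each extra iteration buys one more model evaluation, chosen where the surrogate expects the largest gain. In practice the objective would wrap model training, and the search space would cover several hyper-parameters rather than one.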

Published

27-02-2025

Issue

Section

Computer Science

How to Cite

Optimasi Model Extreme Gradient Boosting Dalam Upaya Penentuan Tingkat Risiko Pada Ibu Hamil Berbasis Bayesian Optimization (BOXGB). (2025). Jurnal Teknologi Informasi Dan Ilmu Komputer, 12(1), 111-120. https://doi.org/10.25126/jtiik.20251219001