Multi-Criteria Decision Making in Ensemble Feature Selection for Software Defect Prediction

Muhammad Fikri; Rudy Herteno; Radityo Adi Nugroho; Setyo Wahyu Saputro; Friska Abadi

doi:10.25126/jtiik.2025125

Authors

Muhammad Fikri, Universitas Lambung Mangkurat, Kalimantan Selatan
Rudy Herteno, Universitas Lambung Mangkurat, Kalimantan Selatan
Radityo Adi Nugroho, Universitas Lambung Mangkurat, Kalimantan Selatan
Setyo Wahyu Saputro, Universitas Lambung Mangkurat, Kalimantan Selatan
Friska Abadi, Universitas Lambung Mangkurat, Kalimantan Selatan

DOI:

https://doi.org/10.25126/jtiik.2025125

Keywords:

software defect prediction, ensemble feature selection, MCDM, random forest

Abstract

Software defect prediction aims to improve product quality by identifying potentially defective modules early. Prediction performance depends on feature selection, because redundant and irrelevant features can degrade model learning. Ensemble feature selection is considered effective at selecting relevant features because it combines several filter-based feature selection methods, but it needs an integration mechanism to unify the results of the four filter techniques used here: Mutual Information, Fisher Score, Uncertainty, and Relief. This study compares four Multi-Criteria Decision Making (MCDM) methods, namely TOPSIS, VIKOR, EDAS, and WASPAS, each of which ranks features by the relevance values produced by those filters. The top ten features from each method are then evaluated with a Random Forest model, using AUC as the metric under K-Fold cross-validation. Across the 12 NASA MDP datasets tested, TOPSIS showed the best and most consistent performance, with an average AUC of 0.8038. These findings underline the importance of choosing an appropriate integration method for improving the accuracy of software defect prediction, and they offer guidance for developing more effective models.
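To make the integration step concrete, the following is a minimal sketch of how TOPSIS could turn per-filter relevance scores into a single feature ranking. It is not the authors' implementation: the abstract does not state the normalisation or the criterion weights, so this sketch assumes vector normalisation, equal weights, and that a higher score is better on every filter. The function name `topsis_rank` and the values in `toy_scores` are hypothetical and are not data from the study.

```python
import math


def topsis_rank(scores, weights=None):
    """Rank features (rows) from filter scores (columns) with TOPSIS.

    Assumptions (not taken from the paper): every criterion is a benefit
    criterion, columns are vector-normalised, and weights default to equal.
    Returns (closeness, order), where order lists row indices best-first.
    """
    n_criteria = len(scores[0])
    if weights is None:
        weights = [1.0 / n_criteria] * n_criteria

    # Euclidean norm of each column; fall back to 1.0 for an all-zero column.
    norms = [
        math.sqrt(sum(row[j] ** 2 for row in scores)) or 1.0
        for j in range(n_criteria)
    ]
    weighted = [
        [weights[j] * row[j] / norms[j] for j in range(n_criteria)]
        for row in scores
    ]

    # Ideal and anti-ideal solutions: best and worst value per criterion.
    ideal = [max(row[j] for row in weighted) for j in range(n_criteria)]
    anti_ideal = [min(row[j] for row in weighted) for j in range(n_criteria)]

    closeness = []
    for row in weighted:
        d_pos = math.dist(row, ideal)       # distance to the ideal solution
        d_neg = math.dist(row, anti_ideal)  # distance to the anti-ideal solution
        total = d_pos + d_neg
        closeness.append(d_neg / total if total > 0 else 0.0)

    # Higher closeness means closer to the ideal, so sort in descending order.
    order = sorted(range(len(scores)), key=lambda i: -closeness[i])
    return closeness, order


# Hypothetical scores: rows are features, columns are the four filters
# (Mutual Information, Fisher Score, Uncertainty, Relief).
toy_scores = [
    [0.90, 0.80, 0.70, 0.90],  # feature 0: highest on every filter
    [0.10, 0.20, 0.10, 0.10],  # feature 1: lowest on every filter
    [0.50, 0.50, 0.40, 0.60],
    [0.30, 0.70, 0.20, 0.40],
]

closeness, order = topsis_rank(toy_scores)
top_k = order[:2]  # the study keeps the top ten; two is enough for this toy case
```

In the workflow the abstract describes, the selected column indices would then be passed to a Random Forest classifier and scored by AUC under K-Fold cross-validation. VIKOR, EDAS, and WASPAS would replace only the ranking function.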
