Multi-Criteria Decision Making in Ensemble Feature Selection for Software Defect Prediction
DOI:
https://doi.org/10.25126/jtiik.2025125
Keywords:
software defect prediction, ensemble feature selection, MCDM, random forest
Abstract
Software defect prediction improves product quality by identifying potentially defective modules early. Prediction performance depends on feature selection, because redundant and irrelevant features can degrade model learning. Ensemble feature selection is considered effective at selecting relevant features because it combines several filter-based feature selection methods, but it needs an integration mechanism to unify their results. In this study the results come from four filter techniques: Mutual Information, Fisher Score, Uncertainty, and Relief. The study compares four Multi-Criteria Decision Making (MCDM) methods, namely TOPSIS, VIKOR, EDAS, and WASPAS, each of which ranks features by the relevance scores that the filters produce. The top ten features from each method are then evaluated with a Random Forest model, using the AUC metric under K-Fold cross-validation. Across the 12 NASA MDP datasets tested, TOPSIS showed the most consistent and best performance, with an average AUC of 0.8038. These findings underline the importance of choosing an appropriate integration method for improving the accuracy of software defect prediction, and they offer guidance for developing more effective models.
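The integration step described in the abstract can be illustrated with a minimal TOPSIS sketch in plain Python. It is not the authors' implementation: the feature names, the score values, the equal criterion weights, and the helper name `topsis_rank` are all assumptions made for illustration, and every filter score is treated as a benefit criterion (higher means more relevant). The study keeps the top ten features; the toy example keeps three because it only has five.

```python
import math


def topsis_rank(scores, weights=None):
    """Rank features with TOPSIS (hypothetical helper, not from the paper).

    scores: one row per feature, one column per filter method; every column
            is treated as a benefit criterion (higher = more relevant).
    Returns (closeness, order), where order lists feature indices best-first.
    """
    n_criteria = len(scores[0])
    if weights is None:
        # Assumption: all four filters are weighted equally.
        weights = [1.0 / n_criteria] * n_criteria

    # Step 1: vector-normalise each column, then apply the criterion weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in scores)) or 1.0
             for j in range(n_criteria)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_criteria)]
                for row in scores]

    # Step 2: ideal best and ideal worst value for each criterion.
    best = [max(col) for col in zip(*weighted)]
    worst = [min(col) for col in zip(*weighted)]

    # Step 3: Euclidean distance to both ideals, then relative closeness.
    closeness = []
    for row in weighted:
        d_best = math.dist(row, best)
        d_worst = math.dist(row, worst)
        total = d_best + d_worst
        closeness.append(d_worst / total if total else 0.0)

    # Step 4: sort feature indices by closeness, highest first.
    order = sorted(range(len(scores)), key=lambda i: closeness[i], reverse=True)
    return closeness, order


# Hypothetical feature names and made-up relevance scores.
# Columns: Mutual Information, Fisher Score, Uncertainty, Relief.
feature_names = ["loc", "comment_lines", "branch_count", "operands", "operators"]
filter_scores = [
    [0.90, 0.80, 0.70, 0.60],
    [0.10, 0.05, 0.10, 0.02],
    [0.50, 0.40, 0.30, 0.20],
    [0.20, 0.60, 0.15, 0.50],
    [0.40, 0.10, 0.60, 0.05],
]

closeness, order = topsis_rank(filter_scores)
top_k = [feature_names[i] for i in order[:3]]  # the study uses the top ten
print(top_k)
```

The selected feature subset would then be passed to a classifier such as Random Forest and scored with AUC under K-Fold cross-validation, which this sketch does not cover.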
Copyright (c) 2025 Jurnal Teknologi Informasi dan Ilmu Komputer

This article is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License (CC BY-SA 4.0).
