Application of Random Oversampling and Principal Component Analysis to Improve the Accuracy of Corporate Bankruptcy Prediction in Indonesia with Machine Learning Models
DOI:
https://doi.org/10.25126/jtiik.2025125
Keywords:
bankruptcy prediction, machine learning, random oversampling, PCA
Abstract
Bankruptcy prediction is crucial for providing early warnings to management and stakeholders to take preventive actions. This study examines the application of Random Oversampling and Principal Component Analysis (PCA) in machine learning models to improve the accuracy of corporate bankruptcy prediction. The study uses two datasets: the Taiwanese Bankruptcy Prediction data from the UCI Machine Learning Repository (6,891 data points) and primary data on Indonesian company bankruptcies from the Indonesia Stock Exchange (IDX) for 2021–2023 (2,703 data points), totaling 9,594 data points. Four classification algorithms—K-Nearest Neighbors (KNN), Naïve Bayes, Support Vector Machine (SVM), and Decision Tree—were tested before and after applying these methods. The results show that the combination of PCA and Random Oversampling significantly improved the recall of the minority class (bankruptcy). SVM emerged as the best-performing algorithm with a precision of 0.86, recall of 0.76, and F1-score of 0.80, while the Decision Tree experienced overfitting after oversampling. PCA successfully reduced the dataset’s dimensions while retaining 98.87% of the variance, and Random Oversampling balanced the class distribution.
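The class-balancing step and the minority-class metrics reported above can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the function names `random_oversample` and `minority_metrics`, the toy data, and the predictions are all hypothetical, and the PCA and classifier stages are left out because they would normally come from a library such as scikit-learn.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of each smaller class until every
    class has as many samples as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))  # sampling with replacement
            out_labels.append(cls)
    return out_rows, out_labels


def minority_metrics(y_true, y_pred, positive=1):
    """Precision, recall and F1 for the positive (bankrupt) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy training data (hypothetical): 1 = bankrupt (minority), 0 = healthy.
X_train = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.9]]
y_train = [0, 0, 0, 0, 0, 1]

X_bal, y_bal = random_oversample(X_train, y_train, seed=42)
balanced_counts = Counter(y_bal)  # each class now has 5 samples

# Hypothetical held-out labels and predictions, only to exercise the metrics.
y_test = [1, 1, 1, 1, 0, 0, 0, 0]
y_hat = [1, 1, 1, 0, 1, 0, 0, 0]
precision, recall, f1 = minority_metrics(y_test, y_hat)
```

In a sketch like this, oversampling is applied to the training split only. Duplicated minority rows that leak into the test split would inflate the measured recall, and exact duplicates are also a plausible reason for a tree-based model to overfit after oversampling.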
License
Copyright (c) 2025 Jurnal Teknologi Informasi dan Ilmu Komputer

This article is licensed under the Creative Commons Attribution-ShareAlike 4.0 International License (CC BY-SA 4.0).