Implementation of the CatBoost Algorithm and Shapley Additive Explanations (SHAP) for Predicting the Popularity of Indie Games on the Steam Platform

Authors

  • Mohammad Teddy Syamkalla Institut Teknologi Telkom Purwokerto, Purwokerto
  • Siti Khomsah Institut Teknologi Telkom Purwokerto, Purwokerto
  • Yohani Setya Rafika Nur Institut Teknologi Telkom Purwokerto, Purwokerto

DOI:

https://doi.org/10.25126/jtiik.1148503

Keywords:

Prediction, indie game popularity, Shapley Additive Explanations, SHAP, CatBoost, Steam

Abstract

The growing popularity of indie games means that indie developers must compete to attract players and raise their games' chances of becoming popular. Previous research used logistic regression and random forest algorithms to predict the popularity of indie games on the Steam platform, but the resulting models still performed poorly. Those studies also gave developers little insight into what influences popularity. Because the indie game data collected from Steam for this study is categorical and non-linear, this study instead uses the CatBoost algorithm, which several other studies have reported to handle categorical and non-linear data better. The Shapley Additive Explanations (SHAP) method is also used to interpret the contribution and influence of each feature on the prediction results. Evaluated on indie game data scraped from Steam, comprising 52,627 rows and 11 features, the CatBoost model achieves 81% accuracy, a precision of 0.83, a recall of 0.77, and an F1-score of 0.80, indicating a balanced ability to distinguish the popularity classes. This is supported by an AUC of 0.88, with the curve approaching a right-angle shape. SHAP reveals how individual features influence the predictions: the Steam Trading Cards category, the RPG genre, and compatibility with the Mac operating system increase predicted popularity, as do higher prices and larger numbers of achievements, whereas the casual genre reduces it. These results are expected to help indie developers identify the factors likely to affect the popularity of their games.
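The two quantitative ideas in the abstract can be illustrated with a small, self-contained sketch. It first checks that the reported precision (0.83) and recall (0.77) are consistent with the reported F1-score (0.80), then computes exact Shapley values for a toy scoring function by averaging each feature's marginal contribution over all feature orderings. The scoring function, its weights, and the feature names (`trading_cards`, `rpg`, `casual`) are hypothetical and chosen only to mirror the direction of the effects described above; this is not the authors' CatBoost model, and it does not use the SHAP library, which relies on faster model-specific algorithms rather than brute-force enumeration.

```python
from itertools import permutations


def f1_score(precision, recall):
    """Return the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def shapley_values(predict, instance, baseline):
    """Exact Shapley values for one instance.

    Features start at their baseline values and are switched to the
    instance's values one at a time; each feature's marginal change in
    the prediction is averaged over every possible ordering.
    """
    names = list(instance)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)
        previous = predict(current)
        for name in order:
            current[name] = instance[name]
            value = predict(current)
            totals[name] += value - previous
            previous = value
    return {name: total / len(orderings) for name, total in totals.items()}


def toy_score(game):
    """Hypothetical popularity score (illustrative weights, not fitted)."""
    score = 0.20
    score += 0.15 * game["trading_cards"]
    score += 0.10 * game["rpg"]
    score -= 0.12 * game["casual"]
    # Interaction term: its effect is split evenly between both features.
    score += 0.05 * game["trading_cards"] * game["rpg"]
    return score


# Consistency check on the metrics reported in the abstract.
reported_f1 = f1_score(0.83, 0.77)

# Explain one hypothetical game against an "all features absent" baseline.
game = {"trading_cards": 1, "rpg": 1, "casual": 1}
baseline = {"trading_cards": 0, "rpg": 0, "casual": 0}
phi = shapley_values(toy_score, game, baseline)

print(f"F1 from precision and recall: {reported_f1:.4f}")
for name, contribution in phi.items():
    print(f"{name}: {contribution:+.3f}")
```

In this toy setting the contributions sum to the difference between the score for the game and the score for the baseline, which is the additivity property that gives SHAP its name. A positive value pushes the prediction toward the popular class and a negative value pushes it away.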

Published

26-08-2024

Issue

Section

Computer Science

How to Cite

Implementasi Algoritma Catboost dan Shapley Additive Explanations (Shap) dalam Memprediksi Popularitas Game Indie pada Platform Steam. (2024). Jurnal Teknologi Informasi Dan Ilmu Komputer, 11(4), 777-786. https://doi.org/10.25126/jtiik.1148503