The Effect of Preprocessing Stages on the IndoBERT and IndoBERTweet Models for Detecting Emotions in Instagram News Account Comments

Ulfia Khairani; Viska Mutiawani; Hendri Ahmadian

doi:10.25126/jtiik.1148315

Authors

Ulfia Khairani Universitas Syiah Kuala, Banda Aceh
Viska Mutiawani Universitas Syiah Kuala, Banda Aceh
Hendri Ahmadian Universitas Islam Negeri Ar-Raniry, Banda Aceh

DOI:

https://doi.org/10.25126/jtiik.1148315

Keywords:

emotion detection, preprocessing stages, IndoBERT, IndoBERTweet

Abstract

Social media platforms such as Instagram have created a space where news is easily discovered and draws individuals' attention. On Instagram, users can comment on the news they have read. Understanding the emotions expressed in the comments that users leave on news posts can help reveal how the news is absorbed, interpreted, and responded to by the public. This study classifies four emotions (anger, happiness, fear, and sadness) using the pre-trained models IndoBERT and IndoBERTweet. It aims to compare the IndoBERT and IndoBERTweet models in detecting emotions in comments on Instagram news accounts and to explore the impact of preprocessing stages, specifically stopword removal and stemming, on both models. The results show that the models that did not go through stopword removal and stemming performed better than the models that did, with an accuracy of 92.54% for the IndoBERTweet model and 88.81% for the IndoBERT model.
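The two preprocessing stages compared in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the stopword list, the suffix list, the `naive_stem` helper, the lowercasing step, and the sample comment are all hypothetical stand-ins chosen so the example runs with the Python standard library alone. A real experiment would more likely use a dedicated Indonesian stemmer and a full stopword list.

```python
import re

# Hypothetical, deliberately tiny stopword list (assumption, not from the study).
STOPWORDS = {"yang", "dan", "di", "ini", "itu", "ke", "dari"}

# Hypothetical suffixes for a crude stemmer; real Indonesian stemming
# also handles prefixes and many more rules.
SUFFIXES = ("kan", "nya", "an", "i")


def naive_stem(token: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(comment: str, remove_stopwords: bool = False, stem: bool = False) -> str:
    """Lowercase and tokenize a comment, then apply the optional stages."""
    tokens = re.findall(r"[a-z]+", comment.lower())
    if remove_stopwords:
        tokens = [t for t in tokens if t not in STOPWORDS]
    if stem:
        tokens = [naive_stem(t) for t in tokens]
    return " ".join(tokens)


# Made-up sample comment, used only to show the two variants side by side.
comment = "Beritanya sangat menakutkan dan menyedihkan"
minimal = preprocess(comment)
full = preprocess(comment, remove_stopwords=True, stem=True)
print(minimal)
print(full)
```

The `minimal` variant keeps function words and affixes, which a subword-based model such as IndoBERT or IndoBERTweet can still use as context. The `full` variant discards them, which is one plausible reason the reported accuracy drops when these stages are applied.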
