Improving Accuracy in MBTI Personality Prediction of Twitter Users Using Data Augmentation

Authors

  • Rizki Nurhaliza Harahap, Fakultas Informatika - Universitas Telkom
  • Kemas Muslim, Fakultas Informatika - Universitas Telkom

DOI:

https://doi.org/10.25126/jtiik.2020743622

Abstract

Knowing an individual's personality can inform a range of decisions, one of which is career recruitment. Personality is usually assessed through interviews, observation, or questionnaire surveys. These conventional methods are impractical in terms of time and cost, because processing the data takes a long time and is fairly expensive. They can also introduce bias, because a third party is involved in processing the data. This research proposes a solution: a model that predicts a person's personality by analyzing data and information from the social media platform Twitter. The personality classification theory used is the Myers-Briggs Type Indicator (MBTI). The research also applies data augmentation techniques to improve performance on text mining tasks that have only a small dataset. The best results are obtained with the Random Forest method, using Term Frequency-Inverse Document Frequency (TF-IDF) weighting together with the features available on Twitter. Augmentation increases accuracy by up to 30% from the initial accuracy, which indicates that data augmentation can improve the performance of MBTI personality prediction models.
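The abstract does not say which augmentation operations were applied, so the following is only a minimal sketch of how text augmentation can enlarge a small labeled dataset. It assumes two simple token-level operations, random swap and random deletion; the function names, parameters, and the sample sentence are all hypothetical and are not taken from the paper.

```python
import random


def random_swap(tokens, n_swaps, rng):
    """Return a copy of tokens with n_swaps random pairs of positions exchanged."""
    out = list(tokens)
    if len(out) < 2:
        return out
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def random_deletion(tokens, p, rng):
    """Drop each token independently with probability p, keeping at least one."""
    if len(tokens) <= 1:
        return list(tokens)
    kept = [t for t in tokens if rng.random() >= p]
    return kept if kept else [rng.choice(tokens)]


def augment(text, label, n_copies=2, p_delete=0.1, seed=0):
    """Return the original (text, label) pair followed by n_copies perturbed variants.

    Assumption: the label (an MBTI type) is unchanged by small token-level edits.
    """
    rng = random.Random(seed)
    tokens = text.split()
    samples = [(text, label)]
    for _ in range(n_copies):
        variant = random_deletion(random_swap(tokens, 1, rng), p_delete, rng)
        samples.append((" ".join(variant), label))
    return samples


if __name__ == "__main__":
    # Hypothetical tweet and label, for illustration only.
    for sample in augment("i love quiet evenings with a good book", "INFJ"):
        print(sample)
```

In a setup like this, augmentation would normally be applied to the training split only, so that perturbed copies of a test example never leak into training and inflate the measured accuracy.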

Published

07-08-2020

Issue

Section

Computer Science

How to Cite

Harahap, R. N., & Muslim, K. (2020). Peningkatan Akurasi pada Prediksi Kepribadian MBTI Pengguna Twitter Menggunakan Augmentasi Data. Jurnal Teknologi Informasi dan Ilmu Komputer, 7(4), 815-822. https://doi.org/10.25126/jtiik.2020743622