Natural Language Processing untuk Otomatisasi Pengenalan Pronomina dalam Kalimat Bahasa Indonesia

Mohammad Farid Naufal; Selvia Ferdiana Kusuma

doi:10.25126/jtiik.2022946394

Authors

Mohammad Farid Naufal, Universitas Surabaya, Surabaya
Selvia Ferdiana Kusuma, Politeknik Elektronika Negeri Surabaya, Surabaya

DOI:

https://doi.org/10.25126/jtiik.2022946394

Abstract

A pronoun is a word that can stand in for a noun or a person in a sentence. Pronouns are easy to interpret when a passage is read in full, but when sentences are read in isolation, those containing pronouns become hard to understand. Natural language processing requires that the context of each sentence be clear, so pronouns can make it difficult for a computer to understand a sentence automatically. Text containing pronouns therefore needs a preprocessing step that replaces each pronoun with the original subject or object it refers to. The method proposed for this problem is a syntactic approach, which focuses on the structure of the words used and of their components. It has four stages: data collection, rule generation, automatic pronoun recognition, and evaluation. The method was tested on sentences from elementary school Natural Sciences and Social Sciences material. The evaluation shows that the method can replace a subject expressed as a pronoun with the original subject or object it refers to. The average accuracy is 81%, computed as the number of pronouns whose referent was successfully identified divided by the total number of test items. Researchers in natural language processing can use the results of this study to preprocess text before further processing.
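
To make the preprocessing idea and the accuracy measure concrete, here is a minimal Python sketch. It is not the authors' rule set, which the abstract does not spell out. It assumes a hypothetical heuristic (a subject pronoun is replaced by the most recent explicit subject), a small illustrative pronoun list, and input that has already been split into (subject, predicate) pairs by some upstream parser. The names `PRONOUNS`, `resolve_pronouns`, and `accuracy` are made up for this illustration.

```python
# Illustrative only: a tiny, hypothetical list of Indonesian third-person pronouns.
PRONOUNS = {"dia", "ia", "mereka", "beliau"}


def resolve_pronouns(clauses):
    """Replace subject pronouns with the most recent explicit subject.

    clauses: list of (subject, predicate) string pairs, assumed to be
    produced by an upstream syntactic parser (not shown here).
    Returns a list of rewritten sentences.
    """
    resolved = []
    last_subject = None  # most recent non-pronoun subject seen so far
    for subject, predicate in clauses:
        if subject.lower() in PRONOUNS:
            # Assumed heuristic: the pronoun refers to the previous subject.
            if last_subject is not None:
                subject = last_subject
        else:
            last_subject = subject
        resolved.append(f"{subject} {predicate}")
    return resolved


def accuracy(predicted, gold):
    """Share of test items whose predicted referent matches the gold referent."""
    if len(predicted) != len(gold) or not gold:
        raise ValueError("predicted and gold must be non-empty and equal in length")
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


clauses = [
    ("Tumbuhan hijau", "membuat makanan sendiri."),
    ("Mereka", "membutuhkan cahaya matahari."),
]
rewritten = resolve_pronouns(clauses)
```

In this toy input, the second sentence is rewritten with "Tumbuhan hijau" in place of "Mereka". A real system would need part-of-speech tags and rules that also handle object pronouns, which this sketch deliberately omits.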


