Named Entity Recognition Using Bi-LSTM in an Indonesian-Language Chatbot

Authors

  • Anshar Zulhilmi, Universitas Brawijaya, Malang
  • Rizal Perdana, Universitas Brawijaya, Malang
  • Indriati, Universitas Brawijaya, Malang

DOI:

https://doi.org/10.25126/jtiik.1077968

Keywords:

chatbot, named entity recognition, NER, Bi-LSTM, F1 score

Abstract

Public institutions need to integrate e-government into their management structures, and an integrated service system must be able to resolve the problems that service users face. If such a system relies on human staff alone, service can be held up. A chatbot is one solution for replacing the human role in an integrated service system, and named entity recognition (NER) is one of its components. In this study, NER is carried out in several stages: noise removal, data labeling, construction of word and label dictionaries, sequence encoding and data splitting, model initialization, and model training. The model used is a bidirectional long short-term memory network (Bi-LSTM). The best F1 score obtained in testing is 87.44%, with 2 layers, a hidden size of 100, and a learning rate of 0.01. Increasing the number of layers or the hidden size has little effect on the F1 score produced by the model, whereas the learning rate affects how quickly the model reaches an optimal solution.
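The dictionary, encoding, and splitting stages named in the abstract, together with the F1 metric, can be sketched in plain Python. This is only an illustration under stated assumptions, not the authors' implementation: the sample sentences, the BIO tags (`B-DOC`, `B-LOC`), the `<PAD>` and `<UNK>` tokens, the padding length, the 25% test split, and the token-level definition of F1 are all hypothetical choices that the abstract does not specify. Noise removal, labeling, and the Bi-LSTM itself are not shown.

```python
import random

PAD, UNK = "<PAD>", "<UNK>"  # assumed special tokens, not taken from the paper


def build_vocab(sequences, specials=()):
    """Map each distinct item to an integer id, reserving the first ids for specials."""
    vocab = {tok: i for i, tok in enumerate(specials)}
    for seq in sequences:
        for item in seq:
            vocab.setdefault(item, len(vocab))
    return vocab


def encode(seq, vocab, max_len, unk=None):
    """Convert items to ids, then truncate or right-pad with id 0 (PAD) to max_len."""
    fallback = vocab.get(unk, 0)  # id used for items missing from the vocabulary
    ids = [vocab.get(item, fallback) for item in seq][:max_len]
    return ids + [0] * (max_len - len(ids))


def split(examples, test_ratio=0.25, seed=0):
    """Shuffle a copy of the examples and return (train, test)."""
    shuffled = examples[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]


def f1_score(gold, pred, outside="O"):
    """Token-level F1 over entity labels, i.e. every label except `outside`."""
    tp = sum(g == p and g != outside for g, p in zip(gold, pred))
    if tp == 0:
        return 0.0
    precision = tp / sum(p != outside for p in pred)
    recall = tp / sum(g != outside for g in gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labeled chatbot utterances (tokens, BIO labels).
data = [
    (["jadwal", "layanan", "ktp", "di", "malang"], ["O", "O", "B-DOC", "O", "B-LOC"]),
    (["syarat", "paspor", "baru"], ["O", "B-DOC", "O"]),
    (["kantor", "imigrasi", "malang"], ["O", "O", "B-LOC"]),
    (["biaya", "ktp"], ["O", "B-DOC"]),
]

word2id = build_vocab((words for words, _ in data), specials=(PAD, UNK))
label2id = build_vocab((labels for _, labels in data), specials=(PAD,))

# Encode every example to a fixed length of 5 ids, then split.
encoded = [
    (encode(words, word2id, 5, unk=UNK), encode(labels, label2id, 5))
    for words, labels in data
]
train_set, test_set = split(encoded)

# One of two gold entities is predicted correctly, with one false positive.
example_f1 = f1_score(["O", "B-DOC", "O", "B-LOC"], ["O", "B-DOC", "B-LOC", "O"])
```

Reserving id 0 for padding in both dictionaries lets a model or loss function mask padded positions with a single index. A word outside the dictionary, such as "surabaya" here, falls back to the `<UNK>` id. Entity-level F1, which scores whole spans rather than tokens, is a common alternative to the token-level version shown.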

Published

30-12-2023

How to Cite

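Zulhilmi, A., Perdana, R., & Indriati. (2023). Pengenalan Entitas Bernama Menggunakan Bi-LSTM pada Chatbot Bahasa Indonesia [Named Entity Recognition Using Bi-LSTM in an Indonesian-Language Chatbot]. Jurnal Teknologi Informasi Dan Ilmu Komputer, 10(7), 1425–1430. https://doi.org/10.25126/jtiik.1077968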