Comparison of the Porter, Sastrawi, Idris, and Arifin & Setiono Stemming Algorithms on Indonesian Text Documents

Authors

  • Jasman Pardede, Institut Teknologi Nasional Bandung, Bandung
  • Dicky Darmawan, Institut Teknologi Nasional Bandung, Bandung

DOI:

https://doi.org/10.25126/jtiik.20251218860

Keywords:

Information Retrieval, Porter, Sastrawi, Idris, Arifin & Setiono

Abstract

Information Retrieval (IR) separates relevant documents from a larger collection. Stemming, the process of reducing affixed words to their root forms, is an important step in IR. Previous studies have used a variety of stemming algorithms, and some have compared two of them; this study compares four. Its aim is to show the effect of stemming on IR over Indonesian-language documents. The algorithms compared are Porter, Sastrawi, Idris, and Arifin & Setiono. Each algorithm's performance was measured by precision and processing time on 120 Indonesian-language documents. In the experiments, Sastrawi achieved the best precision, while Arifin & Setiono was the fastest. Precision for Sastrawi, Arifin & Setiono, Porter, and Idris was 70.3%, 55.8%, 49.9%, and 32.2%, respectively. Average processing time for Arifin & Setiono, Sastrawi, Idris, and Porter was 0.123 s, 160.1 s, 168.8 s, and 188.4 s, respectively.
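The evaluation described in the abstract (stem the text, retrieve documents, then measure precision and elapsed time) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: `toy_stem` is a deliberately tiny affix stripper and is not any of the four algorithms compared in the paper, the affix lists, the three sample documents, and the relevance labels are hypothetical, and precision is taken in its usual IR sense (relevant retrieved documents divided by all retrieved documents), since the abstract does not spell out the exact protocol.

```python
import time
from typing import Callable, Dict, Set, Tuple

# Hypothetical, deliberately tiny affix lists. Real Indonesian stemmers use
# far richer rules and, in some cases, a root-word dictionary.
PREFIXES = ("meng", "mem", "men", "me", "ber", "di", "ke", "se")
SUFFIXES = ("kan", "an", "i")

# Hypothetical mini-collection (the study itself used 120 documents).
DOCS: Dict[str, str] = {
    "d1": "adik membaca buku",
    "d2": "bacaan anak",
    "d3": "ibu memasak nasi",
}


def toy_stem(word: str, min_len: int = 3) -> str:
    """Strip at most one prefix and one suffix, keeping at least min_len letters."""
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= min_len:
            word = word[len(prefix):]
            break
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_len:
            word = word[: -len(suffix)]
            break
    return word


def retrieve(query: str, docs: Dict[str, str], stem: Callable[[str], str]) -> Set[str]:
    """Return ids of documents sharing at least one stemmed term with the query."""
    query_terms = {stem(token) for token in query.lower().split()}
    return {
        doc_id
        for doc_id, text in docs.items()
        if query_terms & {stem(token) for token in text.lower().split()}
    }


def precision(retrieved: Set[str], relevant: Set[str]) -> float:
    """Fraction of retrieved documents that are relevant (0.0 if none retrieved)."""
    if not retrieved:
        return 0.0
    return len(retrieved & relevant) / len(retrieved)


def evaluate(
    query: str,
    docs: Dict[str, str],
    relevant: Set[str],
    stem: Callable[[str], str],
) -> Tuple[float, float]:
    """Return (precision, elapsed seconds) for one query with one stemmer."""
    start = time.perf_counter()
    retrieved = retrieve(query, docs, stem)
    elapsed = time.perf_counter() - start
    return precision(retrieved, relevant), elapsed


if __name__ == "__main__":
    relevant_ids = {"d1", "d2"}  # hypothetical relevance judgments
    for name, stem in (("no stemming", lambda w: w), ("toy stemmer", toy_stem)):
        p, seconds = evaluate("baca", DOCS, relevant_ids, stem)
        print(f"{name}: precision={p:.2f}, time={seconds:.6f} s")
```

In this toy setup the query `baca` matches no document without stemming, while the toy stemmer maps `membaca` and `bacaan` to `baca` and retrieves both relevant documents. The same stemmer also over-stems some words (for example, it reduces `berjalan` to `jal`), which is the kind of error that lowers precision for rule-only approaches. Passing the stemmer as a function makes it straightforward to swap in a real implementation and compare precision and time under identical conditions.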

Published

27-02-2025

Section

Computer Science

How to Cite

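Pardede, J., & Darmawan, D. (2025). Perbandingan Algoritma Stemming Porter, Sastrawi, Idris, Dan Arifin & Setiono Pada Dokumen Teks Bahasa Indonesia [Comparison of the Porter, Sastrawi, Idris, and Arifin & Setiono stemming algorithms on Indonesian text documents]. Jurnal Teknologi Informasi Dan Ilmu Komputer, 12(1), 69-76. https://doi.org/10.25126/jtiik.20251218860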