Journal of Software Engineering and Applications

Volume 4, Issue 9 (September 2011)

ISSN Print: 1945-3116   ISSN Online: 1945-3124

The Enhancement of Arabic Stemming by Using Light Stemming and Dictionary-Based Stemming

PP. 522-526
DOI: 10.4236/jsea.2011.49060

ABSTRACT

Word stemming is one of the most important factors affecting the performance of many natural language processing applications, such as part-of-speech tagging, syntactic parsing, machine translation, and information retrieval. Computational stemming is a pressing problem for Arabic natural language processing because Arabic is a highly inflected language. Existing stemmers have ignored the handling of multiword expressions and the identification of Arabic names. We use an enhanced stemmer, based on light stemming and dictionary-based stemming, to extract the stems of Arabic words. The enhanced stemmer also handles multiword expressions and performs named entity recognition. We evaluated it on an Arabic corpus of ten documents and report the accuracy of the enhanced stemmer, the light stemmer, and the dictionary-based stemmer for each document. The enhanced stemmer achieves an average accuracy of 96.29% on the corpus. The experimental results show that the enhanced stemmer outperforms both the light stemmer and the dictionary-based stemmer, achieving the highest accuracy values.
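
To make the idea in the abstract concrete, the following is a minimal sketch of how a light stemmer and a dictionary lookup might be combined, with a name list protecting named entities from stemming. It is not the authors' implementation: the affix lists, the toy dictionary, the name list, the minimum stem length, and the order of the checks are all assumptions chosen for illustration, and multiword expression handling is omitted. The function names `light_stem`, `enhanced_stem`, and `accuracy` are hypothetical.

```python
# Illustrative affix lists (assumed; the abstract does not give the paper's lists).
# Longer prefixes come first so that "وال" is tried before "ال" or "و".
PREFIXES = ["وال", "بال", "كال", "فال", "ال", "لل", "و"]
SUFFIXES = ["ها", "ان", "ات", "ون", "ين", "ية", "ه", "ة", "ي"]

# Toy lookup tables standing in for a real stem dictionary and a name list.
STEM_DICTIONARY = {"رجال": "رجل", "كتب": "كتاب"}  # broken plurals -> singular stem
NAMED_ENTITIES = {"محمد", "القاهرة"}


def light_stem(word, min_len=3):
    """Strip at most one prefix and one suffix, keeping at least min_len letters."""
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= min_len:
            word = word[len(prefix):]
            break
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_len:
            word = word[:-len(suffix)]
            break
    return word


def enhanced_stem(word):
    """Assumed order: protect names, then consult the dictionary, then light-stem."""
    if word in NAMED_ENTITIES:
        return word  # names are returned unchanged
    if word in STEM_DICTIONARY:
        return STEM_DICTIONARY[word]
    return light_stem(word)


def accuracy(predicted, gold):
    """Percentage of predicted stems that match the reference stems."""
    correct = sum(p == g for p, g in zip(predicted, gold))
    return 100.0 * correct / len(gold)
```

In this sketch, a light stemmer alone would strip the article and final letter from the place name "القاهرة", and would leave the broken plural "رجال" untouched; the name list and dictionary are what let `enhanced_stem` handle those two cases differently.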

Share and Cite:

Alhanini, Y. and Aziz, M. (2011) The Enhancement of Arabic Stemming by Using Light Stemming and Dictionary-Based Stemming. Journal of Software Engineering and Applications, 4, 522-526. doi: 10.4236/jsea.2011.49060.

Copyright © 2024 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.