TY - JOUR
T1 - An intelligent use of stemmer and morphology analysis for Arabic information retrieval
AU - Alnaied, Ali
AU - Elbendak, Mosa
AU - Bulbul, Abdullah
N1 - Funding Information: We would like to thank the anonymous reviewers for their valuable comments, which have helped us to improve this paper. This work is partially supported by the National Natural Science Foundation of China under Grant No. 60775028, the Major Projects of Technology Bureau of Dalian No. 2007A14GXD42, and IT Industry Development of Jilin Province.
PY - 2020/12
Y1 - 2020/12
N2 - Arabic information retrieval has gained significant attention owing to the increasing use of Arabic text on the web and social media networks. This paper presents a new approach to Arabic stemming, called Arabic Morphology Information Retrieval (AMIR), which generates/extracts stems by applying a set of rules on the relationships among Arabic letters to find the root/stem of each word; these stems serve as indexing terms for text search in Arabic retrieval systems. To demonstrate the usefulness of the proposed algorithm, we highlight the benefits of the proposed rules for different Arabic information retrieval systems. Finally, we evaluate the AMIR system by comparing its performance with LUCENE, FARASA, and a no-stemmer baseline in terms of mean average precision. The results show that AMIR achieves a mean average precision of 0.34%, while LUCENE, FARASA, and the no-stemmer baseline give 0.27%, 0.28% and 0.21, respectively. This demonstrates that AMIR improves Arabic stemming and retrieval performance and is robust across different types of stems.
AB - Arabic information retrieval has gained significant attention owing to the increasing use of Arabic text on the web and social media networks. This paper presents a new approach to Arabic stemming, called Arabic Morphology Information Retrieval (AMIR), which generates/extracts stems by applying a set of rules on the relationships among Arabic letters to find the root/stem of each word; these stems serve as indexing terms for text search in Arabic retrieval systems. To demonstrate the usefulness of the proposed algorithm, we highlight the benefits of the proposed rules for different Arabic information retrieval systems. Finally, we evaluate the AMIR system by comparing its performance with LUCENE, FARASA, and a no-stemmer baseline in terms of mean average precision. The results show that AMIR achieves a mean average precision of 0.34%, while LUCENE, FARASA, and the no-stemmer baseline give 0.27%, 0.28% and 0.21, respectively. This demonstrates that AMIR improves Arabic stemming and retrieval performance and is robust across different types of stems.
KW - Arabic morphological analysis
KW - Arabic stemmer
KW - Information retrieval systems
KW - Natural language processing
UR - http://www.scopus.com/inward/record.url?scp=85081205196&partnerID=8YFLogxK
U2 - 10.1016/j.eij.2020.02.004
DO - 10.1016/j.eij.2020.02.004
M3 - Article
AN - SCOPUS:85081205196
SN - 1110-8665
VL - 21
SP - 209
EP - 217
JO - Egyptian Informatics Journal
JF - Egyptian Informatics Journal
IS - 4
ER -