A Rule-Based Extensible Stemmer for Information Retrieval with Application to Arabic

Haidar Harmanani¹, Walid Keirouz², and Saeed Raheel¹

¹ Computer Science and Mathematics Division, Lebanese American University, Lebanon

² Department of Computer Science, American University of Beirut, Lebanon

Abstract: This paper presents a new, extensible method for information retrieval and content analysis in natural languages (NL). The method is stem-based: stems are extracted using a set of language-dependent rules that are interpreted by a rule engine. The rule engine allows the system to be adapted to any natural language by modifying the NL semantic rules and grammar. The system has been fully tested on Arabic and partially tested on English, Hebrew, and Persian. We have validated our approach using a database-backed prototype.

Keywords: Natural language processing, information retrieval, stemming.

Received February 21, 2005; accepted July 13, 2005 
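
The abstract describes stems being extracted by language-dependent rules that a rule engine interprets. The sketch below is only an illustration of that general idea, not the authors' implementation: the `Rule` type, the `apply_rules` function, and both rule sets are hypothetical, and the rules are deliberately tiny. It shows how a single engine can serve several languages when the rules are kept as data rather than code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """One affix-stripping rule (hypothetical format, not from the paper)."""
    kind: str      # "prefix" or "suffix"
    affix: str     # non-empty affix to strip
    min_stem: int  # minimum number of characters that must remain


def apply_rules(word, rules):
    """Apply each rule once, in order, and return the resulting stem."""
    stem = word
    for rule in rules:
        if rule.kind == "prefix" and stem.startswith(rule.affix):
            candidate = stem[len(rule.affix):]
        elif rule.kind == "suffix" and stem.endswith(rule.affix):
            candidate = stem[:-len(rule.affix)]
        else:
            continue  # rule does not match this word
        # Only accept the strip if enough of the word is left.
        if len(candidate) >= rule.min_stem:
            stem = candidate
    return stem


# Illustrative rule sets (assumptions, far smaller than a real grammar).
ENGLISH_RULES = [
    Rule("suffix", "ing", 3),
    Rule("suffix", "ed", 3),
    Rule("suffix", "s", 3),
]

ARABIC_RULES = [
    Rule("prefix", "ال", 2),  # definite article
    Rule("suffix", "ات", 2),  # feminine plural ending
]

if __name__ == "__main__":
    print(apply_rules("walking", ENGLISH_RULES))
    print(apply_rules("المعلمات", ARABIC_RULES))
```

Switching languages here means passing a different rule list to the same function, which mirrors the extensibility claim in the abstract. A real system would need far richer rules (for example, pattern and infix handling for Arabic) than plain prefix and suffix stripping.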