A Rule-Based Extensible Stemmer for Information Retrieval with Application to Arabic

Haidar Harmanani¹, Walid Keirouz², and Saeed Raheel¹

¹Computer Science and Mathematics Division, Lebanese American University, Lebanon

²Department of Computer Science, American University of Beirut, Lebanon

Abstract: This paper presents a new, extensible method for information retrieval and content analysis in natural languages (NL). The proposed method is stem-based: stems are extracted using a set of language-dependent rules that are interpreted by a rule engine. The rule engine allows the system to be adapted to any natural language by modifying the NL semantic rules and grammar. The system has been fully tested on Arabic and partially tested on English, Hebrew, and Persian. We have validated our approach using a database-based prototype.
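
To make the idea of rules interpreted by an engine concrete, the following is a minimal sketch and not the authors' implementation. It assumes rules are plain data records and that the engine strips at most one prefix and one suffix per word. The names `Rule`, `apply_rules`, `ENGLISH_RULES`, and `ARABIC_RULES`, as well as the tiny rule sets themselves, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """One affix-stripping rule (hypothetical format, not from the paper)."""
    kind: str          # "prefix" or "suffix"
    affix: str         # the characters to strip
    min_stem_len: int  # shortest stem allowed after stripping


def apply_rules(word: str, rules: list[Rule]) -> str:
    """Strip at most one prefix, then at most one suffix.

    For each kind, the first rule that matches and leaves a long
    enough stem is applied; later rules of that kind are skipped.
    """
    stem = word
    for kind in ("prefix", "suffix"):
        for rule in rules:
            if rule.kind != kind:
                continue
            if kind == "prefix" and stem.startswith(rule.affix):
                candidate = stem[len(rule.affix):]
            elif kind == "suffix" and stem.endswith(rule.affix):
                candidate = stem[:-len(rule.affix)]
            else:
                continue
            if len(candidate) >= rule.min_stem_len:
                stem = candidate
                break
    return stem


# Illustrative rule sets only; a real system would need far more rules.
ENGLISH_RULES = [
    Rule("suffix", "ing", 3),
    Rule("suffix", "ed", 3),
    Rule("suffix", "s", 3),
]

ARABIC_RULES = [
    Rule("prefix", "ال", 3),  # definite article
    Rule("suffix", "ات", 3),  # feminine plural ending
]

if __name__ == "__main__":
    print(apply_rules("walking", ENGLISH_RULES))
    print(apply_rules("الكتاب", ARABIC_RULES))
```

Because the engine only reads the rule list, switching languages in this sketch means passing a different list and leaving `apply_rules` unchanged, which mirrors the extensibility claim in the abstract.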

Keywords: Natural language processing, information retrieval, stemming.

Received February 21, 2005; accepted July 13, 2005 

 