A New Approach for Arabic Named Entity Recognition
Wahiba Karaa and Thabet Slimani
College of Computers and Information Technology, Taif University, KSA
Abstract:
Named Entity Recognition (NER) plays a significant
role in Natural Language Processing (NLP) research, since it enables
the detection of proper nouns in unstructured texts. NER facilitates
searching, retrieving, and extracting information, since the significant
information in texts is usually located around proper names. This paper proposes
an efficient approach that can identify Named Entities (NE) in Arabic texts
without the need for morphological or syntactic analysis or gazetteers. The
goal of our approach is to provide a general framework for Arabic NE
recognition. Within this framework, the system learns to recognize NE automatically
and induces NE systematically, starting from sample NE instances as seeds. The
method takes advantage of the web: the approach learns from a web corpus. The
seeds are used to identify the contexts on the web that denote NE, and the
contexts are then used to identify new NE. In a thorough experimental evaluation
of our approach, the performance in recognizing NE, measured by recall, precision,
and F-measure, is promising. We obtained an overall F-measure of 83%.
Keywords: Arabic NE, machine learning, web document, information retrieval, information extraction.
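
The seed-to-context-to-entity loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes single-token entities, whitespace tokenization, a context defined as the pair of tokens immediately to the left and right of an entity, and a small in-memory list of made-up transliterated sentences standing in for the web corpus. The function names (`extract_contexts`, `apply_contexts`, `bootstrap`) and the `rounds` parameter are hypothetical.

```python
from collections import Counter


def _pad(sentence):
    """Tokenize on whitespace and add boundary markers (an assumption of this sketch)."""
    return ["<s>"] + sentence.split() + ["</s>"]


def extract_contexts(sentences, entities, min_count=1):
    """Collect (left_token, right_token) pairs that surround known entities."""
    counts = Counter()
    for sentence in sentences:
        tokens = _pad(sentence)
        for i in range(1, len(tokens) - 1):
            if tokens[i] in entities:
                counts[(tokens[i - 1], tokens[i + 1])] += 1
    # Keep only contexts seen at least min_count times.
    return {ctx for ctx, n in counts.items() if n >= min_count}


def apply_contexts(sentences, contexts):
    """Return every token that occurs inside one of the learned contexts."""
    found = set()
    for sentence in sentences:
        tokens = _pad(sentence)
        for i in range(1, len(tokens) - 1):
            if (tokens[i - 1], tokens[i + 1]) in contexts:
                found.add(tokens[i])
    return found


def bootstrap(sentences, seeds, rounds=3):
    """Alternate between learning contexts from entities and entities from contexts."""
    entities = set(seeds)
    for _ in range(rounds):
        contexts = extract_contexts(sentences, entities)
        new_entities = apply_contexts(sentences, contexts) - entities
        if not new_entities:
            break  # Nothing new was induced, so stop early.
        entities |= new_entities
    return entities


# Toy stand-in for a web corpus (transliterated, illustrative only).
corpus = [
    "zara madinat Tunis ams",
    "zara madinat Taif ams",
    "wasala ila Taif sabahan",
    "wasala ila Sfax sabahan",
    "qaraa kitab jadid ams",
]

entities = bootstrap(corpus, {"Tunis"})
print(sorted(entities))
```

In this toy run, the seed `Tunis` yields the context `(madinat, ams)`, which induces `Taif`; `Taif` in turn yields `(ila, sabahan)`, which induces `Sfax`. A realistic system would also need to score or filter contexts to limit the drift that unrestricted bootstrapping tends to cause.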