An Artificial Neural Network Approach for Sentence Boundary Disambiguation in Urdu Language Text
Shazia Raj, Zobia Rehman, Sonia Rauf, Rehana Siddique, and Waqas Anwar
Department of Computer Science, COMSATS Institute of Information Technology, Pakistan
Abstract: Sentence boundary identification is an important step for text processing tasks such as machine translation, POS tagging, and text summarization. In this paper, we present an approach comprising a feed forward neural network along with part-of-speech information of the words in a corpus. The proposed adaptive system has been tested after training it with varying sizes of data and threshold values. The best results our system produced are 93.05% precision, 99.53% recall, and 96.18% F-measure.
Keywords: Sentence boundary identification, feed forward neural network, back propagation learning algorithm.
Received April 22, 2013; accepted September 19, 2013