Word Prediction via a Clustered Optimal Binary Search Tree
Eyas El-Qawasmeh
Computer Science Department, Jordan University of Science and Technology, Jordan
Abstract: Word prediction methodologies depend heavily on a statistical approach that uses the unigrams, bigrams, and trigrams of words. However, constructing the N-gram model requires a very large amount of memory, which is beyond the capability of many existing computers. In addition, approximating the model reduces the accuracy of word prediction. In this paper, we propose using a cluster of computers to build an Optimal Binary Search Tree (OBST) that serves as the basis of the statistical approach to word prediction. The OBST contains extra links so that the bigrams and trigrams of the language are represented. We also suggest incorporating other enhancements to achieve optimal word prediction performance. Our experimental results show that the proposed approach improves keystroke savings.
Keywords: Bigram, cluster computing, N-gram, unigram, trigram, word frequency, word prediction.
Received April 21, 2003; accepted July 29, 2003
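To make the idea in the abstract concrete, the following is a minimal single-machine sketch in Python, not the paper's implementation. It assumes that the OBST is keyed on words and weighted by unigram frequency, and that the "extra links" can be modeled as a per-node table of successor counts (bigrams). The names Node, build_obst, build_model, find, and predict are hypothetical, and the cluster distribution and trigram links are omitted.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    word: str
    freq: int                                   # unigram count
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    # Assumed form of the "extra links": successor word -> bigram count.
    successors: Dict[str, int] = field(default_factory=dict)


def build_obst(words: List[str], freqs: List[int]) -> Optional[Node]:
    """Classic O(n^3) dynamic program for an optimal BST over sorted keys
    (successful searches only)."""
    n = len(words)
    prefix = [0] * (n + 1)
    for i, f in enumerate(freqs):
        prefix[i + 1] = prefix[i] + f
    # cost[i][j]: minimal weighted search cost for keys i..j-1.
    cost = [[0] * (n + 1) for _ in range(n + 1)]
    root = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(1, n + 1):
        for i in range(n - length + 1):
            j = i + length
            best = None
            for r in range(i, j):
                c = cost[i][r] + cost[r + 1][j]
                if best is None or c < best:
                    best, root[i][j] = c, r
            cost[i][j] = best + (prefix[j] - prefix[i])

    def make(i: int, j: int) -> Optional[Node]:
        if i >= j:
            return None
        r = root[i][j]
        return Node(words[r], freqs[r], make(i, r), make(r + 1, j))

    return make(0, n)


def find(node: Optional[Node], word: str) -> Optional[Node]:
    """Ordinary BST lookup."""
    while node is not None and node.word != word:
        node = node.left if word < node.word else node.right
    return node


def build_model(tokens: List[str]) -> Optional[Node]:
    """Build the tree from unigram counts, then attach bigram links."""
    unigram = Counter(tokens)
    words = sorted(unigram)
    tree = build_obst(words, [unigram[w] for w in words])
    for prev, nxt in zip(tokens, tokens[1:]):
        node = find(tree, prev)
        node.successors[nxt] = node.successors.get(nxt, 0) + 1
    return tree


def predict(tree: Optional[Node], prev_word: str, prefix: str, k: int = 3) -> List[str]:
    """Rank words starting with `prefix` by bigram count after `prev_word`,
    then by unigram count, then alphabetically."""
    prev = find(tree, prev_word)
    bigram = prev.successors if prev else {}
    candidates: List[Node] = []

    def walk(node: Optional[Node]) -> None:
        if node is None:
            return
        walk(node.left)
        if node.word.startswith(prefix):
            candidates.append(node)
        walk(node.right)

    walk(tree)
    candidates.sort(key=lambda nd: (-bigram.get(nd.word, 0), -nd.freq, nd.word))
    return [nd.word for nd in candidates[:k]]


tokens = "the cat sat on the mat the cat ran".split()
tree = build_model(tokens)
print(predict(tree, "the", ""))   # ['cat', 'mat', 'the']
print(predict(tree, "the", "m"))  # ['mat']
```

In this sketch the successor table plays the role of the bigram links: once the previous word's node is located, its likely successors are available without a separate N-gram table. A trigram extension could nest a second table under each successor, but that is left out here.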