Texts Semantic Similarity Detection Based Graph Approach
Majid Mohebbi and Alireza Talebpour
Department of Computer Engineering, Faculty of Electrical and Computer Engineering,
Shahid Beheshti University, Iran
Abstract: Measuring the similarity of text documents is important for analyzing them, extracting useful information from them, and generating appropriate data. Several lexical matching techniques have been proposed to determine the similarity between documents. They have been successful to a limited extent, but they fail to capture the semantic similarity between two texts. Semantic similarity approaches have therefore been suggested, such as corpus-based methods and knowledge-based methods, e.g., WordNet-based methods. This paper offers a new method for Paraphrase Identification (PI) that measures the semantic similarity of texts using a graph-based idea. We intend to take the order of the words in a sentence into account. We offer a graph-based algorithm, with a specific implementation for similarity identification, that makes extensive use of word similarity information extracted from WordNet. Experiments were performed on the Microsoft Research Paraphrase Corpus, and we show that our approach achieves appropriate performance.
Keywords: WordNet, semantic similarity, similarity metric, graph theory.
Received November 17, 2013; accepted June 23, 2014