Incremental Transitivity Applied to Cluster Retrieval
Yaser Hasan1, Muhammad Hassan2, and Mick Ridley1
1Computing Department, University of Bradford, UK
2Computer Science Department, Zarqa Private University, Jordan
Abstract: Building accurate and efficient document clusters raises several problems, such as the inherent limitations of the similarity measure and of document logical view modeling. This research attempts to minimize the effect of these problems by using a new definition of transitive relevance between documents, which adds conditions to the transitive relevance judgment by incrementing the relevance threshold by a constant value at each level of transitivity. Proving the relevance relation to be transitive makes it an equivalence relation that can be used to build equivalence classes of relevant documents. The main contribution of this paper is the use of this definition to partition a set of documents into disjoint subsets that serve as equivalence classes (clusters). A further contribution is that, by using the incremental transitive relevance relation, the traditional vector space model can be made incrementally transitive.
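To make the idea of an incrementally raised threshold concrete, the following minimal Python sketch illustrates one possible reading of it. It is not the definition or algorithm given in this paper. It assumes that documents are plain term-weight vectors compared by cosine similarity, that a document reached through k intermediate documents must satisfy the threshold plus k times a constant step, and that clusters are grown greedily from the first unassigned document. The names `cosine`, `incremental_clusters`, `threshold`, and `step` are hypothetical and chosen for illustration only.

```python
from collections import deque
import math


def cosine(u, v):
    """Cosine similarity of two equal-length term-weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def incremental_clusters(vectors, threshold=0.5, step=0.1):
    """Partition document indices into disjoint clusters.

    Assumed reading (not taken from the paper): a link explored at
    transitivity level k must reach similarity threshold + k * step.
    """
    n = len(vectors)
    assigned = [False] * n
    clusters = []
    for seed in range(n):
        if assigned[seed]:
            continue
        assigned[seed] = True
        cluster = [seed]
        queue = deque([(seed, 0)])  # (document index, transitivity level)
        while queue:
            doc, level = queue.popleft()
            required = threshold + level * step  # stricter at deeper levels
            for other in range(n):
                if assigned[other]:
                    continue
                if cosine(vectors[doc], vectors[other]) >= required:
                    assigned[other] = True
                    cluster.append(other)
                    queue.append((other, level + 1))
        clusters.append(cluster)
    return clusters


# Three toy documents: d0 and d2 share no terms, d1 overlaps with both.
docs = [(1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
print(incremental_clusters(docs, threshold=0.7, step=0.0))  # [[0, 1, 2]]
print(incremental_clusters(docs, threshold=0.7, step=0.1))  # [[0, 1], [2]]
```

With a zero step, ordinary chaining joins d0 and d2 through d1 even though their direct similarity is 0. With a step of 0.1, the second link must reach 0.8, so d2 stays in its own cluster. Every document is assigned exactly once, so the resulting subsets are disjoint, although the result of this greedy sketch may depend on the order in which documents are visited.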