HierarchicalRank: Webpage Rank Improvement Using HTML TagLevel Similarity
Dilip Sharma and Deepak Ganeshiya
Department of Computer Engineering and Applications, GLA University, Mathura, India
Abstract: Prior research has introduced two types of ranking algorithms,
query dependent and query independent, which work online or offline. The
PageRank algorithm works offline, independently of the query, while the
Hyperlink-Induced Topic Search (HITS) algorithm works online and depends on the
query. One problem with these algorithms is that rank is divided according to
the number of inlinks, outlinks, and other parameters used in hyperlink
analysis, which may be dependent on or independent of webpage content, and
they suffer from the problem of topic drift. Previous research focused on
solving this problem using the popularity of the outlinked webpages. This paper
proposes a novel algorithm for measuring popularity, based on the similarity
between the query and the hierarchical text extracted from the source and
target webpages using a Hyper Text Markup Language (HTML) tag importance
parameter. The results of the proposed method are compared with those of the
PageRank algorithm and of Topic Distillation with Query Dependent Link
Connections and Page Characteristics.
Keywords: Web mining, web graph, hyperlink analysis, connectivity, PageRank, HTML tags.
Received July 21, 2014; accepted October 14, 2014
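
The idea summarized in the abstract, scoring a page by how well the query matches text weighted by the importance of its enclosing HTML tag, can be illustrated with a minimal sketch. This is not the authors' algorithm: the tag weights in `TAG_WEIGHTS`, the class `TagTextCollector`, and the function `tag_weighted_similarity` are all hypothetical names and values chosen for illustration, and the score used here (matched weight divided by total weight) is an assumed stand-in for the paper's similarity measure.

```python
from html.parser import HTMLParser

# Hypothetical tag importance weights (assumption; not taken from the paper).
TAG_WEIGHTS = {"title": 3.0, "h1": 2.5, "h2": 2.0, "a": 1.5, "p": 1.0}


class TagTextCollector(HTMLParser):
    """Collects words together with the weight of the innermost weighted tag."""

    def __init__(self):
        super().__init__()
        self.stack = []            # currently open weighted tags
        self.weighted_terms = []   # list of (word, weight) pairs

    def handle_starttag(self, tag, attrs):
        if tag in TAG_WEIGHTS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in self.stack:
            # Pop up to and including the matching open tag.
            while self.stack and self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        if not self.stack:
            return  # text outside any weighted tag is ignored in this sketch
        weight = TAG_WEIGHTS[self.stack[-1]]
        for word in data.lower().split():
            self.weighted_terms.append((word.strip(".,;:!?"), weight))


def tag_weighted_similarity(query, html):
    """Return the share of tag-weighted page text that matches query terms."""
    collector = TagTextCollector()
    collector.feed(html)
    collector.close()
    query_terms = set(query.lower().split())
    total = sum(weight for _, weight in collector.weighted_terms)
    if total == 0:
        return 0.0
    matched = sum(weight for word, weight in collector.weighted_terms
                  if word in query_terms)
    return matched / total


if __name__ == "__main__":
    page = ("<html><head><title>Web mining</title></head>"
            "<body><p>cooking recipes</p></body></html>")
    print(tag_weighted_similarity("web mining", page))
```

In this sketch a query term found in a `title` counts three times as much as the same term in a `p`, so a page whose prominent tags match the query scores higher than one that mentions the query only in body text. Applying the same function to both the source and the target page of a hyperlink would give the two per-page scores that a link-level measure could then combine; how the paper combines them is not stated in the abstract.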