An Efficient Algorithm for Extracting Infrequent Itemsets from Weblog

Brijesh Bakariya¹ and Ghanshyam Thakur²

¹Department of Computer Science and Engineering, I.K. Gujral Punjab Technical University, India

²Department of Computer Applications, Maulana Azad National Institute of Technology, India

Abstract: Weblog data contains unstructured information, which makes extracting frequent patterns from weblog databases a challenging task. A power set lattice strategy is adopted to handle this kind of problem. In this lattice, the top level contains the full set and the bottom level contains the empty set. Most algorithms follow a bottom-up strategy, i.e., they combine smaller sets into larger ones. Efficient lattice traversal techniques are presented which quickly identify all the long frequent itemsets and, if required, their subsets. This strategy is suitable for discovering frequent itemsets, but it may not be worthwhile for infrequent itemsets. In this paper, we propose the Infrequent Itemset Mining for Weblog (IIMW) algorithm, a top-down, breadth-first, level-wise algorithm for discovering infrequent itemsets. We compare IIMW with Apriori-Rare and Apriori-Inverse and report results for different parameters such as candidate itemsets, frequent itemsets, time, transaction database, and support threshold.

Keywords: Infrequent itemsets, lattice, frequent itemsets, weblog, support threshold.

Received September 6, 2014; accepted March 24, 2016
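
The abstract does not give the steps of IIMW, so the following is only a minimal sketch of a generic top-down, breadth-first, level-wise traversal of the power set lattice, not the authors' algorithm. It assumes transactions are sets of items (for a weblog, the pages visited in one session) and that an itemset is infrequent when its relative support falls below a threshold. The names `support`, `top_down_infrequent`, and `min_support` are hypothetical. The pruning rule relies on the standard fact that every subset of a frequent itemset is also frequent, so the search does not descend below a frequent itemset.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def top_down_infrequent(transactions, min_support):
    """Return all non-empty itemsets whose support is below `min_support`.

    Illustrative sketch only (assumed, not taken from the paper):
    start at the top of the lattice (the set of all items) and move
    down one level at a time, breadth-first.
    """
    transactions = [frozenset(t) for t in transactions]
    universe = frozenset().union(*transactions)

    infrequent = set()
    level = {universe} if universe else set()
    while level:
        next_level = set()
        for itemset in level:
            if support(itemset, transactions) < min_support:
                infrequent.add(itemset)
                # An infrequent itemset may still have infrequent subsets,
                # so queue its subsets of size k-1 for the next level.
                if len(itemset) > 1:
                    for sub in combinations(itemset, len(itemset) - 1):
                        next_level.add(frozenset(sub))
            # Otherwise the itemset is frequent: all of its subsets are
            # frequent too, so nothing below it needs to be examined.
        level = next_level
    return infrequent


# Hypothetical sessions, each a set of visited pages.
sessions = [{"a", "b"}, {"a", "c"}, {"a"}, {"b", "c"}]
rare = top_down_infrequent(sessions, min_support=0.5)
```

Every superset of an infrequent itemset is itself infrequent, so each infrequent itemset is reachable from the top through infrequent itemsets only, and the traversal above does not miss any. In the worst case it still visits a number of itemsets exponential in the number of items.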