Compact Tree Structures for Mining High Utility Itemsets
Anup Bhat Department of Computer Science and Engineering, Manipal Academy of Higher Education, India This email address is being protected from spambots. You need JavaScript enabled to view it. |
Harish Venkatarama Department of Computer Science and Engineering, Manipal Academy of Higher Education, India This email address is being protected from spambots. You need JavaScript enabled to view it. |
Geetha Maiya Department of Computer Science and Engineering, Manipal Academy of Higher Education, India This email address is being protected from spambots. You need JavaScript enabled to view it. |
Abstract: High Utility Item set Mining (HUIM) from large transaction databases has garnered significant attention as it accounts for the revenue of the items purchased in a transaction. Existing tree-based HUIM algorithms discard unpromising items and require at most two database scans for their construction. Hence, whenever utility threshold is changed, the trees have to be reconstructed from scratch. In this regard, the current study proposes to not only incorporate all the items in the tree structure but compactly represent transaction information. The proposed trees namely- Utility Prime Tree (UPT), Prime Cantor Function Tree (PCFT), and String based Utility Prime Tree (SUPT) store transaction-level information in a node unlike item-based prefix trees. Experiments conducted on both real and synthetic datasets compare the execution time and memory of these tree structures with a proposed Utility Count Tree (UCT) and existing IHUP, UP-Growth trees. Due to transaction-level encoding, these structures consume significantly less memory when compared to the tree structures in the literature.
Keywords: High utility itemset mining, tree based algorithms.
Received February 11, 2020; accepted July 13, 2021