An Efficient Approach for Mining Frequent Itemsets with Transaction Deletion Operation
Bay Vo1, 2, Thien-Phuong Le3, Tzung-Pei Hong4, Bac Le5, Jason Jung6
1Division of Data Science, Ton Duc Thang University, Vietnam
2Faculty of Information Technology, Ton Duc Thang University, Vietnam
3Faculty of Technology, Pacific Ocean University, Vietnam
4Department of Computer Science and Information Engineering, National University of Kaohsiung, Taiwan
5Department of Computer Science, University of Science, Vietnam
6Department of Computer Engineering, Chung-Ang University, Republic of Korea
Abstract: Deleting transactions from databases is common in real-world applications, so an efficient and effective algorithm for maintaining previously discovered knowledge is important in data mining. Many algorithms have been proposed in recent years, and the best of them is the pre-large-tree-based algorithm. However, this algorithm only rebuilds the final pre-large tree each time transactions are deleted; the FP-growth algorithm must then be applied to mine all frequent itemsets. The pre-large-tree-based approach thus requires twice the computation time needed for a single procedure. In this paper, we present an incremental mining algorithm that addresses these issues. An itemset-tidset tree structure is used to maintain large and pre-large itemsets. The proposed algorithm processes only the deleted transactions to update some nodes in this tree, and all frequent itemsets are derived directly by traversing the tree. Experimental results show that the proposed algorithm has good performance.
Keywords: Data mining, frequent itemsets, incremental mining, pre-large itemsets, itemset-tidset tree.
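To make the idea in the abstract concrete, the following is a minimal sketch of maintaining itemsets under transaction deletion using tidsets (the set of transaction ids that contain each itemset). It is not the proposed algorithm: it uses a flat dictionary rather than an itemset-tidset tree, it stores every itemset up to a fixed size rather than only large and pre-large ones, and the function names, the toy database, and the two thresholds `upper` and `lower` are all assumptions made for illustration. It also assumes the usual reading of "pre-large" as an itemset whose support lies between a lower and an upper threshold.

```python
from itertools import combinations


def build_tidsets(transactions, max_size=2):
    """Map each itemset (up to max_size items) to the ids of transactions containing it."""
    tidsets = {}
    for tid, items in transactions.items():
        for k in range(1, max_size + 1):
            for itemset in combinations(sorted(items), k):
                tidsets.setdefault(itemset, set()).add(tid)
    return tidsets


def delete_transactions(tidsets, deleted_tids):
    """Remove the deleted ids from every stored tidset; drop itemsets left empty.

    Only the deleted ids are consulted; the remaining database is not rescanned.
    """
    for itemset in list(tidsets):
        tidsets[itemset] -= deleted_tids
        if not tidsets[itemset]:
            del tidsets[itemset]


def classify(tidsets, n_transactions, upper, lower):
    """Split itemsets into large (support >= upper) and pre-large (lower <= support < upper)."""
    large, pre_large = [], []
    for itemset, tids in tidsets.items():
        support = len(tids) / n_transactions
        if support >= upper:
            large.append(itemset)
        elif support >= lower:
            pre_large.append(itemset)
    return sorted(large), sorted(pre_large)


# Hypothetical toy database: transaction id -> items.
transactions = {
    1: {"a", "b"},
    2: {"a", "c"},
    3: {"a", "b", "c"},
    4: {"b"},
    5: {"a", "b"},
}
tidsets = build_tidsets(transactions)
large_before, pre_large_before = classify(tidsets, len(transactions), upper=0.5, lower=0.3)

# Delete transactions 1 and 5, then read the frequent itemsets off the updated tidsets.
deleted = {1, 5}
delete_transactions(tidsets, deleted)
remaining = len(transactions) - len(deleted)
large_after, pre_large_after = classify(tidsets, remaining, upper=0.5, lower=0.3)
```

In this toy run, the itemset ("a", "b") moves from large to pre-large after the deletion, while ("a", "c") moves from pre-large to large because the database shrinks. Shifts of this kind are what a maintenance algorithm has to track.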