Mining Recent Maximal Frequent Itemsets Over Data Streams with Sliding Window
Saihua Cai1, Shangbo Hao1, Ruizhi Sun1, and Gang Wu2
1College of Information and Electrical Engineering, China Agricultural University, China
2Secretary of Computer Science Department, Tarim University, China
Abstract: The sheer volume of data in data streams makes mining all recent frequent itemsets impractical. Because the maximal frequent itemsets imply all the frequent itemsets and are far fewer in number, mining them requires much less time and memory. This paper proposes
an improved method called Recent Maximal Frequent Itemsets Mining (RMFIsM) to mine recent maximal frequent itemsets over
data streams with a sliding window. The RMFIsM method uses two matrices to store the information of the data streams: the first matrix stores the information of each transaction, and the second stores the frequent 1-itemsets. The frequent p-itemsets are mined through an “extension” process applied to the frequent 2-itemsets, and the maximal frequent itemsets are obtained by deleting the sub-itemsets of longer frequent itemsets. Finally, the performance of the RMFIsM method is evaluated through a series of experiments; the results show that the proposed RMFIsM method can mine recent maximal frequent itemsets efficiently.
Keywords: Data streams, recent maximal frequent itemsets, sliding window, matrix structure.
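
The pipeline outlined in the abstract can be illustrated with a small Python sketch. This is not the RMFIsM implementation: the function name, the parameters `window_size` and `min_count`, and the plain 0/1 list-of-lists matrix are assumptions made for illustration, and the second matrix is simplified to a list of frequent items. It only shows the general idea of keeping the most recent transactions in a window, finding frequent 1- and 2-itemsets, extending them to longer itemsets, and then deleting sub-itemsets of longer frequent itemsets.

```python
from itertools import combinations


def recent_maximal_frequent_itemsets(stream, window_size, min_count):
    """Illustrative sketch only (not the authors' RMFIsM code)."""
    # Sliding window: keep only the most recent `window_size` transactions.
    window = [frozenset(t) for t in stream[-window_size:]]
    items = sorted(set().union(*window)) if window else []

    # Transaction matrix: one 0/1 row per transaction, one column per item.
    matrix = [[1 if item in t else 0 for item in items] for t in window]

    def support(itemset):
        # Count the rows in which every item of the itemset is present.
        cols = [items.index(i) for i in itemset]
        return sum(all(row[c] for c in cols) for row in matrix)

    # Frequent 1-itemsets, then frequent 2-itemsets built from them.
    frequent_items = [i for i in items if support((i,)) >= min_count]
    level = [frozenset(p) for p in combinations(frequent_items, 2)
             if support(p) >= min_count]
    all_frequent = [frozenset([i]) for i in frequent_items] + level

    # "Extension": grow each frequent k-itemset by one frequent item.
    while level:
        candidates = {a | {i} for a in level for i in frequent_items
                      if i not in a}
        level = [c for c in candidates if support(c) >= min_count]
        all_frequent.extend(level)

    # Delete every itemset that is a proper subset of another frequent one.
    maximal = [s for s in all_frequent
               if not any(s < other for other in all_frequent)]
    return sorted(sorted(s) for s in maximal)


stream = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c"]]
print(recent_maximal_frequent_itemsets(stream, window_size=4, min_count=2))
```

With a window of the last four transactions and a minimum count of 2, the itemset {a, b, c} occurs only once, so the three pairs are the maximal frequent itemsets. Widening the window to all five transactions makes {a, b, c} frequent, and it then absorbs all of its sub-itemsets.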