PatHT: An Efficient Method of Classification over Evolving Data Streams
Meng Han, Jian Ding, and Juan Li
School of Computer Science and Engineering, North Minzu University, China
Abstract: Some existing classifiers need frequent updates to adapt to concept
changes in data streams. To solve this problem, an adaptive method, the
Pattern-based Hoeffding Tree (PatHT), is proposed to process evolving data
streams. A key technique in training a decision tree classifier is to improve
the efficiency of choosing the optimal splitting attribute; frequent patterns
are used for this purpose.
PatHT incrementally discovers and updates constraint-based closed frequent
patterns. It builds an adaptive, incrementally updated tree based on the
frequent pattern set. It uses a sliding window to cope with concept drift
during pattern mining and a concept drift detector to handle concept change
while training on examples. We tested the performance of PatHT against some
known algorithms using real data streams and synthetic data streams with
different widths of concept change. The experimental results show that our
approach outperforms traditional classification models.
Key words: Data mining; decision tree; data stream classification; closed pattern mining; concept drift.