Tracking Recurring Concepts from Evolving Data Streams using Ensemble Method


Yange Sun1,2, Zhihai Wang2, Jidong Yuan2, and Wei Zhang2

1School of Computer and Information Technology, Xinyang Normal University, China

2School of Computer and Information Technology, Beijing Jiaotong University, China

Abstract: Ensemble models are the most widely used methods for classifying evolving data streams. However, most existing ensemble classification algorithms for data streams do not consider recurring concepts, which commonly occur in real-world applications. Motivated by this challenge, an Ensemble with internal Change Detection (ECD) is proposed to improve performance by exploiting recurring concepts. ECD maintains a pool of classifiers and dynamically adds and removes classifiers in response to a change detector. The algorithm uses a two-window change detection model, which applies the Jensen-Shannon divergence to measure the distance between the distributions of old and recent data. When a change is detected, the repository of stored historical concepts is checked for reuse. Experimental results on both synthetic and real-world data streams demonstrate that the proposed algorithm not only outperforms state-of-the-art methods on standard evaluation metrics, but also adapts well to different types of concept drift, especially when concepts reappear.
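
The two-window idea in the abstract can be illustrated with a minimal sketch. It is not the authors' ECD implementation: it assumes each window holds discrete symbols (for example class labels or discretized feature values), estimates each window's distribution by relative frequency, and flags a change when the base-2 Jensen-Shannon divergence exceeds a threshold. The function names and the threshold value 0.1 are hypothetical choices made for illustration only.

```python
import math
from collections import Counter


def empirical_distribution(window):
    """Relative frequency of each symbol in a non-empty window."""
    if not window:
        raise ValueError("window must not be empty")
    counts = Counter(window)
    total = len(window)
    return {symbol: count / total for symbol, count in counts.items()}


def js_divergence(p, q):
    """Base-2 Jensen-Shannon divergence between two discrete distributions.

    p and q are dicts mapping symbols to probabilities. The result lies
    in [0, 1]: 0 for identical distributions, 1 for disjoint supports.
    """
    symbols = set(p) | set(q)
    # Mixture distribution M = (P + Q) / 2.
    m = {s: 0.5 * (p.get(s, 0.0) + q.get(s, 0.0)) for s in symbols}

    def kl(a):
        # Kullback-Leibler divergence KL(a || m); zero-probability terms add 0.
        return sum(
            a[s] * math.log2(a[s] / m[s]) for s in a if a[s] > 0.0
        )

    return 0.5 * kl(p) + 0.5 * kl(q)


def change_detected(old_window, recent_window, threshold=0.1):
    """Hypothetical decision rule: change if the divergence exceeds threshold."""
    p = empirical_distribution(old_window)
    q = empirical_distribution(recent_window)
    return js_divergence(p, q) > threshold
```

In a streaming setting, one would typically slide the recent window forward as new instances arrive and reset the old window after a detected change. How ECD sizes its windows and sets its threshold is not stated in the abstract.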

Keywords: Data streams, ensemble classification, change detection, recurring concept, Jensen-Shannon divergence.

Received August 10, 2016; accepted July 8, 2018