Clustering Based on Correlation Fractal Dimension Over an
Evolving Data Stream
Anuradha Yarlagadda¹, Murthy Jonnalagedda², and Krishna Munaga²
¹Department of Computer Science and Engineering, Jawaharlal Nehru Technological University, India
²Department of Computer Science and Engineering, University College of Engineering Kakinada, India
Abstract: Online clustering of evolving, high-dimensional data is a significant
challenge for data mining applications. Although many clustering strategies have
been proposed, the task remains open because published algorithms perform poorly
on high-dimensional datasets, struggle to find arbitrarily shaped clusters, and
handle outliers badly. Knowing the fractal characteristics of a dataset can help
abstract the dataset and provide insightful hints for the clustering process.
This paper presents a novel strategy, FractStream, for clustering data streams
using fractal dimension, basic window technology, and the damped window model.
Core fractal-clusters, progressive fractal-clusters, and outlier fractal-clusters
are identified, with the aim of reducing search complexity and execution time.
Pruning strategies based on the weight associated with each cluster are also
employed, which reduces main-memory usage. An experimental study over a number
of datasets demonstrates the effectiveness and efficiency of the proposed
technique.
Keywords: Cluster, data stream, fractal, self-similarity, sliding window, damped window.
Received January 24, 2014; accepted October 14, 2014
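
As a rough illustration of the quantity named in the title, the correlation fractal dimension of a point set can be estimated by box counting: overlay grids of cell side r, sum the squared fraction of points falling in each cell, and take the slope of log(sum) against log(r). The sketch below is not the FractStream procedure itself. The function name `correlation_fractal_dimension`, the choice of radii, and the rescaling of the data to the unit hypercube are assumptions made for this example.

```python
import numpy as np

def correlation_fractal_dimension(points, radii):
    """Estimate the correlation fractal dimension D2 by box counting.

    Illustrative sketch only (assumed helper, not from the paper):
    D2 is the slope of log(sum_i p_i^2) versus log(r), where p_i is the
    fraction of points in the i-th grid cell of side r.
    """
    points = np.asarray(points, dtype=float)
    # Rescale with a single factor so the data fits in the unit hypercube
    # without distorting its geometry.
    mins = points.min(axis=0)
    scale = (points.max(axis=0) - mins).max()
    if scale == 0:
        scale = 1.0
    unit = (points - mins) / scale

    log_r, log_s = [], []
    for r in radii:
        # Assign each point to a grid cell of side r.
        cells = np.floor(unit / r).astype(np.int64)
        _, counts = np.unique(cells, axis=0, return_counts=True)
        p = counts / len(points)
        log_r.append(np.log(r))
        log_s.append(np.log(np.sum(p ** 2)))

    # Least-squares slope of the log-log plot.
    slope, _intercept = np.polyfit(log_r, log_s, 1)
    return slope

if __name__ == "__main__":
    radii = [1 / 2, 1 / 4, 1 / 8, 1 / 16, 1 / 32]
    x = np.linspace(0.0, 1.0, 20000)
    line = np.column_stack([x, x])          # points on a diagonal line
    square = np.random.default_rng(0).random((20000, 2))  # filled square
    print("line:", correlation_fractal_dimension(line, radii))
    print("square:", correlation_fractal_dimension(square, radii))
```

With these inputs the estimate should come out near 1 for the line and near 2 for the filled square, matching their intrinsic dimensions. On finite samples the estimate is biased slightly low once cells become so small that they hold only a few points, so the smallest radius should be chosen with the sample size in mind.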