Maximum Spanning Tree Based Redundancy Elimination for Feature Selection of High Dimensional Data

Bharat Singh and Om Prakash Vyas

Department of Information Technology, Indian Institute of Information Technology-Allahabad, India

Abstract: Feature selection is a preprocessing step for high-dimensional data that improves the speed and efficiency of subsequent analysis. It selects the most prominent features from a feature set that may contain both relevant and redundant features. It also lightens the burden on classification techniques, making them faster and more efficient. We introduce a novel two-tiered feature selection architecture that can select relevant features as well as remove redundant ones. Our approach exploits the ability to identify highly correlated nodes in a tree. The selected features form the reduced dataset, which is then tested with various classification techniques to evaluate their performance. To demonstrate the benefits of our approach, we apply several basic classification algorithms to benchmark datasets.

Keywords: Data mining, feature selection, tree based approaches, maximum spanning tree, high dimensional data.
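
The two-tier idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which correlation measure, thresholds, or tree-partitioning rule the paper uses. The sketch assumes absolute Pearson correlation as the measure, Kruskal's algorithm for the maximum spanning tree, and two hypothetical thresholds, `relevance_min` and `redundancy_min`. All function names and the toy data are illustrative only.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)


def find(parent, i):
    """Union-find root lookup with path halving."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def maximum_spanning_tree(nodes, weight):
    """Kruskal's algorithm, taking the heaviest edges first."""
    parent = {i: i for i in nodes}
    tree = []
    for u, v in sorted(weight, key=weight.get, reverse=True):
        ru, rv = find(parent, u), find(parent, v)
        if ru != rv:  # edge joins two components, so it creates no cycle
            parent[ru] = rv
            tree.append((u, v))
    return tree


def select_features(columns, target, relevance_min=0.3, redundancy_min=0.9):
    # Tier 1 (assumed): keep features sufficiently correlated with the target.
    relevance = [abs(pearson(c, target)) for c in columns]
    kept = [i for i, r in enumerate(relevance) if r >= relevance_min]

    # Tier 2 (assumed): maximum spanning tree over feature-feature correlation.
    weight = {(a, b): abs(pearson(columns[a], columns[b]))
              for a, b in combinations(kept, 2)}
    tree = maximum_spanning_tree(kept, weight)

    # Keep only strong tree edges; each remaining component is treated
    # as a cluster of mutually redundant features.
    parent = {i: i for i in kept}
    for u, v in tree:
        if weight[(u, v)] >= redundancy_min:
            parent[find(parent, u)] = find(parent, v)

    # Represent each cluster by its most relevant feature.
    best = {}
    for i in kept:
        root = find(parent, i)
        if root not in best or relevance[i] > relevance[best[root]]:
            best[root] = i
    return sorted(best.values())


# Toy data (hypothetical): feature 1 nearly duplicates feature 0,
# feature 2 is uncorrelated with the target, feature 3 is weakly relevant.
target = [0, 0, 0, 1, 1, 1]
columns = [
    [1, 2, 3, 7, 8, 9],
    [2, 4, 6, 14, 16, 19],
    [5, 1, 4, 2, 5, 3],
    [0, 1, 0, 1, 0, 1],
]
selected = select_features(columns, target)
print(selected)
```

On this toy data the sketch keeps features 0 and 3: feature 2 fails the relevance filter, and feature 1 falls in the same cluster as feature 0 but is slightly less correlated with the target. The reduced dataset would then consist of the selected columns only, ready to be passed to a classifier.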

Received February 15, 2015; accepted December 21, 2015
