Correlation Dependencies between Variables in Feature Selection on Boolean Symbolic Objects
Djamal Ziani
College of Business and Administration, Al Yamamah University, Saudi Arabia
Abstract: Feature selection is an important process in data
analysis and data mining. The increasing size, complexity, and multi-valued
nature of data necessitate the use of Symbolic Data Analysis (SDA), which
utilizes symbolic objects instead of classical tables, for data analysis. The
symbolic objects are created by using abstraction or generalization techniques
on individuals, and they represent concepts or clusters. Using dependencies
between variables is crucial in SDA to improve the description of these objects
and to eliminate incoherencies and over-generalization.
This study shows how correlation dependencies between variables can be
processed on Boolean Symbolic Objects (BSOs) in feature selection. A new
feature selection criterion that considers the dependencies between variables
and a method for dealing with the computational complexity are also presented.
Keywords: Feature selection, dependencies, symbolic data analysis,
discrimination criteria.