EFFICIENT FEATURE SUBSET SELECTION MODEL FOR HIGH DIMENSIONAL DATA

Authors

Chinnu C Georgel and Abdul Ali
Ilahia College of Engineering and Technology, India

Abstract

This paper proposes a new method for reducing the size of high-dimensional datasets by identifying and removing irrelevant and redundant features. Dataset reduction is important in machine learning and data mining. A measure of dependence is used to evaluate the relationship between each feature and the target concept, and between pairs of features, in order to remove irrelevant and redundant features respectively. The proposed method first removes all irrelevant features and then constructs a minimum spanning tree of the relevant features using Prim’s algorithm. Splitting the minimum spanning tree based on the dependency between features produces a forest. A representative feature from each tree in the forest is selected to form the final feature subset.
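
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the dependence measure or the exact splitting rule, so absolute Pearson correlation is used here as a stand-in, and the function name `select_features` together with the parameters `relevance_threshold` and `split_threshold` are hypothetical choices made for illustration only.

```python
import math

def abs_corr(x, y):
    """Absolute Pearson correlation (assumed stand-in for the dependence measure)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return 0.0 if sx == 0 or sy == 0 else abs(cov) / (sx * sy)

def select_features(columns, target, relevance_threshold=0.2, split_threshold=0.8):
    """columns: dict mapping feature name -> list of values (hypothetical layout)."""
    # Step 1: drop irrelevant features (weak dependence on the target).
    relevance = {name: abs_corr(col, target) for name, col in columns.items()}
    relevant = [name for name in columns if relevance[name] > relevance_threshold]
    if not relevant:
        return []

    def dep(a, b):
        return abs_corr(columns[a], columns[b])

    # Step 2: Prim's algorithm on the complete graph of relevant features.
    # Edge weight is 1 - dependence, so strongly dependent features are close.
    in_tree = {relevant[0]}
    mst_edges = []
    while len(in_tree) < len(relevant):
        u, v = min(
            ((a, b) for a in in_tree for b in relevant if b not in in_tree),
            key=lambda e: 1.0 - dep(*e),
        )
        mst_edges.append((u, v))
        in_tree.add(v)

    # Step 3: split the tree by cutting edges whose dependence is below
    # split_threshold (assumed rule); union-find tracks the resulting trees.
    parent = {name: name for name in relevant}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for u, v in mst_edges:
        if dep(u, v) >= split_threshold:
            parent[find(u)] = find(v)

    # Step 4: keep the most target-relevant feature from each tree.
    clusters = {}
    for name in relevant:
        clusters.setdefault(find(name), []).append(name)
    return sorted(max(members, key=relevance.get) for members in clusters.values())

# Toy data: "a_copy" nearly duplicates "a"; "noise" is unrelated to the target.
columns = {
    "a":      [1, -1, 1, -1, 1, -1, 1, -1],
    "a_copy": [1, -2, 1, -1, 1, -1, 1, -1],
    "b":      [1, 1, -1, -1, 1, 1, -1, -1],
    "noise":  [1, 1, 1, 1, -1, -1, -1, -1],
}
target = [2, 0, 0, -2, 2, 0, 0, -2]  # equals a + b element-wise
selected = select_features(columns, target)
print(selected)  # ['a', 'b']
```

In this toy run, `noise` is discarded as irrelevant, `a` and `a_copy` stay connected in one tree because they are strongly dependent, and the weak edge to `b` is cut, so one representative survives from each of the two trees. Both thresholds would need tuning for real data, and a different dependence measure can be swapped in by replacing `abs_corr`.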

Keywords

Feature subset selection, filter technique, feature clustering, feature reduction