Feature Ranking Derived from Data Mining Process

Aleš Pilný, Pavel Kordík, and Miroslav Šnorek

Department of Computer Science and Engineering, FEE, Czech Technical University, Prague, Czech Republic
pilnya1@fel.cvut.cz, kordikp@fel.cvut.cz, snorek@fel.cvut.cz

Abstract. Most common feature ranking methods are based on a statistical approach. This paper compares several statistical methods with a new method for feature ranking derived from the data mining process. This method ranks features according to the percentage of child units that survived the selection process. A child unit is a processing element transforming the parent input features to the output. After training, units are interconnected in a feedforward hybrid neural network called GAME. The selection process is realized by means of a niching genetic algorithm, where units connected to the least significant features starve and fade from the population. Parameters of the new feature ranking algorithm are investigated, and a comparison among different methods is presented on well-known real-world and artificial data sets.

1 Introduction

Nowadays, data with few input features is the exception. Each feature adds one dimension to the data vectors. For effective and more accurate data mining, it is necessary to use preprocessing methods that reduce the dimensionality of the input data or describe the relevance of each feature. Methods for reducing data dimension, called Feature Selection (FS) [12], search for a subset of relevant features from an initial set of features, while Feature Extraction (FE) methods [14] create a subset of new features containing information extracted from the original set of features. A relaxed setting for FS is Feature Ranking [5], which ranks all original features according to their relevance. Feature Selection algorithms may be divided into three categories.
Algorithms in the first category are based on filters [2], where the significance of features is computed outside the classification algorithm. Wrapper methods [6], from the second category, instead depend on a classifier to evaluate the quality of the selected features. Finally, Embedded methods [3] select relevant features within the learning process of internal parameters (e.g. weights between layers of neural networks). The goal of feature selection is to avoid selecting more or fewer variables than necessary. In practical applications, it is impossible to obtain a complete set of relevant features. Therefore, the modelled system is an open system, and all important features that are not included in the data set (for whatever reason) are summarised as noise [11].
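
To make the filter category concrete, the following minimal sketch ranks features by the absolute value of their Pearson correlation with the target, computed without any classifier. It illustrates a generic statistical filter only and is not the GAME-based method proposed in this paper; the function names `pearson` and `filter_rank` and the toy data are assumptions introduced for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear information
    return cov / sqrt(var_x * var_y)


def filter_rank(columns, target):
    """Return feature names ordered from most to least relevant.

    columns: dict mapping a feature name to its list of values.
    Relevance is |Pearson correlation| with the target (a filter score,
    computed outside any classification algorithm).
    """
    scores = {name: abs(pearson(values, target))
              for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data: x1 determines the target, x2 is partly related,
# x3 is uncorrelated with it.
columns = {
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "x2": [2.0, 1.0, 4.0, 3.0, 5.0],
    "x3": [1.0, -1.0, 0.0, -1.0, 1.0],
}
target = [2.0, 4.0, 6.0, 8.0, 10.0]

ranking = filter_rank(columns, target)
print(ranking)
```

A wrapper method would replace the correlation score with the accuracy of a classifier trained on candidate feature subsets, and an embedded method would read relevance from the trained model's internal parameters.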