12

Optimal Feature Generation with Genetic Algorithms and FLDR in a Restricted-Vocabulary Speech Recognition System

Julio César Martínez-Romo 1, Francisco Javier Luna-Rosas 2, Miguel Mora-González 3, Carlos Alejandro de Luna-Ortega 4 and Valentín López-Rivas 5
1,2,5 Instituto Tecnológico de Aguascalientes
3 Universidad de Guadalajara, Centro Universitario de los Lagos
4 Universidad Politécnica de Aguascalientes
Mexico

1. Introduction

In every pattern recognition problem there is a need for variable and feature selection and, in many cases, feature generation. In pattern recognition, the term variable usually refers to the raw measurements or raw values taken from the subjects to be classified, while the term feature refers to the result of the transformations applied to the variables in order to map them into another domain or space, in which the newly calculated features are expected to have greater discriminant capability. Very popular cases of feature generation are principal component analysis (PCA), in which the variables are projected into a lower-dimensional space where the new features can be used to visualize the underlying class distributions in the original data [1], and the Fourier transform, in which a few of its coefficients can serve as new features [2], [3]. Sometimes the literature makes no distinction between variables and features and uses the two terms interchangeably [4], [5]. Although many variables and features can be obtained for classification, not all of them possess discriminant capability; moreover, some of them could confuse a classifier. For this reason, the designer of the classification system needs to refine the choice of variables and features. Several specific techniques for this purpose are available [1], and some of them are reviewed later in this chapter.
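As an illustration of the distinction between variables and features, the following minimal sketch treats the samples of a raw signal as the variables and generates features from the magnitudes of a few discrete Fourier transform coefficients. The function name `dft_features`, the synthetic sinusoid, and the choice of six coefficients are assumptions made only for this example; they are not taken from the system described in this chapter.

```python
import cmath
import math

def dft_features(variables, n_coeffs):
    """Return the normalized magnitudes of the first n_coeffs DFT coefficients.

    variables: raw measurements (the "variables" in the sense used above).
    n_coeffs:  number of low-order coefficients kept as generated features.
    """
    n = len(variables)
    features = []
    for k in range(n_coeffs):
        # k-th DFT coefficient computed directly from its definition
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(variables))
        features.append(abs(coeff) / n)
    return features

# Hypothetical raw variables: 32 samples of a sinusoid with 3 cycles per window
raw = [math.sin(2 * math.pi * 3 * t / 32) for t in range(32)]

# Six generated features stand in for the 32 raw variables
features = dft_features(raw, 6)
```

In this toy case nearly all of the signal energy falls in the coefficient with index 3, so a handful of features summarizes the 32 raw variables. Whether such features actually discriminate between classes still has to be assessed by feature selection.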
Optimal feature generation is the generation of features under some optimality criterion, usually embodied by a cost function that is used to search the solution space of the problem at hand and to provide the best option for the classification problem. Examples of such techniques are genetic algorithms [6] and simulated annealing [1]. In particular, genetic algorithms are used in this work. Speech recognition has been a topic of high interest in the research arena of the pattern recognition community since the beginning of the current computing age [7], [8]; it is due,