Learning and Nonlinear Models (L&NLM), Journal of the Brazilian Neural Network Society, Vol. 8, Iss. 3, pp. 125-134, 2010. © Sociedade Brasileira de Redes Neurais (SBRN)

FEATURE SELECTION VIA GENETIC ALGORITHMS IN THE CLASSIFICATION OF ANTI-SNAKE VENOM MEDICINAL PLANTS

Lariza Laura de Oliveira 1, Gabriela Felix Persinoti 2, Silvana Giuliatti 2 and Renato Tinós 1

1 Grupo de Informática Biomédica, Departamento de Física e Matemática, Faculdade de Filosofia, Ciência e Letras de Ribeirão Preto, Universidade de São Paulo
2 Departamento de Genética, Faculdade de Medicina de Ribeirão Preto, Universidade de São Paulo

{larizalaura@usp.br, gabi.felix@gmail.com, silvana@rge.fmrp.usp.br, rtinos@ffclrp.usp.br}

Abstract- In this work, a Genetic Algorithm (GA) is employed in feature selection for the classification of medicinal plants with snake venom-neutralizing properties. The classification is performed by an Artificial Neural Network (ANN), which indicates at its output the medicinal plants with anti-snake venom action when an amino acid sequence of a snake venom is presented at its input. GAs and ANNs are Artificial Intelligence techniques that have been used in several similar optimization and classification problems. Here, the feature selection system uses the classification error rate on the training set and the number of attributes as the fitness of each individual of the GA. The validation results for the classification system indicate that ANNs can be used to aid the selection of medicinal plants with snake venom-neutralizing properties. In addition, feature selection based on GAs can help researchers select the amino acid sequences of snake venoms that may be important in the interaction with medicinal plant compounds.

Key words- Bioinformatics, Genetic Algorithms, Artificial Neural Networks, Artificial Intelligence, Snake venom, Medicinal plants.
1 Introduction

Snakebite envenomation is considered a serious public health problem, not only in Brazil but throughout Latin America [1]. Snake venoms are complex mixtures of proteins, including phospholipase A2 (PLA2), proteolytic enzymes, and others. Envenomation by snakebite is usually treated by administering antiophidic serum. However, patients often do not have fast access to the serum, since accidents usually occur in remote places. Furthermore, the local damage induced by snake venoms, such as myonecrosis, can sometimes be irreversible [2]. Fast procedures can therefore help the patient until the usual serum treatment can be administered. One such procedure is the use of medicinal plant extracts, which can be found close to the site of the accident. In Brazilian popular medicine, many plants are employed in the treatment of snakebites, but few have had their effects scientifically investigated [1].

Artificial Intelligence (AI) techniques have been employed successfully in different bioinformatics problems, such as sequence analysis, protein structure prediction, and sequence alignment [3], and can be useful for this problem too. In this paper, feature selection based on Genetic Algorithms (GAs), an AI technique employed in optimization problems, is used to select attributes for the classification of medicinal plants with snake venom-neutralizing properties. The classification is performed by another AI technique, Artificial Neural Networks (ANNs), which should indicate at its output a medicinal plant with anti-snake venom action when a snake venom amino acid sequence is presented at its input.

This work has two main objectives. The first is to generate a classification system (software) that relates a medicinal plant to a venom protein. Such a system can help researchers explore new relations and associate plants that have not yet been studied with other plants that have known anti-venom properties.
The second main objective is to select important features of the input data in order to improve classification. The feature selection could then indicate which subsets of amino acids present in the snake venom proteins are the most important in the interaction with the medicinal plants. Such information is important for a better understanding of the snake venom-neutralizing properties of medicinal plant compounds.

In the next section, the materials and methods used in this work are described. Then the experimental results with two medicinal plants are presented. It is important to observe that the AI system proposed here can be extended to cope with more plants and snake venoms. This paper aims to show that it is possible to construct an AI system that classifies and selects features in the problem of relating medicinal plants to snake venoms.
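To make the idea of GA-based feature selection concrete, the following minimal Python sketch shows one way a fitness value could combine the classification error rate on the training set with the number of selected attributes. It is an illustration under stated assumptions, not the system used in this work: the weight `alpha`, the linear combination, and the function names are hypothetical, and a simple nearest-centroid classifier stands in for the ANN so that the example stays self-contained.

```python
def training_error(X, y, mask):
    """Training-set error rate of a nearest-centroid classifier that
    only sees the features whose bit in `mask` is 1.
    (Stand-in for the ANN; an assumption made for brevity.)"""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 1.0  # no feature selected: treat as the worst case

    # Accumulate per-class sums over the selected features.
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(cols))
        for k, j in enumerate(cols):
            acc[k] += row[j]
        counts[label] = counts.get(label, 0) + 1
    centroids = {c: [s / counts[c] for s in sums[c]] for c in sums}

    def sq_dist(row, centroid):
        return sum((row[j] - centroid[k]) ** 2 for k, j in enumerate(cols))

    # Count training samples assigned to the wrong class centroid.
    errors = 0
    for row, label in zip(X, y):
        predicted = min(centroids, key=lambda c: sq_dist(row, centroids[c]))
        if predicted != label:
            errors += 1
    return errors / len(X)


def fitness(mask, X, y, alpha=0.9):
    """Higher is better. `alpha` (hypothetical) weights accuracy against
    the fraction of attributes kept by the binary individual `mask`."""
    error = training_error(X, y, mask)
    kept_fraction = sum(mask) / len(mask)
    return alpha * (1.0 - error) + (1.0 - alpha) * (1.0 - kept_fraction)
```

In this sketch each GA individual is a binary mask over the input attributes, so an individual that reaches the same training error with fewer attributes receives a higher fitness. Any other trade-off between the two terms could be substituted.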