IJCSS, Vol.1, No.1, 2000 73

DISCOVERING HIERARCHICAL DECISION RULES WITH EVOLUTIVE ALGORITHMS IN SUPERVISED LEARNING

José C. Riquelme, Jesús S. Aguilar and Miguel Toro

Departamento de Lenguajes y Sistemas Informáticos. Facultad de Informática y Estadística. Avenida Reina Mercedes s/n, 41012 Sevilla, Spain

e-mail: riquelme@lsi.us.es, aguilar@lsi.us.es, miguel.toro@lsi.us.es

Accepted in the final form: 15 November, 2000

Abstract. This paper describes a new approach, HIDER (HIerarchical DEcision Rules), for learning rules in continuous and discrete domains based on evolutive algorithms. The algorithm produces a hierarchical set of rules, that is, the rules must be applied in a specific order. With this policy, the number of rules may be reduced because one rule may lie inside another. The evolutive algorithm uses both real and binary codification for the individuals of the population and introduces several new genetic operators. In addition, this paper discusses the capability of learning systems based on an evolutive algorithm to reduce both the number of rules and the number of attributes involved in the rule set. We have tested our system on real data from the UCI repository. The results of a 10-fold cross validation are compared to C4.5's and they show an important improvement.

Keywords: Evolutive Algorithms, Supervised learning, Decision Lists.

1 Introduction

Supervised learning is used when the user knows the outcomes of the data samples and wants to predict the outcome of a new unseen instance. An algorithm carries out the prediction (classification) and it can produce knowledge by using a suitable data structure. Some techniques, like nearest neighbour searching or neural networks, can classify an instance, but cannot obtain the knowledge from the information stored in the database. This sort of learning is called lazy learning because the algorithm does not generate a model of knowledge for the database.
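To illustrate why such a classifier yields predictions but no explicit knowledge, a minimal 1-nearest-neighbour sketch in Python follows. The function name, the toy data and the choice of Euclidean distance are assumptions made for this example only; they are not part of HIDER.

```python
from math import dist  # Euclidean distance, available from Python 3.8


def nearest_neighbour_predict(samples, query):
    """Return the label of the stored sample closest to `query` (1-NN).

    `samples` is a list of (feature_vector, label) pairs. No model is
    built: every prediction scans the whole labelled data set again.
    """
    _, label = min(samples, key=lambda s: dist(s[0], query))
    return label


# Hypothetical toy data: two numeric attributes and a class label.
training = [
    ((1.0, 1.0), "A"),
    ((1.5, 2.0), "A"),
    ((5.0, 5.0), "B"),
    ((6.0, 4.5), "B"),
]

print(nearest_neighbour_predict(training, (1.2, 1.4)))  # prints A
print(nearest_neighbour_predict(training, (5.5, 5.0)))  # prints B
```

The only artefact left after prediction is the original data set itself; nothing in it can be read as a description of the classes.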
However, other techniques produce sets of rules with a specific structure: decision trees, decision lists, or simply sets of rules. In general, when a rule-based framework is used to express the acquired knowledge, the result is often called a set of decision rules. Such rules can subsequently be used both to infer properties of the corresponding categories and to classify other, previously unseen, examples from the original space. The algorithm is run once to produce the set of rules, which can then classify many new instances. Other techniques, such as nearest neighbour classifiers, need one execution every time we want to predict the class of a new unseen example. Therefore, the difference between the techniques that can produce knowledge and those that cannot is very important from the point of view of the number of executions. Supervised learning algorithms tend to emulate human behaviour, since from the input conditions they try to predict the action to be taken, based upon experience with similar situations. Such experience