Data & Knowledge Engineering 44 (2003) 31–48
www.elsevier.com/locate/datak

On the quest for easy-to-understand splitting rules

Fernando Berzal a,*, Juan-Carlos Cubero a, Fernando Cuenca b, María J. Martín-Bautista a

a Department of Computer Science and Artificial Intelligence, E.T.S. Ingeniería Informática, University of Granada, Granada 18071, Spain
b Xfera, Madrid, Spain

Abstract

Decision trees are probably the most popular and commonly used classification model. They are built recursively, following a top-down approach (from general concepts to particular examples), by repeated splits of the training dataset. The chosen splitting criterion may affect the accuracy of the resulting classifier, but not significantly: none of the splitting criteria proposed in the literature has proved to be universally better than the rest. Although they all yield similar results, their complexity varies significantly, and they are not always suitable for building multi-way decision trees. Here we propose two new splitting rules which obtain results similar to those of other well-known criteria when used to build multi-way decision trees, while their simplicity makes them ideal for non-expert users.
© 2002 Elsevier Science B.V. All rights reserved.

Keywords: Supervised learning; Classification; Decision trees; Splitting rules

1. Introduction

Decision trees are probably the most popular and commonly used classification model; see, e.g., [5,16]. Decision trees are built recursively following a top-down approach (from general concepts to particular examples). That is why the acronym TDIDT, which stands for top-down induction of decision trees, is used to refer to this family of algorithms.

* Corresponding author. Tel.: +34-958-242376; fax: +34-958-243317.
E-mail addresses: fberzal@decsai.ugr.es (F. Berzal), jc.cubero@decsai.ugr.es (J.-C. Cubero), fernando.cuenca@xfera.com (F. Cuenca), mbautis@decsai.ugr.es (M.J. Martín-Bautista).
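The top-down induction scheme just described can be sketched in a few lines of code. The sketch below is only an illustration of generic TDIDT with multi-way splits on categorical attributes; the Gini impurity used as the splitting criterion and all names are illustrative assumptions, not the splitting rules proposed in this paper.

```python
# Minimal, hypothetical TDIDT sketch: multi-way splits on categorical
# attributes, with Gini impurity as a stand-in splitting criterion.
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def build_tree(rows, labels, attributes):
    """Recursively split the training set, one attribute per level."""
    if len(set(labels)) == 1 or not attributes:
        # Leaf node: predict the majority class of the remaining examples.
        return Counter(labels).most_common(1)[0][0]

    def split_score(attr):
        # Weighted impurity of the partition induced by a multi-way
        # split on `attr` (lower is better).
        groups = {}
        for row, y in zip(rows, labels):
            groups.setdefault(row[attr], []).append(y)
        return sum(len(g) / len(labels) * gini(g) for g in groups.values())

    best = min(attributes, key=split_score)

    # One branch per observed value of the chosen attribute.
    partitions = {}
    for row, y in zip(rows, labels):
        sub_rows, sub_labels = partitions.setdefault(row[best], ([], []))
        sub_rows.append(row)
        sub_labels.append(y)

    remaining = [a for a in attributes if a != best]
    branches = {value: build_tree(sub_rows, sub_labels, remaining)
                for value, (sub_rows, sub_labels) in partitions.items()}
    return (best, branches)
```

Each internal node is represented as a pair (attribute, branches); the recursion bottoms out when a subset of the training data is pure or no attributes remain, which mirrors the "repeated splits of the training dataset" described in the abstract.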