Artificial Intelligence Review 18: 77–95, 2002.
© 2002 Kluwer Academic Publishers. Printed in the Netherlands.

A Perspective View and Survey of Meta-Learning

RICARDO VILALTA and YOUSSEF DRISSI
IBM T.J. Watson Research Center, 30 Saw Mill River Rd, Hawthorne NY, 10532 USA
(E-mail: vilalta@us.ibm.com, youseffd@us.ibm.com)

Abstract. Different researchers hold different views of what the term meta-learning exactly means. The first part of this paper provides our own perspective view, in which the goal is to build self-adaptive learners (i.e., learning algorithms that improve their bias dynamically through experience by accumulating meta-knowledge). The second part provides a survey of meta-learning as reported in the machine-learning literature. We find that, despite different views and research lines, one question remains constant: how can we exploit knowledge about learning (i.e., meta-knowledge) to improve the performance of learning algorithms? Clearly, the answer to this question is key to the advancement of the field and continues to be the subject of intensive research.

Keywords: classification, inductive learning, meta-knowledge

1. Introduction

Meta-learning studies how learning systems can increase in efficiency through experience; the goal is to understand how learning itself can become flexible according to the domain or task under study. All learning systems work by adapting to a specific environment, which reduces to imposing a partial ordering, or bias, on the set of possible hypotheses explaining a concept (Mitchell 1980). Meta-learning differs from base-learning in the scope of the level of adaptation: meta-learning studies how to choose the right bias dynamically, whereas in base-learning the bias is fixed a priori or user-parameterized. In a typical inductive-learning scenario, applying a base-learner (e.g., a decision tree, neural network, or support vector machine) to some data produces a hypothesis that depends on the fixed bias embedded in the learner. Learning takes place at the base level because the quality of the hypothesis normally improves with an increasing number of examples. Nevertheless, successive applications of the learner over the same data always produce the same hypothesis, independently of performance; no knowledge is extracted across domains or tasks (Pratt and Thrun 1997). Meta-learning aims at discovering ways to dynamically search for the best learning strategy as the number of tasks increases (Thrun 1998; Rendell et al. 1987). A computer program qualifies as a learning machine if its