Learning Hierarchical Classifiers with Class Taxonomies

Feihong Wu, Jun Zhang, and Vasant Honavar

Artificial Intelligence Research Laboratory
Department of Computer Science
Iowa State University
Ames, Iowa 50011-1040, USA
{wuflyh, jzhang, honavar}@cs.iastate.edu

Abstract. As more and more data with class taxonomies emerge in diverse fields, such as pattern recognition, text classification, and gene function prediction, we need to extend traditional machine learning methods to solve classification problems in such data sets, which present more challenges than common pattern classification problems. In this paper, we define the structured label classification problem and investigate two learning approaches that can learn classifiers from such data sets. We also develop distance metrics with a label mapping strategy to evaluate the results. We present experimental results that demonstrate the promise of the proposed approaches.

1 Introduction

Pattern classification is an important topic in machine learning and data mining research, and many state-of-the-art pattern classification algorithms have been developed. However, most such algorithms target classification problems with a single class label, which assumes that all class labels are mutually exclusive. In many real-world problems, it is quite common to have more complex class labels, such as multiple topic categories for text documents and multiple functional classes for biological data.
The main characteristics of this problem are: (1) class labels are naturally organized as a taxonomy structure (Class Taxonomy), which defines an abstraction over class labels; (2) because of the large number of possible class combinations within a class taxonomy and the relatively sparse data for combinatorial class labels, the problem is hard for many standard classifier learning algorithms; (3) standard evaluation methods for classifiers targeting single-label problems might not be suitable for evaluating classifiers that solve complex class label problems, so new evaluation approaches are needed. Although such problems have been explored to some extent, they have not been fully formalized, and we still lack a general strategy for solving them. In this paper, we formalize the structured label learning problem and analyze its distinct features. We propose and implement two approaches, with general strategies for applying any up-to-date classification learning method. Because of the restrictions of the standard evaluation criterion, we propose a new evaluation