Learning Hierarchical Classifiers with Class Taxonomies

Feihong Wu, Jun Zhang, and Vasant Honavar

Artificial Intelligence Research Laboratory
Department of Computer Science
Iowa State University
Ames, Iowa 50011-1040, USA
{wuflyh, jzhang, honavar}@cs.iastate.edu

Abstract. As more and more data with class taxonomies emerge in diverse fields, such as pattern recognition, text classification, and gene function prediction, we need to extend traditional machine learning methods to solve classification problems on such data sets, which present more challenges than common pattern classification problems. In this paper, we define the structured label classification problem and investigate two learning approaches that can learn classifiers from such data sets. We also develop distance metrics with a label mapping strategy to evaluate the results. We present experimental results that demonstrate the promise of the proposed approaches.

1 Introduction

Pattern classification is an important topic in machine learning and data mining research, and many state-of-the-art pattern classification algorithms have been developed. However, most such algorithms target classification problems with a single class label, which assumes that all class labels are mutually exclusive. In many real-world problems, it is quite common to have more complex class labels, such as multiple topic categories for text documents and multiple functional classes for biological data.
The main characteristics of this problem are: (1) class labels are naturally organized in a taxonomy structure (a class taxonomy) that defines an abstraction over class labels; (2) because of the large number of possible class combinations within a class taxonomy and the relatively sparse data for combinatorial class labels, the problem is hard for many standard classifier learning algorithms; (3) standard evaluation methods for classifiers targeting single-label problems might not be suitable for evaluating classifiers that solve complex class label problems, so new evaluation approaches are needed. Although such problems have been explored to some extent, they have not been fully formalized, and we still lack a general strategy for solving them. In this paper, we formalize the structured label learning problem and analyze its distinct features. We propose and implement two approaches, each a general strategy that can be applied with any up-to-date classification learning method. Because of the restrictions of the standard evaluation criterion, we propose a new evaluation