An Analysis of the Anti-Learning Phenomenon for the Class Symmetric Polyhedron

Adam Kowalczyk 1 and Olivier Chapelle 2

1 National ICT Australia and RSISE, The Australian National University, Canberra, Australia
adam.kowalczyk@nicta.com.au
2 Max Planck Institute for Biological Cybernetics, Tübingen, Germany
olivier.chapelle@tuebingen.mpg.de

Abstract. This paper deals with an unusual phenomenon in which most machine learning algorithms yield good performance on the training set but systematically worse than random performance on the test set. This has been observed so far for some natural data sets, and demonstrated for some synthetic data sets, when the classification rule is learned from a small set of training samples drawn from a high-dimensional space. The initial analysis presented in this paper shows that anti-learning is a property of the data set and is quite distinct from over-fitting of the training data. Moreover, the analysis leads to a specification of machine learning procedures which can overcome anti-learning and generate machines able to classify training and test data consistently.

1 Introduction

The goal of a supervised learning system for binary classification is to classify instances of an independent test set as well as possible on the basis of a model learned from a labeled training set. Typically, the model has similar classification behavior on both the training and test sets, i.e., it classifies training and test instances with precision higher than the expected accuracy of the random classifier; it is thus in what we refer to as "the learning mode". However, there are real-life situations where better than random performance on the training set yields systematically worse than random performance on the off-training test set. One example is the Aryl Hydrocarbon Receptor classification task in KDD Cup 2002 [3, 9, 11]. Such systems exhibit what we call "the anti-learning mode".
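To give some intuition for how anti-learning can be a property of the data rather than of the algorithm, the following sketch (our own toy illustration, not the paper's exact construction) builds a small "class-symmetric" Gram matrix in which every point is slightly more similar to every opposite-class point (similarity b) than to the other points of its own class (similarity a < b), while self-similarity remains 1. A nearest-centroid classifier in the induced feature space then separates the training split perfectly, yet misclassifies every held-out point:

```python
import numpy as np

# Toy parameters (hypothetical, chosen only to keep the Gram matrix
# positive definite): m points per class, cross-class similarity b
# slightly above same-class similarity a.
m = 10
a, b = 0.0, 0.05
y = np.array([+1] * m + [-1] * m)

# Class-symmetric Gram matrix: K[i,i] = 1, K[i,j] = a if y_i == y_j else b.
same = (y[:, None] == y[None, :])
K = np.where(same, a, b)
np.fill_diagonal(K, 1.0)

# Embed the points in Euclidean space via a Cholesky factor,
# so that the rows of X satisfy X @ X.T == K.
X = np.linalg.cholesky(K)

# Hold out the last two points of each class for testing.
test_idx = np.array([m - 2, m - 1, 2 * m - 2, 2 * m - 1])
train_idx = np.setdiff1d(np.arange(2 * m), test_idx)

# Nearest-centroid classifier: sign(<x, mu_+> - <x, mu_->); the two
# centroids have equal norm by symmetry, so no bias term is needed.
mu_pos = X[train_idx][y[train_idx] == +1].mean(axis=0)
mu_neg = X[train_idx][y[train_idx] == -1].mean(axis=0)
pred = np.where(X @ (mu_pos - mu_neg) > 0, +1, -1)

train_acc = (pred[train_idx] == y[train_idx]).mean()
test_acc = (pred[test_idx] == y[test_idx]).mean()
print(train_acc, test_acc)  # perfect on training, 0% on held-out points
```

Each training point's self-similarity of 1 dominates its own centroid and keeps the training split separable, but a held-out point sees only the off-diagonal similarities, where the opposite class wins, so every test prediction is flipped.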
As discussed in [8], anti-learning can be observed in publicly available microarray data used for the prediction of cancer outcomes, which can exhibit both the learning and the anti-learning mode, depending on the features selected. In this paper, however, we focus on synthetic data, which facilitates rigorous analysis. The aim is to demonstrate rigorously that anti-learning can occur, and that it is primarily a feature of the data, as it happens for many families of algorithms, universally across all settings of tunable parameters. In particular, we analyse the task of classifying the binary labeled vertices of a class symmetric