Category learning without labels: A simplicity approach

Emmanuel Minos Pothos (e.pothos@ed.ac.uk)
Department of Psychology, University of Edinburgh; 7 George Square Edinburgh, EH8 9JZ UK

Nick Chater (nick.chater@warwick.ac.uk)
Department of Psychology, University of Warwick; Coventry, CV4 7AL UK

Abstract

In an extensive research tradition in categorization, researchers have looked at how participants classify new objects into existing categories, or at the factors affecting how they learn to associate category labels with a set of objects. In this work, we examine a complementary aspect of categorization, that of the spontaneous classification of items into categories. In such cases, there is no “correct” category structure that the participants must infer. We argue that this second type of categorization, unsupervised categorization, can be seen as a form of perceptual organization. Thus, we take advantage of theoretical work in perceptual organization to use simplicity as a principle suitable for a model of unsupervised categorization. The model, applied directly to similarity ratings of the objects to be categorized, successfully predicted participants’ spontaneous classifications. Moreover, we report evidence that perceived similarity is affected by spontaneous classification; this supplements the already substantial literature on such effects, but in a categorization situation where the objects’ classification is not pre-determined.

There are several situations in real life where novel objects can be spontaneously organized into groups. Consider a set of pebbles taken from a beach, or cloud patterns on a particular day, or just meaningless shapes shown on a computer screen. This spontaneous classification can be appropriately labeled “unsupervised” because there are no “correct” categories the observer needs to infer.
By contrast, in supervised categorization, the learner (e.g., a child or someone learning a new language) has to infer what a category is by observing exemplars of the category and guessing their category membership. For example, a child could be corrected for calling an apple an orange; through a process of corrective feedback, she would eventually learn to associate the appropriate objects with the category label “orange”.

Supervised vs. unsupervised categorization

While there has been very little theoretical work on unsupervised categorization, this has not been the case for supervised categorization. Several models have been put forward, covering different intuitions about the cognitive mechanisms of supervised categorization. For example, in definitional accounts of concepts (e.g., Katz & Fodor, 1963), categories are characterized by necessary and sufficient conditions for an item to be a category member (see Pothos & Hahn, 2000, for a recent evaluation). In exemplar theories (e.g., Nosofsky, 1989), a concept is represented by a set of known instances of that concept; new instances are therefore assigned to different categories in terms of their similarity to the members of each category. In prototype theories, assignment is also determined by a similarity process, but this time to the prototype of each category, where a category prototype encapsulates some measure of central tendency across the exemplars of the category (e.g., Homa, Sterling, & Trepel, 1981).

Despite the technical sophistication of this research, it does not cover the whole scope of categorization processes. Models such as the exemplar model or the prototype model could never be used to predict how a person would spontaneously classify a set of items.
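The difference between the exemplar and prototype accounts described above can be made concrete with a minimal sketch. It is illustrative only and is not the implementation used in any of the cited models: the exponential-decay similarity function, the sensitivity parameter `c`, the two-dimensional stimulus coordinates, and all function names are assumptions introduced here. Note that both classifiers take labelled exemplars as input, which is why neither can say how unlabelled items would be grouped spontaneously.

```python
import math

def similarity(x, y, c=1.0):
    # Assumed form: similarity decays exponentially with Euclidean distance.
    return math.exp(-c * math.dist(x, y))

def classify_exemplar(item, labelled, c=1.0):
    # Exemplar account: sum similarity to every stored instance of each
    # category and pick the category with the largest total.
    totals = {}
    for point, label in labelled:
        totals[label] = totals.get(label, 0.0) + similarity(item, point, c)
    return max(totals, key=totals.get)

def classify_prototype(item, labelled, c=1.0):
    # Prototype account: compare the item only to each category's central
    # tendency (here, the mean of its exemplars on each dimension).
    groups = {}
    for point, label in labelled:
        groups.setdefault(label, []).append(point)
    prototypes = {
        label: tuple(sum(dim) / len(points) for dim in zip(*points))
        for label, points in groups.items()
    }
    return max(prototypes, key=lambda label: similarity(item, prototypes[label], c))

# Hypothetical labelled training items: (coordinates, category label).
labelled = [
    ((0.0, 0.0), "A"), ((0.0, 1.0), "A"), ((1.0, 0.0), "A"),
    ((4.0, 4.0), "B"), ((5.0, 4.0), "B"), ((4.0, 5.0), "B"),
]

print(classify_exemplar((0.5, 0.5), labelled))
print(classify_prototype((4.5, 4.5), labelled))
```

Removing the labels from `labelled` leaves both functions with nothing to compare against, which is the gap an unsupervised model has to fill.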
In fact, in an influential paper Murphy and Medin (1985) criticized models such as the above for failing to explain category coherence, that is, why certain groupings of items make better categories than others. For example, the categories of birds or cups are coherent, but a category consisting of dolphins born on Tuesdays, pink tulips within 20 miles of London, and the Eiffel Tower would be nonsensical. Given that the exemplar or prototype models could not explain such observations, Murphy and Medin concluded that they are inadequate models of categorization (and thus made a case for the importance of general knowledge in categorization). However, in the light of the present distinction between supervised and unsupervised categorization, it is not the case that the exemplar or the prototype models are inadequate in that they fail to capture general knowledge effects. Rather, category coherence is a problem of unsupervised categorization, as it relates to how categories originate, a process which, necessarily, cannot be guided by a ‘supervisor.’

To summarize this section, the distinction of categorization models into supervised and unsupervised serves the useful purpose of enabling a closer specifi-