Temporal and spatial integration of face, object, and scene features in occipito-temporal cortex

Thomas W. James a,c,*, Eunji Huh a,b, Sunah Kim c

a Department of Psychological and Brain Sciences, Indiana University, United States
b Department of Speech and Hearing Sciences, Indiana University, United States
c Cognitive Science Program, Program in Neuroscience, Indiana University, United States

Article info

Article history:
Accepted 26 July 2010
Available online 19 August 2010

Keywords: Face recognition; Feature integration; Fusiform face area; fMRI; Restricted viewing

Abstract

In three neuroimaging experiments, face, novel object, and building stimuli were compared under conditions of restricted (aperture) viewing and normal (whole) viewing. Aperture viewing restricted the view to a single face/object feature at a time, with the subjects able to move the aperture continuously through time to reveal different features. An analysis of the proportion of time spent viewing different features showed stereotypical exploration patterns for face, object, and building stimuli, and suggested that subjects constrained their viewing to the features most relevant for recognition. Aperture viewing showed much longer response times than whole viewing, due to sequential exploration of the relevant isolated features. An analysis of BOLD activation revealed face-selective activation with both whole viewing and aperture viewing in the left and right fusiform face areas (FFA). Aperture viewing showed strong and sustained activation throughout exploration, suggesting that aperture viewing recruited similar processes as whole viewing, but for a longer time period. Face-selective recruitment of the FFA with aperture viewing suggests that the FFA is involved in the integration of isolated features for the purpose of recognition.

© 2010 Elsevier Inc. All rights reserved.

1. Introduction

Recognition of objects in our environment is an essential cognitive operation, and one that is performed with relative ease by humans and other primates. Decades of research into the cognitive and neural mechanisms of object recognition suggest that objects can (and potentially must) be decomposed into features for identification to occur (Biederman, 1997; Marr, 1982; Poggio & Edelman, 1990; Schyns, Goldstone, & Thibaut, 1998). These features are alternatively called parts, components, geons, or primitives, and their exact nature remains a point for debate. By most definitions, object recognition involves binding together multiple features across space and time, by recruiting spatial and temporal integration processes. For visual object recognition, spatial feature integration has received more intense scrutiny than temporal integration. When studied in relation to human face recognition, spatial feature integration is often called holistic, configural, or relational processing (Gauthier & Tarr, 2002; Maurer, Le Grand, & Mondloch, 2002; Moscovitch, Winocur, & Behrmann, 1997; Tanaka & Farah, 1993).

Face recognition, which is a special case of object recognition, is notable due to its extreme efficiency in human observers in terms of the short time required for feature integration (Bruce & Young, 1986; Farah, Wilson, Drain, & Tanaka, 1998; Gauthier & Tarr, 2002; Maurer et al., 2002; McKone, Martini, & Nakayama, 2001; Moscovitch et al., 1997; Rhodes, 1988; Tanaka & Farah, 1993; Yovel & Duchaine, 2006). In fact, it has been suggested that holistic processing is characterized by the simultaneous integration of face features (Rossion, 2008).
The striking recruitment of spatial feature integration for visual face recognition is contrasted with more generic object recognition (such as discriminating a stapler from a telephone), which relies less on whole object processing, and instead relies more on what is referred to as feature-based, parts-based, sequential, componential, piecemeal, or analytic processing (Marsolek & Burgund, 1997; Tanaka & Farah, 1993; Yovel & Duchaine, 2006). In fact, in many cases, objects can be recognized based on a single diagnostic feature (Bruce & Young, 1986; Tanaka & Farah, 1993). In other cases, though, formation of a coherent object percept involves integration of multiple features in a time-consuming, sequential manner.

* Corresponding author. Address: 1101 E Tenth St., Bloomington, IN 47405, United States. Fax: +1 812 855 4691. E-mail address: thwjames@indiana.edu (T.W. James).
Brain and Cognition 74 (2010) 112–122. Journal homepage: www.elsevier.com/locate/b&c
0278-2626/$ - see front matter © 2010 Elsevier Inc. All rights reserved. doi:10.1016/j.bandc.2010.07.007

Faces and other objects are not only recognized visually, but can also be recognized using the sense of touch, especially when objects are actively explored haptically (Klatzky, Lederman, & Reed, 1987). Although both the visual and haptic systems are able to extract many properties of objects, to efficiently recognize them, both systems rely heavily on shape features (Kilgour & Lederman, 2002; Klatzky et al., 1987). Because the receptor and peripheral nerve