A Neural Field Model of Word Recognition

Andrew P. Valenti (andrew.valenti@tufts.edu)
Bradley Oosterveld (bradley.oosterveld@tufts.edu)
Matthias Scheutz (matthias.scheutz@tufts.edu)
Tufts University Human-Robot Interaction Laboratory, 200 Boston Ave., Medford, MA 02155

Abstract

We show how temporal and spatial information can be represented as stable patterns in a dynamical system. We describe a model in which category perception arises from the incremental recognition of temporal patterns from sequences of inputs; this is accomplished by decoding a pool of recurrently connected artificial neurons called a neural field. In an example application, we use these patterns to identify a set of words which share the word onset represented by the input sequence, consistent with the Marslen-Wilson COHORT model of word recognition. Similarly, we evaluate the extent to which information contained in the bottom-up sensory signal can be used to determine word boundaries. We suggest it is plausible that a neural field offers a naturalistic explanation of how perception arises in word processing.

Keywords: dynamic field theory, neural fields, connectionist model, word recognition, COHORT model, machine learning

Introduction

The brain encodes and processes sensory input acquired from the environment. Sensory input, regardless of modality, is encoded as spatiotemporal patterns, and a superior form of pattern processing has evolved in humans coinciding with the expansion of the neocortex. In this brain structure, several essential cognitive processes such as visual, auditory, and speech perception occur (Koch, 2004; Mattson, 2014). These processes include not only recognizing patterns, but also classifying them (Grossberg, 2005). During this processing, different sensory inputs which represent members of the same category are mapped to a singular representation for that category.
In speech processing, for example, all pronunciations of the phoneme "ə" are mapped to the same pattern, allowing for invariance in speech perception across multiple speakers (Kleinschmidt & Jaeger, 2015). Consistent with these hypotheses, our model uses patterns of activation to represent sequences of states in the context of perceiving words; we modeled these states as equilibria in a neural field.

The human neocortex consists of six layers of tissue containing approximately 10^10 neurons. Columns of tissue can be represented mathematically as neural fields, which form patterns of activation through interaction with each other (Amari, 1977). These interactions between fields generate patterns of activation in a fashion that is believed to be similar to how sensory information is represented in the human neocortex (Amari, 1977; Brady, 2012). These patterns represent an encoding of spatial and temporal information from the brain's sensory input stream.

Each neuron in a neural field F (Figure 1) is connected to each of its neighbors with weights that create an on-center off-surround activation pattern, where the closest neighbors provide a positive influence on activation, farther neighbors a negative influence, and the farthest no influence. If given no input and random initial conditions, the units of the field are guaranteed to quickly fall into a stable equilibrium state. Different equilibrium states of a field can be associated with different inputs, and thus the states of activation in a neural field can be used to store information by associating them with category labels (Valenti, Brady, Scheutz, Holcomb, & Pu, 2016).

In this work, we demonstrate a model of word perception using neural fields. Our research is not focused on the initial interaction between perceptual signals and the sensory apparatus.
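The on-center off-surround connectivity and relaxation to equilibrium described above can be illustrated with a minimal Amari-style field simulation. The sketch below discretizes the field, builds a difference-of-Gaussians (Mexican-hat) interaction kernel, and integrates the field equation until activation settles into a stable bump; all parameter values here are illustrative assumptions, not values from the model in the paper.

```python
import numpy as np

def mexican_hat(n, sigma_exc=2.0, sigma_inh=4.0, a_exc=1.5, a_inh=0.75):
    """On-center off-surround kernel: near neighbors excite, farther
    neighbors inhibit, and very distant units have negligible influence."""
    d = np.abs(np.arange(n)[:, None] - np.arange(n)[None, :])
    return (a_exc * np.exp(-d**2 / (2 * sigma_exc**2))
            - a_inh * np.exp(-d**2 / (2 * sigma_inh**2)))

def relax(field_input, n=100, h=-2.0, tau=10.0, dt=1.0, steps=500):
    """Integrate a discretized Amari field equation,
    tau * du/dt = -u + W f(u) + h + I, until activation settles."""
    W = mexican_hat(n)
    u = np.random.uniform(-0.1, 0.1, n)   # random initial conditions
    for _ in range(steps):
        f = 1.0 / (1.0 + np.exp(-u))      # sigmoidal firing rate
        u += (dt / tau) * (-u + W @ f + h + field_input)
    return u

# A localized input drives the field into a stable bump of activation
# centered on the stimulated units; elsewhere the field stays at rest.
I = np.zeros(100)
I[45:55] = 3.0
u = relax(I)
print(u.argmax())  # peak of the equilibrium pattern lies near the input
```

Because the excitatory center is narrower than the inhibitory surround, the field settles into a self-stabilizing localized peak rather than spreading activation, which is what allows distinct inputs to be associated with distinct equilibrium states.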
We are instead interested in the processing of the output of such apparatuses, and how it can be used to constrain the patterns of activation in higher-level cognitive processes, like lexical representation. Our model uses two neural fields, each representing a level of cognitive processing. Since sensory information unfolds over time as a continuous sequence, the input presented to the first neural field is a sequence of feature vectors which represent the letters of an artificial font. Sequences of output features from the first field representing letters are then presented as input to the second field, which identifies likely word boundaries and classifies these letter sequences as words.

There are many theories about how patterns of activation in the lexicon are formed once the sensory information has been received (Dahan & Magnuson, 2006). This work focuses on the Marslen-Wilson (1987) COHORT model, which theorizes that information contained in the bottom-up perceptual signal can be exploited to determine which lexical items should be activated, and can also be used to identify perceptual characteristics such as word boundaries. To explore the extent to which this information is sufficient, we have developed a model in which word onsets constrain the set of activated lexical entities: a given onset activates the lexical items that share it. Our model thus makes predictions similar to those of the COHORT model; the initial information contained in the sensory signal influences the activation of an initial word cohort, allowing the model to predict word boundaries at a higher level of processing.

Representing State with a Neural Field

Our model is composed of two layers of neural fields. The structure of a single layer is shown in Figure 1. An input vector (I) is fully connected to the neural field (F) by input