A Neural Field Model of Word Recognition

Andrew P. Valenti (andrew.valenti@tufts.edu)
Bradley Oosterveld (bradley.oosterveld@tufts.edu)
Matthias Scheutz (matthias.scheutz@tufts.edu)
Tufts University Human-Robot Interaction Laboratory, 200 Boston Ave., Medford, MA 02155

Abstract

We show how temporal and spatial information can be represented as stable patterns in a dynamical system. We describe a model in which category perception arises from the incremental recognition of temporal patterns from sequences of inputs; this is accomplished by decoding a pool of recurrently connected artificial neurons called a neural field. In an example application, we use these patterns to identify a set of words which share the word onset represented by the input sequence, consistent with the Marslen-Wilson COHORT model of word recognition. Similarly, we evaluate the extent to which information contained in the bottom-up sensory signal can be used to determine word boundaries. We suggest it is plausible that a neural field offers a naturalistic explanation of how perception arises in word processing.

Keywords: dynamic field theory, neural fields, connectionist model, word recognition, COHORT model, machine learning

Introduction

The brain encodes and processes sensory input acquired from the environment. Sensory input, regardless of modality, is encoded as spatiotemporal patterns, and a superior form of pattern processing has evolved in humans coinciding with the expansion of the neocortex. In this brain structure, several essential cognitive processes such as visual, auditory, and speech perception occur (Koch, 2004; Mattson, 2014). These processes include not only recognizing patterns, but also classifying them (Grossberg, 2005). During this processing, different sensory inputs which represent members of the same category are mapped to a singular representation for that category.
In speech processing, for example, all pronunciations of the phoneme "ə" are mapped to the same pattern, allowing for invariance in speech perception across multiple speakers (Kleinschmidt & Jaeger, 2015). Consistent with these hypotheses, our model uses patterns of activation to represent sequences of states in the context of perceiving words; we modeled these states as equilibria in a neural field.

The human neocortex consists of six layers of tissue containing approximately 10^10 neurons. Columns of tissue can be represented mathematically as neural fields, which form patterns of activation through interaction with each other (Amari, 1977). These interactions between fields generate patterns of activation in a fashion that is believed to be similar to how sensory information is represented in the human neocortex (Amari, 1977; Brady, 2012). These patterns represent an encoding of spatial and temporal information from the brain's sensory input stream.

Each neuron in a neural field F (Figure 1) is connected to each of its neighbors with weights that create an on-center off-surround activation pattern, where the closest neighbors provide a positive influence on activation, farther neighbors a negative influence, and the farthest no influence. If given no input and random initial conditions, the units of the field are guaranteed to quickly fall into a stable equilibrium state. Different equilibrium states of a field can be associated with different inputs, and thus the states of activation in a neural field can be used to store information by associating them with category labels (Valenti, Brady, Scheutz, Holcomb, & Pu, 2016).

In this work, we demonstrate a model of word perception using neural fields. Our research is not focused on the initial interaction between perceptual signals and the sensory apparatus.
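The on-center off-surround connectivity and relaxation to equilibrium described above can be illustrated with a minimal Amari-style field simulation. The sketch below discretizes the field, builds a difference-of-Gaussians (Mexican-hat) interaction kernel, and integrates the field equation until activation settles into a stable bump; all parameter values here are illustrative assumptions, not values from the model in the paper.

```python
import numpy as np

def mexican_hat(n, sigma_exc=2.0, sigma_inh=4.0, a_exc=1.5, a_inh=0.75):
    """On-center off-surround kernel: near neighbors excite, farther
    neighbors inhibit, and very distant units have negligible influence."""
    d = np.abs(np.arange(n)[:, None] - np.arange(n)[None, :])
    return (a_exc * np.exp(-d**2 / (2 * sigma_exc**2))
            - a_inh * np.exp(-d**2 / (2 * sigma_inh**2)))

def relax(field_input, n=100, h=-2.0, tau=10.0, dt=1.0, steps=500):
    """Integrate a discretized Amari field equation,
    tau * du/dt = -u + W f(u) + h + I, until activation settles."""
    W = mexican_hat(n)
    u = np.random.uniform(-0.1, 0.1, n)   # random initial conditions
    for _ in range(steps):
        f = 1.0 / (1.0 + np.exp(-u))      # sigmoidal firing rate
        u += (dt / tau) * (-u + W @ f + h + field_input)
    return u

# A localized input drives the field into a stable bump of activation
# centered on the stimulated units; elsewhere the field stays at rest.
I = np.zeros(100)
I[45:55] = 3.0
u = relax(I)
print(u.argmax())  # peak of the equilibrium pattern lies near the input
```

Because the excitatory center is narrower than the inhibitory surround, the field settles into a self-stabilizing localized peak rather than spreading activation, which is what allows distinct inputs to be associated with distinct equilibrium states.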
We are instead interested in the processing of the output of such apparatuses, and how it can be used to constrain the patterns of activation in higher-level cognitive processes, like lexical representation. Our model uses two neural fields, each representing a level of cognitive processing. Since sensory information unfolds over time as a continuous sequence, the input presented to the first neural field is a sequence of feature vectors which represent the letters of an artificial font. Sequences of output features from the first field representing letters are then presented as input to the second field, which identifies likely word boundaries and classifies these letter sequences as words.

There are many theories about how patterns of activation in the lexicon are formed once the sensory information has been received (Dahan & Magnuson, 2006). This work focuses on the Marslen-Wilson (1987) COHORT model, which theorizes that information contained in the bottom-up perceptual signal can be exploited to determine which lexical items should be activated, and can also be used to identify perceptual characteristics such as word boundaries. To explore the extent to which this information is sufficient, we have developed a model in which word onsets constrain the set of activated lexical entities: a given onset activates the lexical items that share it. Our model thus makes predictions similar to those of the COHORT model; the initial information contained in the sensory signal influences the activation of an initial word cohort, allowing the model to predict word boundaries at a higher level of processing.

Representing State with a Neural Field

Our model is composed of two layers of neural fields. The structure of a single layer is shown in Figure 1. An input vector (I) is fully connected to the neural field (F) by input