Connectionist Learning of Natural Language Lexical Phonotactics

(Running head: Connectionist Learning of Phonotactics)

Ivelin Stoianov and John Nerbonne

Dept. Alfa-informatica, Faculty of Letters, University of Groningen
P.O. Box 716, 9700 AS Groningen, The Netherlands
Phone: (31-50) 363-5936, E-mail: {stoianov,nerbonne}@let.rug.nl

Abstract

We present connectionist learning of natural language words and their phonetic regularities. The Neural Network (NN) model we employ for this problem is the Simple Recurrent Network, trained with the Backpropagation Through Time (BPTT) learning algorithm. During training, the network was given the task of predicting the next phoneme: it received one phoneme at each time step and kept information about the preceding phonemes of the word in a few context neurons. Among others, the phonotactics of the Dutch language was studied. The shortcomings of some similar previous implementations are explained and successfully overcome. The techniques we employed to achieve the much-improved error rates of 1.1% for monosyllabic words and 3.5% for multisyllabic ones include new methods for interpreting the network response, an evolutionary approach to training a set of networks, and the exploitation of word frequencies in training. Finally, an analysis of the phonotactic rules extracted by a trained network is presented.

Keywords: connectionism, neural networks, SRN, machine learning, linguistics, phonotactics,