Periodic and Sturmian languages

Lucian Ilie ⋆,1, Solomon Marcus 2, and Ion Petre ⋆⋆,3

1 Department of Computer Science, University of Western Ontario, N6A 5B7, London, ON, CANADA, ilie@csd.uwo.ca
2 Romanian Academy, Mathematics, Calea Victoriei 125, Bucharest, ROMANIA, solomon.marcus@imar.ro
3 Department of Computer Science, Åbo Akademi University, FIN-20520 Turku, FINLAND, ipetre@abo.fi

November 26, 2004

Abstract. Counting the number of distinct factors in the words of a language gives a measure of complexity for that language similar to the factor-complexity of infinite words. As for infinite words, we prove that this complexity function f(n) is either bounded or satisfies f(n) ≥ n+1. We call languages with bounded complexity periodic and languages with complexity f(n) = n+1 Sturmian. We describe the structure of periodic languages and we characterize the Sturmian languages as the sets of factors of (one- or two-way) infinite Sturmian words.

Keywords: infinite words, languages, factors, periodic, Sturmian

MSC: 68R15, 68Q45

1 Introduction

A function can be associated in a natural way with an infinite word by counting, for each n, the number of its factors of length n. Fundamental results concerning this function and its implications for the structure of the underlying infinite word were proved already by Morse and Hedlund [14] and by Coven and Hedlund [4]. The most interesting cases are those in which this function has very low complexity, that is, it is bounded or marginally unbounded.

On the other hand, a similar function can be considered for languages of finite words. Already Berstel [1] considered the population function of a language L, which associates, to every n, the number of words of length at most n in L. Counting the words of the same length is certainly a very basic notion in language theory, and it was studied intensively already in [8–10].
Many results were discovered (or rediscovered) later in [6, 7, 13, 15, 18], to cite a few; [7] gives a good account of the history of the most important results. The same problem was also investigated for L-systems; see [16].

⋆ Research supported in part by NSERC.
⋆⋆ Research supported by the Academy of Finland, project 203667.
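To make the complexity function from the abstract concrete, the following is a minimal sketch in Python that counts the distinct factors of length n occurring in the words of a finite language. The names factor_complexity and fibonacci_prefix are hypothetical helpers introduced only for this illustration, and using a finite prefix of the Fibonacci word as a stand-in for the set of factors of a Sturmian word is an assumption that can only hold for small n.

```python
def factor_complexity(language, n):
    """Count the distinct factors (contiguous substrings) of length n
    that occur in at least one word of a finite language."""
    factors = set()
    for word in language:
        # Slide a window of width n over the word; words shorter than n
        # contribute nothing because the range is empty.
        for i in range(len(word) - n + 1):
            factors.add(word[i:i + n])
    return len(factors)


def fibonacci_prefix(k):
    """Finite Fibonacci word s_k, with s_0 = "0", s_1 = "01",
    and s_k = s_{k-1} + s_{k-2} (hypothetical helper for the demo)."""
    a, b = "0", "01"
    for _ in range(k):
        a, b = b, b + a
    return a


if __name__ == "__main__":
    periodic = ["ab" * 10]             # one long word from (ab)*
    sturmian = [fibonacci_prefix(10)]  # Fibonacci word prefix of length 144
    for n in range(1, 6):
        print(n, factor_complexity(periodic, n), factor_complexity(sturmian, n))
```

For small n the first language should show a bounded count, while the Fibonacci prefix should show n+1 factors of length n. This agreement is only expected while n is small compared with the length of the prefix, since a finite word cannot contain more factors of length n than it has positions.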