International J. Soc. Sci. & Education 2014 Vol.4 Issue 2, ISSN: 2223-4934 E and 2227-393X Print

LINDSEI-TR: A New Spoken Corpus of Advanced Learners of English

By Abdurrahman Kilimci
Cukurova University, Faculty of Education, English Language Teaching Department, Balcalı, Adana, Turkey

Abstract

The aim of the present study is to describe the LINDSEI-TR, the Turkish component of the LINDSEI (the Louvain International Database of Spoken English Interlanguage), a project initiated to compile a corpus of spoken data produced by learners from varied mother tongue backgrounds (Gilquin et al., 2010). To this end, the study presents the aim, development, and design criteria of the corpus along with its quantitative and qualitative characteristics. The corpus is expected to be of value to researchers both in delineating the features of learners’ spoken interlanguage and in designing teaching materials to improve second language teaching and learning.

Keywords: Corpus linguistics, spoken corpus, interlanguage, second language teaching and learning

1. Introduction

Computer learner corpora, whose compilation began in the 1990s, have since become not only more developed but also more varied. Granger (2002:7) defines computer learner corpora as “electronic collections of authentic FL/SL textual data assembled according to explicit design criteria for a particular SLA/FLT purpose.” She adds that “they are encoded in a standardized and homogeneous way and documented as to their origin and provenance” (p. 7). Leech (1998) considers learner corpora “a useful resource for anyone wanting to find out how people learn languages and how they can be helped to learn them better” (p. xvi). Granger (2002) points out that the emergence of learner corpus research has brought together the two formerly distinct fields of corpus linguistics and foreign/second language research.
She also adds that it has gained theoretical and practical value by drawing on the main principles, tools and methods of corpus linguistics to provide better descriptions of learner language for a wide range of purposes in foreign/second language acquisition research and in the improvement of foreign language teaching.

The ICLE (International Corpus of Learner English) project, launched in 1990, led to the compilation of many sub-corpora representing different mother tongue backgrounds. Five years later, in 1995, a new project, the Louvain International Database of Spoken English Interlanguage (LINDSEI), was started. The project aimed to provide a spoken counterpart to the ICLE, containing oral data produced by advanced learners of English from several mother tongue backgrounds. The first component, compiled at the CECL, consisted of transcripts of 50 interviews, about 100,000 words in total, with learners of English whose mother tongue is French. Its completion prompted the international partners of the project to compile further components for other mother tongue backgrounds.

The aim of this article is to present the design criteria and compilation stages of the LINDSEI-TR, the Turkish sub-component of the LINDSEI corpus, and to provide information on its quantitative and qualitative characteristics. The rest of the article is structured as follows: the next section presents the purpose of the corpus, and the following sections report, respectively, on the design criteria and the compilation stages of the LINDSEI-TR from qualitative and quantitative perspectives. A final section highlights the potential of the LINDSEI-TR and concludes with implications that may open further avenues for research.