The 5th UAD TEFL International Conference (5th UTIC)
ISBN 978-623-6071-02-1
Eastparc Hotel, Yogyakarta, Indonesia, 2019
10.12928/utic.v2.5731.2019
http://seminar.uad.ac.id/index.php/utic
utic@uad.ac.id

Corpora: From theoretical linguistics to language teaching

Ikmi Nur Oktavianti
Universitas Ahmad Dahlan, Jl. Ringroad Selatan, Kragilan, Tamanan, Kec. Banguntapan, Bantul, Daerah Istimewa Yogyakarta 55191, Indonesia
ikmi.oktavianti@pbi.uad.ac.id

ARTICLE INFO

Article history
Received 07 December 2019
Revised 11 March 2020
Accepted 21 August 2020
Available Online 15 January 2021

Keywords
corpus
big data
language teaching

ABSTRACT

Corpora have gained popularity in linguistics over the past five decades, from the computerized storage of English in the Survey of English Usage in 1959 to the ongoing development of the Corpus of Contemporary American English. Because of the huge amount of actual language data compiled in corpora, many linguists and language teachers working with English have benefited from them in linguistic research and teaching practice. Today there are innumerable online English corpora recording data from various genres, modes, and regions, as well as corpus tools for analyzing self-compiled corpora. The massive development of corpora, however, has not been widely discussed among English language researchers and practitioners in Indonesia, let alone in English language teaching. Although linguistics and language teaching are closely related and inseparable fields, the corpus as a concept and product of linguistics seems to be ignored or even avoided. This paper therefore reviews the nature of corpora and how they are used to assist linguistic analysis. More importantly, it discusses another possible application of corpora, namely their use in language teaching. Considering the nature and benefits of corpora, it is important to promote their use to enhance English language teaching and learning, either directly in the classroom or indirectly in materials development.

This is an open access article under the CC-BY-SA license.

1. Introduction

Corpus (pl. corpora) derives from a Latin word that literally means ‘body’ (Oxford Dictionary of English, 2014). According to the Oxford Dictionary of English (2014), the term corpus began to refer to a collection of texts in the early 18th century. In linguistics, a corpus is defined as a collection of texts stored digitally to assist language studies (Lüdeling & Kytö, 2009; McEnery & Hardie, 2012; McEnery & Wilson, 2001; O’Keeffe & McCarthy, 2012; O’Keeffe, McCarthy, & Carter, 2007; Timmis, 2015). A corpus is simply a huge collection of texts that can be studied further; it is not a theory of language, but it does affect our way of thinking about language and language teaching (McCarthy, 2004).

Corpora are widely applied in linguistic analysis because they provide empirical evidence for the description of language structure and language use. For decades, their popularity in linguistics has risen sharply, along with the building of new corpora for specific purposes. Whereas language researchers have experienced the advantages of using corpora to improve the description of language, the use of corpora in language teaching, especially in EFL classrooms, is still uncommon. In Indonesia, corpora seem to be ignored, judging from the minimal discussion of the topic. One reason is that corpora, and learning how to use them, are seldom part of teacher training courses. Consequently, teachers lack the skills needed to use corpora as native speaker consultants (Granath, 2009).