Boulton, A. (in press). Data-driven learning: In conversation with Alex Boulton. In P. Crosthwaite (Ed.), Corpora for language learning: Bridging the research-practice divide. Routledge.

1. Data-driven learning: In conversation with Alex Boulton

Alex Boulton
https://orcid.org/0000-0001-6306-8158

Abstract

Data-driven learning (DDL) involves using the tools and techniques of corpus linguistics for teaching and learning second or foreign languages. Since its first appearance over 30 years ago, hundreds of papers have sought to evaluate different aspects of its use. In this chapter, Peter Crosthwaite talks to Alex Boulton about his syntheses of research in this area, and how researchers can make DDL more accessible.

1. Could you provide examples of how DDL can give learners more direct access to data?

The basic idea behind data-driven learning or DDL is that instead of being taught (by teachers, textbooks, online courses, etc.), learners take on a more active role. They do this not by learning ‘rules’ but by looking at how language is actually used. This is typically in a relatively large collection of texts which we call a corpus. A corpus might be enormous (billions of words aiming to represent the entire English language, for example) or much smaller, for learners with specific needs such as academic writing within their discipline. The important point is that the texts should be representative of what the learners are interested in.

The tool we use to search a corpus is called a concordancer. Its main function is to find occurrences of the word or phrase you’re looking for and show them in context so you can see for yourself how they’re used. Concordancers can also show how frequent a word is, what other words it occurs with, which part of the corpus it occurs most frequently in, and much more besides. So learners don’t get ‘answers’ to their questions; they see language in use and work out the answers for themselves.
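To make the main function of a concordancer concrete, here is a minimal sketch in Python of a keyword-in-context search. It is an illustration only, not the implementation of any particular concordancer: the function name `concordance`, the `width` parameter and the three-sentence toy corpus are all assumptions made up for this example.

```python
import re

def concordance(texts, query, width=30):
    """Return keyword-in-context lines for every occurrence of `query`."""
    # Whole-word, case-insensitive match of the search word or phrase.
    pattern = re.compile(r"\b" + re.escape(query) + r"\b", re.IGNORECASE)
    lines = []
    for text in texts:
        for match in pattern.finditer(text):
            # Keep up to `width` characters of context on each side.
            left = text[max(0, match.start() - width):match.start()]
            right = text[match.end():match.end() + width]
            # Right-align the left context so the hits line up in a column.
            lines.append(f"{left:>{width}} [{match.group()}] {right}")
    return lines

# Toy corpus invented for illustration; a real corpus would be far larger.
toy_corpus = [
    "I have just finished a book by JK Rowling and loved it.",
    "She bought a book by a local author at the market.",
    "The cover of the book was torn.",
]

for line in concordance(toy_corpus, "book by"):
    print(line)
```

Searching this toy corpus for "book by" returns two aligned lines, while "book of" returns none. With a corpus of realistic size, that kind of contrast is what a learner would inspect to reach their own conclusion.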
It’s kind of a game really: they have to think quite hard to come to their own conclusions, but that’s conducive to learning compared to ‘being taught’.

To give a simple example, do you say ‘a book by JK Rowling’ or ‘a book of JK Rowling’? In French, you’d tend to say ‘de’, which is often equivalent to ‘of’, so it’s not obvious for my students. It’s the kind of thing that’s difficult to find in a dictionary, grammar book or usage manual, and you’ll probably forget if the teacher simply tells you.

Or, what’s the difference between ‘big’ and ‘large’? Dictionary definitions are pretty circular on this, defining each in terms of the other. Yes, they are often synonymous, but with a corpus you can look for differences too. Among other things, ‘big’ tends to be relatively more common in speech while ‘large’ is more frequent in academic texts. Also, ‘big’ is more likely to describe things which are subjectively important or idiomatic, such as ‘big brother’, ‘big idea’ or ‘big mistake’, while ‘large’ refers to things that are measurable, such as ‘large extent’, ‘large amounts’ or ‘large portions’. Which is not to say that the other combinations are impossible, just less usual.

DDL takes us away from ‘what’s possible’ according to grammar rules and towards what is more normal usage. Language isn’t just a question of correct/incorrect but of what’s normal and appropriate in different linguistic and non-linguistic contexts. Frequency of words and patterns can be a very useful guide in helping to decide what’s worth learning at this point in your trajectory as a learner, for things that interest you as an individual.

The idea came about in the 1980s and 1990s, when the most usual contact with the language was in the classroom. Now of course with the internet it may feel less relevant, but the fundamental idea is still valid: looking at lots of examples to arrive at your own conclusions.
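The ‘big’ versus ‘large’ comparison described earlier rests on counting which words occur with each adjective. A rough sketch of that counting in Python might look as follows. The helper `right_collocates` and the toy sentences are hypothetical, invented for this illustration; the counts they produce say nothing about real English, which would require a proper corpus.

```python
import re
from collections import Counter

def right_collocates(texts, node):
    """Count the word immediately following each occurrence of `node`."""
    counts = Counter()
    for text in texts:
        # Crude tokenisation: lowercase runs of letters and apostrophes.
        tokens = re.findall(r"[a-z']+", text.lower())
        # Pair each token with the one after it, within a single text.
        for current, following in zip(tokens, tokens[1:]):
            if current == node:
                counts[following] += 1
    return counts

# Toy sentences invented for illustration only.
toy_corpus = [
    "It was a big mistake to ignore such a large amount of data.",
    "Her big idea needed a large amount of funding.",
    "That was a big mistake, to a large extent.",
]

print(right_collocates(toy_corpus, "big").most_common(2))
print(right_collocates(toy_corpus, "large").most_common(2))
```

Ranking the following words by frequency, rather than asking whether a combination is possible at all, mirrors the shift from ‘what’s possible’ to what is normal usage.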