‘Keep out of reach of children!’ Introducing the
Corpus of Product Information (CoPI) and its
potential for corpus-based genre teaching
Karin Puga¹ and Sandra Götz¹
Abstract
In this paper, we introduce the language-pedagogic potential of the Corpus
of Product Information (CoPI). The corpus is XML-annotated and contains
about 100,000 words of product descriptions of health products, cleaning
supplies and products for beauty and personal care, divided into three textual
moves: (1) overview, (2) directions and (3) warnings. First, we describe
the data collection, corpus design and annotation scheme of the corpus,
and then we present the findings of an analysis of CoPI’s most frequent
words, clusters and its type–token ratio. Finally, we show its potential for
language-pedagogic purposes and suggest how the CoPI analyses can be
used for paper- and computer-based DDL activities that foster corpus-based
genre teaching in the advanced EFL classroom. We conclude this paper by
summarising the outcomes of a first case study we conducted to test these
activities with advanced learners of English.
Keywords: corpus-based teaching, Corpus of Product Information, data-driven learning, DYO-Corpus, genre analysis, genre-based teaching.
1. Introduction
Corpus-based investigations have clearly demonstrated that language varies
immensely across genres, modes and varieties (e.g., Biber, 1988; and
Biber et al., 1999). Generally, in linguistics, genre is defined as a recognisable
cultural and interpersonal event making use of language that fulfils a number
of communicative purposes (see Bhatia, 1993; and Swales, 1990). In the
context of this study, the term ‘register’ is defined as all linguistic patterns
¹ Department of English – English Linguistics, Justus Liebig University, Otto-Behaghel-Str. 10 B, 35394 Giessen, Germany.
Correspondence to: Karin Puga, e-mail: Karin.Puga@anglistik.uni-giessen.de
Corpora 2017 Vol. 12 (3): 393–423
DOI: 10.3366/cor.2017.0126
© Edinburgh University Press
www.euppublishing.com/cor