Kyungpook National University
e-mail: nki@knu.ac.kr, sjmano27@naver.com, haeyun.jung.22@gmail.com, c-juni@hanmail.net

Abstract

This study analyzes web crawling corpora to examine how many of the neologisms coined each year die out and how many endure. It also considers what the results imply for the inclusion of these neologisms in the dictionary. The Korean government initiated the investigation into neologisms in 1992 and has supervised the research project ever since. Each year, some 400 to 500 coinages that meet defined criteria are extracted, compiled, and published in the form of a glossary. This paper focuses on the years 2005 and 2006, for which 408 and 530 new words were recorded respectively, that is, 938 new words in total. The study then analyzes the changes in usage that these neologisms have undergone in the Korean mass media over the past decade. Quantitatively, the investigation shows that 27% of these neologisms have been in consistent use for the last ten years.

Keywords: neologisms; usage changes; web crawling corpus; frequency; news articles

1 Introduction

The Korean New Words Investigation Project was implemented to collect and record data on the contemporary Korean language. Surveys have been conducted under this project since 1992. Our research consists of studying the new coinages that appear in the mass media within a given year. Every year we collect about 400 to 500 neologisms and compile them into a glossary published under the title New Words of [year]. In this study, we present how our investigation into neologisms is conducted and discuss methodological and procedural issues. Finally, we propose how the results of such an investigation can be used to supplement dictionary entries.

A number of questions have been raised, which form the basis for our study.
First of all, how many of the neologisms collected each year die out and how many endure? Second, as we examine the changes in neologism usage, what are the criteria for their extinction and survival? Third, what are the significance and limitations of frequency and statistical distribution when investigating fluctuations in neologism usage, and how can these limitations be overcome? Finally, how can the results of such investigations be used when including neologisms in the dictionary?

In order to address these questions, we focus on the neologisms extracted in 2005 and 2006 and follow their evolution over a period of about ten years.

Object of study: neologisms of 2005 (408 words) and 2006 (530 words), i.e., 938 words in total
Time frame: from 2005 to date (a period of about ten years)

2 Object and Methodology

The neologisms we investigate in this study are restricted to ‘lexical neologisms’ (i.e., new word forms). The New Words Investigation System allows us to automatically extract the new word forms that appear on the Web, but it poses a practical problem: it cannot automatically identify ‘semantic neologisms’ (i.e., existing word forms that assume a new meaning) or ‘formal neologisms’ (i.e., existing word forms that assume a new grammatical function) (Renouf 2013). There are several points to consider when investigating the changes in usage of neologisms over the past decade.

The Life and Death of Neologisms: On What Basis Shall We Include Neologisms in the Dictionary?
Kilim Nam, Soojin Lee, Hae-Yun Jung, Jun Choi