Mapping Word Knowledge in Japanese: Coding Japanese Word Associations

Terry Joyce
Large-Scale Knowledge Resources COE, Tokyo Institute of Technology, Tokyo, Japan
joyce.t.aa@m.titech.ac.jp

Abstract

This project is investigating lexical knowledge by mapping out the associative structures that exist for Japanese words. Specifically, the project is (1) constructing a large-scale database of Japanese word associations, (2) utilizing the association database to create lexical association network maps as a means of capturing association patterns, and (3) exploring applications of the database and the maps. This paper focuses on describing the coding of the word association responses collected so far, in preparation for the release of Version 1 of the Japanese Word Association Database. The paper also introduces a study conducted to explore the application of lexical maps to Japanese language instruction.

Index Terms: lexical knowledge, Japanese word association database, lexical association network maps, bilingual lexical maps

1. Introduction

Reflecting the fact that association is a basic mechanism of human cognition [1][2], there has been considerable interest within various areas of cognitive science, such as psychology, artificial intelligence and natural language processing, in identifying and understanding the structured relations that exist between concepts by mapping out how concepts are represented in the rich networks of associations that exist between words [3][4][5][6][7][8][9]. In a similar vein, this project seeks to investigate the nature of lexical knowledge in Japanese by mapping out the complex networks of associations that exist for basic Japanese vocabulary, as captured through large-scale free word association surveys [10][11][12][13][14]. This paper reports on the on-going construction of a large-scale database of Japanese word associations, based on responses collected from two questionnaire surveys and from a web-based survey.
More specifically, Section 2 focuses on describing the coding of the word association responses collected for a random sample of 2,100 vocabulary items from the present database corpus of 5,000 items, which will be made publicly available as Version 1 of the Japanese Word Association Database. Section 2 also touches on the development of a web-based version of the word association survey, launched as an effective way of collecting the large-scale quantities of responses required for the database. Section 3 presents an example of the lexical association network maps and an example of how analyzing the types of association relationships elicited from related words can provide insights into their conceptual structures. Finally, Section 4 introduces a study conducted to explore the application of lexical maps to Japanese language instruction.

2. Constructing the database

This project is constructing a Japanese word association database that is large-scale in terms of both the number of words surveyed and the number of association responses collected.

2.1. Survey corpus of basic Japanese vocabulary

A survey corpus of 5,000 basic Japanese kanji and words was compiled [10][12] by identifying the items common to three reference sources of basic vocabulary for Japanese language education.

2.2. Questionnaire surveys

The majority of the word association responses collected to date have come from two large questionnaire surveys. The first survey collected up to 50 word association responses for a random sample of 2,000 items, while the second survey collected at least ten responses for the remaining 3,000 items in the survey corpus.

2.2.1. Method

Participants: Native Japanese university students (N = 1,481; 929 males and 552 females; average age 19.03, SD = 0.97) participated in the two surveys on a volunteer basis.

Questionnaire sheets: For both surveys, target items were divided into lists of 100 items.
Each survey questionnaire consisted of 10 pages with 10 items printed per page, arranged as a centered column of words, each followed by an underlined blank space for the association response (e.g., 本 ______). The instructions asked the participants to look at each printed item and to write down in the blank space the first semantically-related Japanese word that came to mind.

2.2.2. Results

From the two traditional paper questionnaire surveys, approximately 148,100 word association responses were collected for a corpus of 5,000 basic Japanese kanji and words.

2.3. Version 1 of Japanese Word Association Database

Across the two questionnaire surveys, 2,100 items drawn at random from the survey corpus were each presented to up to 50 respondents to elicit word association responses (a list of these items is available at http://www.valdes.titech.ac.jp/~terry/jwad.html). The word association responses to these items are being processed and coded in order to make them publicly available as Version 1 of the Japanese Word Association Database.
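
As an illustration of the questionnaire layout described in Section 2.2.1 (target items divided into lists of 100, each printed as 10 pages of 10 items), the partitioning can be sketched as follows. This is only a minimal sketch and not the procedure actually used in the project: the function name `build_questionnaires`, the placeholder item labels, and the simple in-order assignment of items to lists are all assumptions made for the example, since the text does not state how items were ordered or assigned.

```python
def build_questionnaires(items, list_size=100, page_size=10):
    """Split target items into questionnaires (lists) of `list_size` items,
    each laid out as consecutive pages of `page_size` items.

    Hypothetical helper: items are assigned in the order given, which is an
    assumption; the paper does not describe the assignment order.
    """
    questionnaires = []
    for start in range(0, len(items), list_size):
        target_list = items[start:start + list_size]
        # Break one 100-item list into pages of 10 items each.
        pages = [
            target_list[i:i + page_size]
            for i in range(0, len(target_list), page_size)
        ]
        questionnaires.append(pages)
    return questionnaires


# Placeholder labels standing in for the 2,000 items of the first survey.
items = [f"item_{n:04d}" for n in range(1, 2001)]
questionnaires = build_questionnaires(items)

# Number of questionnaires, pages per questionnaire, items per page.
print(len(questionnaires), len(questionnaires[0]), len(questionnaires[0][0]))
```

Under these assumptions, the 2,000 items of the first survey yield 20 questionnaires of 10 pages each, and the same function applied to the remaining 3,000 items would yield 30 more.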