Magdalena Zawisławska (Warsaw University)
Magdalena Derwojedowa (Warsaw University)
Jadwiga Linde-Usiekniewicz (Warsaw University)

A FrameNet for Polish

Since its inception, FrameNet (Fillmore, 1976, 1977; Fillmore and Atkins, 1992; Fillmore et al., 2003) has served as a template and a benchmark for comparable databases for other languages (Erk et al., 2003; Subirats and Petruck, 2003; Ohara et al., 2004). One of them is the project RAMKI (Rygorystyczna aplikacja metodologii kognitywno-interpretacyjnej (ram interpretacyjnych) do opisu polszczyzny, i.e. Rigorous Application of the Cognitive-Interpretational Methodology (Interpretative Frames) for Polish Language Description). The aim of the project is to describe 200 Polish verbs within the framework of frame semantics. The verbs were chosen according to two criteria: frequency in a large (ca. 100 million words) Corpus of Polish, and lexical equivalence to lexical units already described in other (i.e. Berkeley, German and Spanish) FrameNets. In the next step, a preliminary sense division was carried out by a trained linguist with the help of selected dictionaries (e.g. CHODZIĆ 1. ‘(of a human or an animal) to walk’; 2. ‘(of a machine) to work’; 3. ‘to wear sth’, etc.).

As in the Berkeley FrameNet (cf. Baker et al. 2000), each lexical unit entry is tagged with its surface syntactic properties and the interpretative frame activated by the lexeme, and is accompanied by corpus examples tagged both syntactically (cf. Świdziński 1996) and semantically, with the appropriate frame elements. New frames will be proposed for verbs that do not match any existing frame. The lexicographer works with a dedicated computer application that helps to select examples from the corpus and to annotate them syntactically and semantically. The application also stores the data in a database to facilitate searching by various criteria, such as sentence patterns, frames and frame elements, and exports the data for web presentation.
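The entry structure described above (a sense, a surface syntactic pattern, an interpretative frame, and corpus examples tagged with frame elements and syntactic tags) might be modeled roughly as follows. This is only a minimal sketch and not the RAMKI application's actual schema: the class and field names, the pattern notation, the frame names and frame-element labels (borrowed from Berkeley-style naming), and the two example sentences are all illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AnnotatedSpan:
    text: str           # annotated stretch of the example sentence
    frame_element: str  # semantic tag (frame element); label is an assumption
    syntactic_tag: str  # syntactic tag; the notation is hypothetical


@dataclass
class Example:
    sentence: str
    spans: List[AnnotatedSpan] = field(default_factory=list)


@dataclass
class LexicalUnit:
    lemma: str
    sense: str
    frame: str              # interpretative frame activated by the lexeme
    syntactic_pattern: str  # surface syntactic pattern (hypothetical notation)
    examples: List[Example] = field(default_factory=list)


def find_by_frame_element(units: List[LexicalUnit], frame_element: str) -> List[LexicalUnit]:
    """Return the units that have at least one example tagged with frame_element."""
    return [
        lu for lu in units
        if any(span.frame_element == frame_element
               for ex in lu.examples for span in ex.spans)
    ]


# Two senses of CHODZIĆ from the preliminary sense division; the frame
# assignments and the annotated sentences are invented for illustration.
units = [
    LexicalUnit(
        lemma="CHODZIĆ",
        sense="(of a human or an animal) to walk",
        frame="Self_motion",
        syntactic_pattern="NP[nom] V PP[po+loc]",
        examples=[Example(
            sentence="Jan chodzi po parku.",
            spans=[AnnotatedSpan("Jan", "Self_mover", "NP[nom]"),
                   AnnotatedSpan("po parku", "Area", "PP[po+loc]")],
        )],
    ),
    LexicalUnit(
        lemma="CHODZIĆ",
        sense="(of a machine) to work",
        frame="Being_operational",
        syntactic_pattern="NP[nom] V",
        examples=[Example(
            sentence="Zegar chodzi.",
            spans=[AnnotatedSpan("Zegar", "Object", "NP[nom]")],
        )],
    ),
]

walking = find_by_frame_element(units, "Self_mover")
```

Keeping each sense as a separate unit with its own frame and pattern mirrors the sense division step, and a search by frame element then reduces to a simple filter over the annotated spans.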
The lexicographic procedure consists of six steps: 1. verifying whether the preliminary sense division reflects the senses found in the data; 2. selecting the most suitable examples for each sense from the corpus; 3. assigning the surface syntactic pattern; 4. labelling within the frame; 5. tagging the most typical corpus examples with frame elements; and 6. tagging them with syntactic tags.

Besides being a pilot Polish FrameNet for verbs, the project has several other aspects. As the original FrameNet was designed for English, the Polish project RAMKI is the first attempt to apply frame semantics to a language with verbal aspect, free word order and rich verb derivation, in which prefixes strongly modify the sense (e.g. GNIĆ/ZGNIĆ ‘to rot’, NADGNIĆ ‘to start to rot’, PRZEGNIĆ ‘to rot through’, WYGNIĆ ‘to rot away (in a gradual process)’, PODGNIĆ ‘to rot from the bottom or to rot in part’). We assume that the frame methodology will enable us to describe verb polysemy in a more detailed way, to show the difference between the members of so-called aspectual pairs, and to verify the existing descriptions of the syntactic patterns of verbs (cf. Linde-Usiekniewicz et al. 2008).

References

C. F. Baker, J. Ch. Fillmore, and B. Cronin. Structure of the FrameNet database. The International Journal of Lexicography, 16(3):281–296, 2000.

K. Erk, A. Kowalski, S. Padó, and M. Pinkal. Towards a resource for lexical semantics: A large German corpus with extensive semantic annotation. In Proceedings of ACL 2003. Sapporo, 2003.