Ontological and Terminological Commitments and the Discourse of Specialist Communities

Khurshid Ahmad 1, Maria Teresa Musacchio 2, Giuseppe Palumbo 3

1 School of Computing and Statistics, Trinity College Dublin, Dublin 2, Ireland
kahmad@cs.tcd.ie
2 Dipartimento di Lingue e Letterature AngloGermaniche e Slave, Università di Padova, Via Beldomandi 1, 35137 Padova, Italy
mt.musacchio@unipd.it
3 Dipartimento di Scienze del Linguaggio e della Cultura, Università di Modena e Reggio Emilia, Largo Sant’Eufemia 19, 41100 Modena, Italy
palumbo.giuseppe@unimo.it

Abstract

The paper presents a corpus-based study of ontological and terminological commitments in the discourse of specialist communities. The corpus analysed contains the lectures delivered by Nobel Prize winners in Physics and Economics. The analysis focuses on (a) the collocational use of automatically identified domain-specific terms and (b) a description of metadiscourse in the lectures. Candidate terms are extracted based on the z-score of frequency and weirdness. Compounds comprising these candidate terms are then identified using the ontology representation system Protégé. The method is then replicated to complete the analysis with an investigation of metadiscourse markers that signal how writers project themselves into their work.

1. Introduction

Discourse analysis now plays a major role in the research, teaching and learning of the language of specialist domains. There is increasing evidence that discourse analysts are moving from the traditional intuitive and hermeneutical analysis, based on hand-selected key-words and sentences in carefully selected texts (see, for example, Sinclair and Coulthard 1975), to a more empirical corpus-based study of spoken (Sinclair 1992) and written (Stubbs 1996) discourse.
The discourse of an author, and its interpretation by the readers of that discourse, is regarded as an ‘ideological question’: what were the intellectual commitments of the author and those of the readers? What is it that the author does not quite believe in, and what will reinforce or contradict the beliefs of the readers? These questions are asked frequently in literary criticism and have recently been asked in the applied linguistics literature: ‘how writers project themselves into their work to manage their communicative intent’ (Hyland 2000). Researchers interested in the intellectual commitment of specialists are increasingly to be found in various areas of computing, including ontological engineering, the semantic web and information extraction (Sowa 2000, Maedche 2002). In this paper we explore ways of finding (a) the ontological commitment, primarily through a discourse analysis of the specialist terms coined and used by specialists, and (b) how this commitment may or may not be projected in texts, through a metadiscourse analysis of the so-called discourse ‘markers’ that help an author to foreground or background his or her claims and to cite sympathetic members of his or her own community.

This intellectual commitment is an ontological commitment to use domain-specific vocabulary, or ‘terminology’, that is consistent with the theory specified by an ontology (Gruber 1993). In any specific domain, the objects that can be represented are referred to as the universe of discourse. Thus, those who subscribe to a given ontology also make a terminological commitment to agree on the meaning of any term in that ontology (Kashyap 2004). In terminology, a list of concepts constituting a domain is established, the concepts are related logically and ontologically to one another, and a designation (that is, a term) is assigned to each concept in the domain (Cabré 1998).
Finally, in discourse analysis a universe of discourse is a set of discourse elements that pertain to the beliefs, conventions, and knowledge shared by members of a sociolinguistic community. As can be seen, there is a clear overlap in the terms and concepts, derived from philosophy, that are used in ontological engineering, terminology and discourse analysis. We use this interdisciplinary common ground to analyse ontological and terminological commitments in the discourse of specialist communities.

2. Method and motivation

LSP studies of terminology, and it has to be said many a computational study, are rooted deeply in the so-called Platonist tradition. For a Platonist, intellectual commitment in one sense relates to a belief in a set of abstract objects: whatever exists in the world of physics, economics, chemistry and so on exists in the abstract and is transcendental to human sensory perception. Specialist knowledge comprises a set of sentences written by specialists reflecting the ‘reality of [the] abstract objects’ (Orenstein 1977). In discourse analysis the focus is on analysing a set of sample sentences from one or more texts, even corpora, and examining the communicative intent of the author(s). In computing, the focus again is on identifying a set of true sentences, based usually on the intuition of the computing researchers and true in the sense of Orenstein above, and on the subsequent conversion of the true sentences into statements of logic. But there are criticisms of the Platonist approach in both LSP studies in