Fifth International Conference on Scientometrics & Informetrics, River Forest (Chicago), Illinois, USA, June 7-10, 1995.

HOW TO DO THINGS WITH TERMS IN INFORMETRICS: TERMINOLOGICAL VARIATION AND STABILIZATION AS SCIENCE WATCH INDICATORS 1

Xavier Polanco, Luc Grivel, Jean Royauté
Institut de l'Information Scientifique et Technique (INIST)
Centre National de la Recherche Scientifique (CNRS)

Abstract: Until now, computational linguistics (CL) has not found its place in informetrics (I). In this paper, we report on current work that brings I and CL together and applies this new “couple” to the development of an information analysis device designed to derive indicators from the very words used by researchers in scientific and technical documents. Our linguistic approach consists in identifying, in a corpus, the actual variant (for example, dependence at near zero temperature) of a thesaurus term (temperature dependence). We apply a co-word analysis technique to these terms in order to bring out a “terminological” network that includes both non-variant terms and variant terms; the latter would remain undetected without linguistic processing. We hypothesize that the presence or absence of terminological variation can be used as an indicator of knowledge at the level of written language. Finally, we propose two informetric-linguistic indexes, denoted by symbols derived from French: VAR (variation) and FIG ("figement" in French).

1. INTRODUCTION.

We report on our current work on the coupling and application of informetric and computational linguistic techniques to develop an information analysis device. This device is designed to derive, from the very words used by researchers in scientific and technical documents (abstracts or full text), indicators that may signal and measure "changes in aspects of science". According to Elkana et al. (Ref. 5), "Science indicators are measures of changes in aspects of sciences".
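The idea of linking a variant such as "dependence at near zero temperature" to the thesaurus term "temperature dependence", mentioned in the abstract, can be illustrated with a small Python sketch. It is not the linguistic processing used in this work, only a simplified approximation: it assumes that an occurrence is any short window of text that contains all the words of the term, in any order, with a limited number of inserted words. The function name find_term_occurrences, the parameter max_gap and the sample sentence are hypothetical and chosen for illustration only.

```python
import re


def find_term_occurrences(term, text, max_gap=4):
    """Find normal and variant occurrences of a thesaurus term in a text.

    Simplifying assumption (not a parser): an occurrence is a window that
    starts and ends on a word of the term, contains every word of the term
    in any order, and holds at most `max_gap` additional words.
    """
    words = term.lower().split()
    tokens = re.findall(r"[a-z]+", text.lower())
    hits = []
    i = 0
    while i < len(tokens):
        match_end = None
        if tokens[i] in words:
            longest = min(i + len(words) + max_gap, len(tokens))
            # Try the shortest window first so that matches stay tight.
            for end in range(i + len(words), longest + 1):
                span = tokens[i:end]
                if span[-1] in words and all(w in span for w in words):
                    match_end = end
                    break
        if match_end is None:
            i += 1
            continue
        span = tokens[i:match_end]
        # The normal form is the term exactly as listed in the thesaurus.
        kind = "normal" if span == words else "variant"
        hits.append((" ".join(span), kind))
        i = match_end  # continue after the match to avoid overlapping hits
    return hits


sentence = ("We study the dependence at near zero temperature and "
            "the temperature dependence of resistivity.")
print(find_term_occurrences("temperature dependence", sentence))
# [('dependence at near zero temperature', 'variant'), ('temperature dependence', 'normal')]
```

Both occurrences are attached to the same thesaurus entry, so a later counting or co-word step could treat them as one term. A window-based match of this kind can over-generate, since it ignores syntax.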
In this article, the term "Informetrics" covers "both sciento- and biblio-metrics, and implicitly both documentary and electronic forms of information" (Ref. 2). By computational linguistics, we mean computerized natural language processing (NLP). We adopt a pragmatic approach, focusing on practical, applicable results and on the ability to process large collections of real linguistic data. Our computational linguistic approach (Ref. 12, 18) uses partial parsing to identify thesaurus terms either in their normal form (temperature dependence) or in variant forms (dependence at near zero temperature). This phenomenon is called "variation". All identified variants are linked to the thesaurus terms in their normal forms. SDOC applies the co-word analysis method to this collection of terms detected in full text.

2. OBJECTIVES AND HYPOTHESIS.

This section first presents the three types of objectives (technical, conceptual and pragmatic) that we want to achieve. Second, we put forward the hypothesis that the linguistic phenomena of terminological variation and stabilization are indicators that can be used in the strategic analysis of STI, especially when analysing information contained in the title, abstract or full text of documents.

1 Published in Proceedings of the Fifth International Conference on Scientometrics and Informetrics, River Forest (Chicago), Illinois, USA, June 7-10, 1995. Edited by M. E. D. Koenig and A. Bookstein. Medford, NJ: Learned Information Inc., 1995, p. 435-444.

2.1 Technical, Conceptual and Pragmatic Objectives.