Vanda Broughton, University College London

Language Related Problems in the Construction of Faceted Terminologies and their Automatic Management

Abstract

The paper describes current work on the generation of a thesaurus format from the schedules of the Bliss Bibliographic Classification 2nd edition (BC2). The practical problems that occur in moving from a concept-based approach to a terminological approach cluster around issues of vocabulary control that are not fully addressed in a systematic structure. These difficulties can be exacerbated within domains in the humanities because large numbers of culture-specific terms may need to be accommodated in any thesaurus. The ways in which these problems can be resolved within the context of a semi-automated approach to thesaurus generation have consequences for the management of classification data in the source vocabulary. The way in which the vocabulary is marked up for the purpose of machine manipulation is described, some of the implications for editorial policy are discussed, and examples are given. The value of the classification notation as a language-independent representation and mapping tool should not be sacrificed in such an exercise.

Facet analysis as a general tool for vocabulary construction

The value of using a faceted classification as a basis for a thesaurus has long been acknowledged, primarily in the work of Jean Aitchison (1986), who pioneered the methodology and built several subject-specific thesauri (Aitchison et al., 1992) using BC2 schedules as a basis.¹ The role that can be played by facet analysis in the construction of thesauri is now formally acknowledged in recent revisions of the thesaurus standards.
For example, in the new British Standard, BS8723, the value of facet analysis as a general methodology is explicitly stated:

Facet analysis is useful in generating hierarchies that conform to the rules for hierarchical relationships … because these relationships are valid only for terms belonging to the same general category. … The choice of facets may vary … but … it is usual to use fundamental categories such as objects, materials, agents, actions, places, times, etc. These fundamental facets may be analysed into subfacets where it is helpful to do so … (BS8723-2:2005, p. 31)

This is followed by further discussion of the detailed analysis, and accompanied by figures showing the faceted display with its subfacets (or arrays) and node labels (principles of division). Facet analysis is considered at various other places in the standard (BS8723-2:2005, pp. 32, 36, 37, 38, 39, 41), usually as part of a process of thesaurus construction that follows the Aitchison model.

The current American standard (ANSI/NISO Z39.19-2005) also mentions facet analysis for the first time, although it is more cautious in its approach, stating only that facets may be useful (ANSI/NISO 2005, pp. 14–15). Although facet analysis is understood differently in the US (ANSI/NISO 2005, p. 14), it is clear that, in practice, it is applied in much the same way as in the UK, and the Art & Architecture Thesaurus is used to provide examples. Facet analysis is mentioned at several points in the text, including the observation that it is typically used for ‘large controlled vocabularies covering a broad domain or discipline with complex relationships among terms’ (ANSI/NISO 2005, p. 141).

1. These included not only the international affairs thesaurus cited, but also the Department of Health and Social Services thesaurus, which was based on the Health Sciences Class H of BC2.

The faceted classification, because of the rigorous analytical principles used in its construction, makes explicit most of the relationships employed in the standard thesaurus