Fuzzy Sets and Systems 157 (2006) 2347–2355
www.elsevier.com/locate/fss

Pseudometrics from three-positive semidefinite similarities

M. Santos Tomás a,*, Claudi Alsina a, Jaime Rubio-Martinez b

a Sec. Matemàtiques ETSAB, Universitat Politècnica de Catalunya (UPC), Av. Diagonal 649, E-08028 Barcelona, Spain
b Departament de Química Física, Universitat de Barcelona (UB), Martí i Franqués 1, E-08028 Barcelona, Spain

Received 5 September 2005; received in revised form 27 December 2005; accepted 28 February 2006
Available online 3 April 2006

Abstract

We prove that applying certain transformations to three-positive semidefinite similarities yields a pseudometric. In addition, we demonstrate that some similarity coefficients commonly employed in diversity studies fulfil this condition.
© 2006 Elsevier B.V. All rights reserved.

Keywords: Pseudometric; Metric; Dissimilarity; Similarity; Tanimoto; Dice; Cosinus

1. Introduction

1.1. The identification of homogeneous subgroups within a collection of heterogeneous objects is one of the most common tasks in computing. One of the principal reasons for the growing interest in these methods is their use in combinatorial chemistry, where large libraries of compounds are designed in order to find new compounds with drug properties. As the size of those libraries is usually unmanageable, it is necessary to select a subset that captures as much diversity as possible without redundancy. It is generally believed that starting with a library containing a diverse set of compounds offers the best chance of finding active compounds. For this reason, it is necessary to quantify the degree of resemblance between all possible pairs in order to find those that are more similar. To achieve this goal, the use of a similarity measure is necessary.
A molecular similarity measure involves at least two principal components: (1) the representation, used to characterize the molecules that will be compared, and (2) the similarity coefficient, used as a quantitative measure of the degree of resemblance between pairs of such representations [9]. Compounds belonging to a chemical library or, in general, objects in a group $G$, can be described by $n$ attributes or descriptors in such a way that a vector $X_i = \{x_{1i}, x_{2i}, \ldots, x_{ni}\}$, $X_i \in G$, defines the position of each object in this $n$-dimensional space. Different sets of descriptors generate different representations of the group. Descriptors may be binary (i.e. dichotomous) or real numbers describing different properties of the objects.

* Corresponding author. Tel.: +34 93 4016373; fax: +34 93 4016372.
E-mail address: maria.santos.tomas@upc.edu (M.S. Tomás).
0165-0114/$ - see front matter © 2006 Elsevier B.V. All rights reserved.
doi:10.1016/j.fss.2006.02.009
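To make the two components of a similarity measure concrete, the following minimal Python sketch represents two objects as descriptor vectors and evaluates three coefficients on them. The coefficients are the ones named in the keywords (Tanimoto, Dice, cosine); the formulas used are their commonly cited forms for real-valued vectors and are an assumption here, since this section does not state them. The function names and the example vectors `x1` and `x2` are hypothetical and purely illustrative.

```python
from math import sqrt


def _dot(x, y):
    """Inner product of two descriptor vectors of equal length."""
    if len(x) != len(y):
        raise ValueError("descriptor vectors must have the same length")
    return sum(a * b for a, b in zip(x, y))


def tanimoto(x, y):
    """Tanimoto coefficient (assumed form): <x,y> / (|x|^2 + |y|^2 - <x,y>)."""
    xy = _dot(x, y)
    denom = _dot(x, x) + _dot(y, y) - xy
    if denom == 0:
        raise ValueError("coefficient undefined when both vectors are zero")
    return xy / denom


def dice(x, y):
    """Dice coefficient (assumed form): 2<x,y> / (|x|^2 + |y|^2)."""
    denom = _dot(x, x) + _dot(y, y)
    if denom == 0:
        raise ValueError("coefficient undefined when both vectors are zero")
    return 2 * _dot(x, y) / denom


def cosine(x, y):
    """Cosine coefficient (assumed form): <x,y> / (|x| |y|)."""
    denom = sqrt(_dot(x, x) * _dot(y, y))
    if denom == 0:
        raise ValueError("coefficient undefined when a vector is zero")
    return _dot(x, y) / denom


# Two hypothetical objects described by n = 5 binary (dichotomous) descriptors.
x1 = [1, 1, 0, 1, 0]
x2 = [1, 0, 0, 1, 1]

for name, coeff in (("Tanimoto", tanimoto), ("Dice", dice), ("cosine", cosine)):
    print(f"{name}: {coeff(x1, x2):.4f}")
```

The same functions accept real-valued descriptors unchanged, which mirrors the remark above that descriptors may be either binary or real numbers; only the representation changes, not the coefficient.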