Mutation Research 430 (1999) 55–74
www.elsevier.com/locate/molmut
Community address: www.elsevier.com/locate/mutres

Similarity pattern analysis in mutational distributions

Nikita N. Khromov-Borisov a,*,1, Igor B. Rogozin b, João Antonio Pêgas Henriques a, Frederick J. de Serres c

a GENOTOX, Laboratório de Genotoxicidade, Centro de Biotecnologia, and Departamento de Biofísica, Universidade Federal do Rio Grande do Sul, Bloco IV, Prédio 43.421, Caixa Postal 15.005, Campus do Vale / UFRGS, Porto Alegre, RS CEP 91501-970, Brazil
b Institute of Cytology and Genetics, Russian Academy of Sciences, Novosibirsk, Russia
c Laboratory of Toxicology, System Toxicology Branch, Environmental Toxicology Program, National Institute of Environmental Health Sciences, PO Box 12233, Research Triangle Park, NC 27709-2233, USA

Received 17 June 1999; received in revised form 12 August 1999; accepted 13 August 1999

Abstract

The validity and applicability of the statistical procedure, similarity pattern analysis (SPAN), to the study of mutational distributions (MDs) were demonstrated with two sets of data. The first was mutational spectra (MS) for 697 GC to AT transitions produced with eight alkylating agents (AAs) in the lacI gene of Escherichia coli. The second was recently summarized data on the distributions of 11562 spontaneous, radiation- and chemical-induced forward mutations in the ad-3 region of heterokaryon 12 of Neurospora crassa. They were analyzed as large two-way contingency tables (CTs) where two kinds of profiles were compared: site (or genotypic class) profiles and origin (or mutagen) profiles. To measure similarity (homogeneity) between any pair of profiles, the relevant sufficient statistic, the Kastenbaum–Hirotsu squared distance (KHi²), was used. Collapsing the similar profiles into distinct internally homogeneous clusters named 'collapsets' revealed their similarity pattern.
To facilitate the procedure, the computer program COLLAPSE was developed. The results of SPAN for the lacI spectra were found comparable with the results of their previous analysis with two multivariate statistical methods, factor and cluster analyses. In the ad-3 data set, five collapsets were revealed among origin profiles (OPs): (I) ENU = 4NQO = 4HAQO = FANFT = SQ18506; (II) AF-2 = EI = MMS = DEP; (III) ETO = UV; (IV) AHA = PROCARB; and (V) He ions = protons. Moreover, the previous observation that MDs are dose-dependent was confirmed for X-ray-induced MDs. Profiles induced with the low doses of X-rays are similar to those induced with 85Sr, and profiles induced with the medium X-ray doses are similar to those induced with protons and He ions. The evaluated similarities appear to be rather reasonable: mutagens with a similar mode of action induce similar MDs. The similarity pattern revealed among genotypic class profiles (GCPs) also seems to be interpretable. When supplemented with descriptive cluster analysis, SPAN appears to be a fruitful methodology in MS analysis. © 1999 Elsevier Science B.V. All rights reserved.

Keywords: Mutational spectra; Mutational distributions; SPAN; Similarity pattern analysis; Contingency table collapsing; Kastenbaum–Hirotsu squared distance; Escherichia coli; lacI gene; Neurospora crassa; ad-3 forward mutation assay; Spontaneous mutations; Chemical mutagens; Non-ionizing radiation; Ionizing radiation

* Corresponding author. Tel.: +55-51-316-7003; fax: +55-51-319-1079. E-mail address: nikita@dna.cbiot.ufrgs.br (N.N. Khromov-Borisov).
1 On leave from Biological Institute of Saint-Petersburg University, Russia.

0027-5107/99/$ - see front matter. PII: S0027-5107(99)00148-7