Amazonian Journal of Plant Research
©2017 Universidade Federal do Pará, Faculdade de Engenharia Agronômica
This paper is available online free of all access charges
http://www.ajpr.online - Amaz. Jour. of Plant Resear. 3(1):276-289. 2019
Jorge G. Aguilera. E-mail: j51173@yahoo.com

Original Paper

The combination of data as a strategy to determine the diversity of tomato subsamples

Jorge G. Aguilera¹, Bruno G. Marim¹, Tesfahun A. Setotaw¹,², Alan M. Zuffo³, Carlos Nick⁴ and Derly J. H. da Silva⁴

¹ Postgraduate in Genetics and Breeding, UFV, Viçosa, MG, Brazil
² Kulumsa Agricultural Research Center, Assela, Ethiopia
³ Federal University of Mato Grosso do Sul, Chapadão do Sul Unit - Rod MS 306, Km 105, mailbox 112, 79560-000, Chapadão do Sul, MS, Brazil
⁴ Department of Plant Science, UFV, Viçosa, MG, Brazil

Received: 19 November, 2018. Accepted: 16 March, 2019
First published on the web August, 2019
Doi: 10.26545/ajpr.2019.b00035x

Abstract

Estimating genetic diversity from qualitative, quantitative, and molecular data, alone and in combination, is important for characterizing germplasm collections for pre-breeding purposes, mainly for identifying divergent parents. For this purpose, we assessed a population of 94 tomato subsamples from the UFV Vegetable Germplasm Bank (BGH-UFV) using 10 ISSR markers and agronomic data (three qualitative and six quantitative traits). All three data classes revealed genetic diversity in the germplasm. Principal coordinates analysis (PCoA) confirmed the genetic variability of the subsamples, with the first two principal coordinates explaining 27% of the variability. Bayesian clustering analysis with the STRUCTURE software verified that the population is structured, comprising three populations. The Mantel test of correlation among the three data classes showed a highly significant correlation (r = 0.31, P < 0.001) between the quantitative and molecular data.
Tocher clustering of each dissimilarity matrix showed that the clustering patterns depended on the data class. Based on these results, it is possible to predict the best combinations of parents that can provide maximum gain in a breeding program. In addition, the combined use of quantitative, qualitative, and molecular data, analyzed with multivariate and Bayesian clustering methods, is an efficient way to study the genetic diversity of tomato plants in the germplasm bank.

Key-words: Solanum lycopersicum, ISSR, Quantitative and Qualitative Data, Sum of Matrices, Population Structure.

Introduction

Tomato (Solanum lycopersicum L.) originated in South America, and research indicates that the species was already cultivated by the Incas and Aztecs about 1300 years ago. Bolivia, Chile, Ecuador and Peru are considered centers of diversity of this vegetable (Currence, 1963). Tomato can be grown in tropical and subtropical regions worldwide, both for in natura consumption and for the processing industry, and it stands out as the second most grown vegetable in the world, surpassed only by the potato (FAO, 2018). The great variability in the Lycopersicon genus has allowed the development of cultivars to meet the most diverse market demands.

The Federal University of Viçosa (UFV) has a germplasm collection with about 860 tomato subsamples from six different species (Silva et al., 2001). This collection is the genetic basis for UFV tomato pre-breeding programs and has been widely used to search for genes that confer resistance