Pak. J. Bot., 46(5): 1865-1870, 2014.

ASSESSMENT OF GENETIC DIVERGENCE IN TOMATO THROUGH AGGLOMERATIVE HIERARCHICAL CLUSTERING AND PRINCIPAL COMPONENT ANALYSIS

QUMER IQBAL*, MUHAMMAD YUSSOUF SALEEM, AMJAD HAMEED AND MUHAMMAD ASGHAR

¹Nuclear Institute for Agriculture and Biology (NIAB), P.O. Box 128, Jhang Road, Faisalabad, Pakistan
*Corresponding author email: qumerhort@gmail.com

Abstract

The existence of variability is of prime importance in plant breeding for the improvement of qualitative and quantitative traits. Data on different morphological and reproductive traits of 47 tomato genotypes were analyzed by correlation, agglomerative hierarchical clustering and principal component analysis (PCA) to select genotypes and traits for a future breeding program. Correlation analysis revealed a significant positive association between yield and yield components such as fruit diameter, single fruit weight and number of fruits plant⁻¹. Principal component (PC) analysis showed that the first three PCs, each with an eigenvalue higher than 1, contributed 81.72% of the total variability for the different traits. PC-I showed positive factor loadings for all traits except number of fruits plant⁻¹. The contribution of single fruit weight and fruit diameter was highest in PC-I. Cluster analysis grouped all genotypes into five divergent clusters. The genotypes in cluster-II and cluster-V exhibited uniform maturity and higher yield. The D² statistics confirmed the highest distance between cluster-III and cluster-V, while maximum similarity was observed between cluster-II and cluster-III. It is therefore suggested that crosses between genotypes of cluster-II and cluster-V and those of cluster-I and cluster-III may exhibit heterosis in F₁ for hybrid breeding, and may allow selection of superior genotypes in succeeding generations for a cross breeding programme.

Key words: Tomato germplasm/variety, Genetic diversity, Correlation, Fruit yield.
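For reference, the two quantities the abstract relies on (the eigenvalue criterion for retaining PCs and the D² distance between clusters) can be written in their standard textbook form. This is only a sketch under stated assumptions, not necessarily the exact formulation used by the authors: it assumes PCA is run on the correlation matrix of p standardized traits, and the symbols p, λ, x̄ and S are notation introduced here for illustration.

```latex
% Assumed notation (not taken from the paper):
%   p                : number of standardized traits
%   \lambda_1,\dots,\lambda_p : eigenvalues of the trait correlation matrix
% For a correlation matrix the eigenvalues sum to p (its trace).
\[
  \text{share of total variance explained by PC}_k
  \;=\; \frac{\lambda_k}{\sum_{j=1}^{p} \lambda_j}
  \;=\; \frac{\lambda_k}{p},
  \qquad
  \text{retain PC}_k \text{ if } \lambda_k > 1 .
\]

% Mahalanobis D^2 between groups (clusters) i and j:
%   \bar{\mathbf{x}}_i : vector of trait means of group i
%   \mathbf{S}         : pooled within-group covariance matrix (assumed choice)
\[
  D^2_{ij} \;=\;
  (\bar{\mathbf{x}}_i - \bar{\mathbf{x}}_j)^{\top}
  \, \mathbf{S}^{-1} \,
  (\bar{\mathbf{x}}_i - \bar{\mathbf{x}}_j) .
\]
```

Under these assumptions, an eigenvalue above 1 means a component carries more variance than any single standardized trait, and a larger D² between two clusters indicates greater multivariate divergence between their mean trait profiles.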
Introduction

Tomato (Lycopersicon esculentum Mill., 2n=2x=24) is one of the most important Solanaceous vegetable crops grown all over the world. It is versatile and is used for various cooking purposes; it can be processed into puree, paste, ketchup, sauce, soup, etc. The average yield of tomato in Pakistan is very low, about 10.1 tonnes per hectare (Anon., 2011a), compared with 33.6 tonnes per hectare under modern agricultural systems of tomato production in the world (Anon., 2011b). Besides yield-limiting factors, the lack of information on genetic diversity and adaptability leads to a poor choice of parents for hybridization programs. Consequently, the hybrids (F₁s) or recombinants (selected at F₂ or later generations) very often do not express the full spectrum of the genetic trait(s) of interest, owing to a limited genetic base and inappropriate selection of the parents. This problem can only be overcome if breeders have substantial information on the genetic diversity of the source population.

Knowledge about levels and patterns of genetic diversity is very important for diverse applications in plant breeding. Such studies focus on the degree of similarity or dissimilarity among genetic resources (Reif et al., 2005; Rashid et al., 2008; San-San-Yi et al., 2008), which guides the organization of gene banks and the identification of the best parental combinations. Following hybridization, these parental combinations can possibly produce progenies with elevated genetic variability, thereby increasing the chances of creating superior genotypes with traits of interest (Mohammadi & Prasanna, 2003; Crossa & Franco, 2004). In tomato, yield is the cumulative effect of many components, each contributing individually to yield (Bernousi et al., 2011).
Different characteristics, viz. number of flowers cluster⁻¹, days to first fruit ripening, fruit weight, fruit length and fruit width, assume vital importance and must be assessed for genetic divergence when the aim is to develop high-yielding tomato varieties or hybrids. The most commonly used methods for this purpose are canonical variable analysis, principal component analysis and clustering methods (Mohammadi & Prasanna, 2003; Sudre et al., 2007). Principal component analysis is frequently used to determine the relative significance of different classification variables prior to cluster analysis (Jackson, 1991). PCA also gives a reduced-dimension model that points out the measured differences among groups and aids understanding of the variables by showing how much of the total variance each component explains. The Mahalanobis D² statistic is a powerful tool for measuring divergence among a set of populations on the basis of statistical distance using multivariate measurements.

The present study was therefore conducted to categorize the available germplasm into separate clusters or groups on the basis of genetic diversity among their morphological attributes, using agglomerative hierarchical clustering and principal component analysis. Once the analysis has been performed, the desirable groups of genotypes can be crossed with confidence to develop either open-pollinated or hybrid varieties on a commercial scale.

Materials and Methods

Forty-four exotic tomato genotypes collected from the Tomato Genetics Resource Center (TGRC), along with three local varieties (Pakit, Galia and Naqeeb), were grown in the tomato experimental field of the Nuclear Institute for Agriculture and Biology (NIAB), Faisalabad, Pakistan, in a Randomized Complete Block Design with 2 replications. Five- to six-inch nursery seedlings were transplanted to the field, keeping plant-to-plant and bed-to-bed distances of 50 cm and 1.5 m, respectively.
Seven plants of each genotype per replication were grown under standard agronomic and plant protection practices to maintain a healthy crop. The