Copy-number polymorphisms: mining the tip of an iceberg

Patrick G. Buckley*, Kiran K. Mantripragada*, Arkadiusz Piotrowski, Teresita Diaz de Ståhl and Jan P. Dumanski

Department of Genetics and Pathology, Rudbeck Laboratory, Uppsala University, 751 85 Uppsala, Sweden

Copy-number polymorphisms (CNPs) represent a greatly underestimated aspect of human genetic variation. Recently, two landmark studies reported genome-wide analyses of CNPs in normal individuals; they mark the beginning of an understanding of this type of large-scale variation. Future array-CGH-based CNP analyses should apply standard criteria on a common microarray platform. Only when parallel analyses of CNPs and SNPs are performed in an integrated format will we obtain a global picture of our genetic diversity.

Introduction

The study of human genetic variation at the DNA level constitutes a major challenge and has received considerable attention in the post-genomic era. The dominant type of variation explored so far in the genome has been single nucleotide polymorphisms (SNPs), overshadowing the issue of copy-number polymorphisms (CNPs) (gains and deletions) [1]. The current approach to studying genetic variation can be viewed as biased, in the sense that the identification of genome-wide large-scale CNPs is virtually untouched compared with the detailed analyses of millions of SNPs. We believe that analyses of both SNPs and CNPs are necessary to obtain a more complete picture of our genetic diversity.

A limited number of indels, the best-studied form of DNA copy-number variation, have previously been observed in the human genome, and several studies have ascertained their importance in health and disease [2–8]. For example, in a study of ovarian cancer cell lines, Lin et al. identified a 276-bp region of chromosome 22q13 that was deleted not only in 47% of ovarian cancer cell lines but also in 18% of constitutional DNA samples from healthy individuals [6].
Another study reported a 102-bp homozygous deletion on chromosome 8p12–21 in biliary tumors and pancreatic tumors (and cell lines) as well as in normal individuals [9]. Both of these studies concluded that the identified deletions might represent normal human genetic variation rather than cancer-associated aberrations. Interestingly, some indels provide a protective effect against disease in ‘normal’ (or unaffected) individuals. For example, the 32-bp deletion polymorphism of the C-C chemokine receptor 5 gene (CCR5) confers a reduced susceptibility to HIV-1 infection in homozygous individuals [10]. This demonstrates that the hidden functionality of such normal variation becomes apparent only when challenged by environmental factors.

The genome-wide detection of CNPs has been hampered by the lack of high-resolution and high-throughput techniques. A fundamental step towards identifying such variation was the development of microarray-based comparative genomic hybridization (array-CGH) [11,12]. This method is based on the assessment of fluorescence ratios between differentially labeled test and reference DNA, hybridized to a microarray [13,14]. Altered fluorescence ratios are therefore indicative of DNA copy-number imbalance (loss or gain) in the test versus the reference genome. Although many array-CGH studies focusing on tumor-associated gains or deletions (indicating the presence of activated oncogenes or inactivated tumor-suppressor genes) have been performed [15–18], few studies have addressed this type of variation in normal individuals.

Genome-wide array-based detection of CNPs

Recently, two landmark studies have reported the presence of CNPs in humans using different genome-wide array-CGH-based techniques [19,20]. Iafrate et al. used commercially available bacterial artificial chromosome (BAC) arrays, whereas Sebat et al. developed and applied a custom-made oligonucleotide array for the assessment of copy-number variation.
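As a rough illustration of the ratio-based readout of array-CGH described earlier in this article, the following Python sketch converts test and reference fluorescence intensities into log2 ratios and labels each array element as a gain, a loss or balanced. The ±0.3 cut-offs, the function names and the intensity values are assumptions made purely for illustration; they are not taken from the cited studies, which applied their own platform-specific criteria.

```python
import math

# Hypothetical cut-offs on the log2 ratio (assumed for illustration only;
# real analyses tune thresholds to the array platform and its noise level).
GAIN_THRESHOLD = 0.3
LOSS_THRESHOLD = -0.3


def log2_ratio(test_intensity, reference_intensity):
    """Return log2(test / reference) for a single array element."""
    if test_intensity <= 0 or reference_intensity <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(test_intensity / reference_intensity)


def call_copy_number(test_intensity, reference_intensity):
    """Label one array element as 'gain', 'loss' or 'balanced'."""
    ratio = log2_ratio(test_intensity, reference_intensity)
    if ratio >= GAIN_THRESHOLD:
        return "gain"
    if ratio <= LOSS_THRESHOLD:
        return "loss"
    return "balanced"


# Made-up intensities: element name -> (test signal, reference signal).
example_spots = {
    "clone_A": (1500.0, 1000.0),  # test signal 1.5x the reference
    "clone_B": (1000.0, 1000.0),  # equal signal
    "clone_C": (500.0, 1000.0),   # test signal half the reference
}

calls = {
    name: call_copy_number(test, ref)
    for name, (test, ref) in example_spots.items()
}
```

In this idealized example, a test signal 1.5 times the reference (as expected for three copies against two) gives a log2 ratio of about 0.58, whereas a halved signal (one copy against two) gives exactly -1. Real hybridization data are noisier and typically require normalization before any such thresholding.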
A brief comparison of these studies is presented in Table 1. Both studies convincingly demonstrate the presence of genomic imbalances among normal individuals. These imbalances overlap with genes, often coincide with segmental duplications in the genome, and can contribute to phenotypic variation and disease susceptibility. In these reports, Sebat et al. and Iafrate et al. identified 76 and 255 loci, respectively, that display copy-number variation in the human genome.

One of the disadvantages of both studies is the limited number of samples used to assess DNA copy-number variation. One would expect at least 100 individuals to be analyzed to assess the frequency of CNPs using standards similar to those applied to SNPs (i.e. a change detected in <1% of analyzed samples is considered a mutation, rather than a polymorphism). However, this suggested number might need to be revised, because the overall frequencies of private [i.e. unique to single individual (or kindred) or

Corresponding author: Dumanski, J.P. (jan.dumanski@genpat.uu.se).
* Both of these authors contributed equally.
Available online 26 April 2005
Update TRENDS in Genetics Vol.21 No.6 June 2005 315 www.sciencedirect.com