Using PAC nested deletions to order contigs and microsatellite markers at the high repetitive sequence containing Npr3 gene locus

Rodney C. Gilmore a, Joseph Baker Jr. a, Sean Dempsey a, Rosemarie Marchan a, Robert N.L. Corprew Jr. a, Goldie Byrd b, Nobuyo Maeda c, Oliver Smithies c, Richard D. Bukoski a, Ken R. Harewood a, Pradeep K. Chatterjee a,*

a Julius L. Chambers Biomedical/Biotechnology Research Institute, North Carolina Central University, 1801 Fayetteville Street, Durham, NC 27707, USA
b Department of Biology, North Carolina Central University, 1801 Fayetteville Street, Durham, NC 27707, USA
c Department of Pathology and Laboratory Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA

Received 30 May 2001; received in revised form 18 July 2001; accepted 1 August 2001
Received by J.A. Engler

Abstract

Highly polymorphic di- and tetranucleotide repeats in and around Npr3, a potential candidate gene for hypertension, have been identified using a novel approach. Because this chromosomal site is rich in repetitive DNA and difficult to sequence, P1 artificial chromosomes were retrofitted with a loxP transposon to map the gene sequence within a clone using a series of nested deletions. Sequences from ends of deletions 1–3 kb apart identified a (CA)20 and a (TA)18-(CA)8 repeat 8 kb upstream and within an intron of Npr3, respectively. DNA from 17 individuals was analyzed for length polymorphisms in these and eight additional repeats identified in 200 kb of working draft sequence from this region in GenBank. The sequence contigs and microsatellite repeats from GenBank were ordered using the P1-derived artificial chromosome deletion series. Several of these repeats were found to vary considerably in length in the set of genomic DNA tested.
Since this site in chromosome 5p has recently been implicated in disease in studies with genetically hypertensive rats, the microsatellite markers reported here will be useful for genetic analysis and may even be implicated in the disease process in humans. We discuss how these types of data are useful for interpreting draft DNA sequence coming out of the genome projects, and the utility of deletion clones as a resource for ordering contigs and gap filling. © 2001 Elsevier Science B.V. All rights reserved.

Keywords: Polymorphic markers; Npr3 gene locus; Gap filling

1. Introduction

Building large contiguous segments of DNA sequence using an overlap of shotgun reads is most efficient in areas of the genome that are relatively free of repetitive sequences (Dunham et al., 1999). Chromosomal regions high in repetitive DNA content require iterative cycles of directed sequencing to remove ambiguity in aligning sequence reads. Contigs assembled from these regions are usually short, and ordering them is tedious. Microsatellite repeats and other polymorphic markers identified in draft sequences from such regions are less useful because their location relative to a gene is unknown. It is desirable therefore to develop alternative approaches that can order short sequence contigs independent of their overlap and estimate gap sizes such that the full potential of genetic markers can be realized prior to obtaining a finished sequence of the region. A dense, evenly spaced nested deletion series generated in bacterial artificial chromosomes (BACs)/P1-derived artificial chromosomes (PACs) spanning a locus of interest appears ideally suited for the purpose since sequences from ends of deletions can be ordered by sizing the clones using FIGE.
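
The ordering principle can be illustrated with a minimal sketch. It assumes that every deletion shares one fixed end, so that the FIGE size of the retained clone serves as the position of the deletion endpoint, and that each end sequence has already been matched to a draft contig. The function names, contig names and sizes below are hypothetical and are not taken from this study; the sketch also assumes that end sequences from different contigs do not interleave.

```python
from collections import defaultdict

def contig_spans(deletion_ends):
    """Map each contig to the (min, max) clone size in kb of the deletion ends matching it."""
    sizes = defaultdict(list)
    for clone_size_kb, contig in deletion_ends:
        sizes[contig].append(clone_size_kb)
    return {contig: (min(v), max(v)) for contig, v in sizes.items()}

def order_contigs(deletion_ends):
    """Order contigs by the smallest clone whose deletion end falls within them."""
    spans = contig_spans(deletion_ends)
    return sorted(spans, key=lambda contig: spans[contig][0])

def gap_upper_bounds(deletion_ends):
    """Upper bound (kb) on the gap between adjacent contigs.

    The bound is the distance between the outermost deletion ends observed
    in two neighbouring contigs; the true gap is smaller, because each
    contig extends beyond the ends that happened to be sampled.
    """
    spans = contig_spans(deletion_ends)
    ordered = order_contigs(deletion_ends)
    return [(left, right, spans[right][0] - spans[left][1])
            for left, right in zip(ordered, ordered[1:])]

# Hypothetical input: (clone size in kb from FIGE, contig matched by the end sequence)
example_ends = [
    (112.0, "contig_C"), (35.5, "contig_A"), (38.0, "contig_A"),
    (71.0, "contig_B"), (74.5, "contig_B"), (109.0, "contig_C"),
]

print(order_contigs(example_ends))     # ['contig_A', 'contig_B', 'contig_C']
print(gap_upper_bounds(example_ends))  # [('contig_A', 'contig_B', 33.0), ('contig_B', 'contig_C', 34.5)]
```

A denser deletion series tightens these bounds, since more endpoints fall near the edges of each contig.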
Here we explore these objectives in the repetitive sequence-rich human natriuretic peptide clearance receptor (Npr3) gene locus, and illustrate how nested deletions can be used to order contigs and several di- and tetranucleotide repeats, and to size gaps in draft sequences from this region. To identify polymorphic markers suitable for linkage analysis of essential hypertension with genes such as Npr3, and with genes yet to be identified in this region of Chr 5p, we tested the length variation of several of these microsatellite markers in genomic DNA from 17 individuals.

Gene 275 (2001) 65–72. PII: S0378-1119(01)00654-0. www.elsevier.com/locate/gene

GenBank Accession Numbers AZ303311–AZ303372.
Abbreviations: FIGE, field inversion gel electrophoresis; IPTG, isopropyl-β-D-thiogalactopyranoside; SNP, single nucleotide polymorphism
* Corresponding author. Tel.: +1-919-530-7017; fax: +1-919-530-7998. E-mail address: pchatterjee@wpo.nccu.edu (P.K. Chatterjee).