SHORT COMMUNICATION

The Human Contactin-Associated Protein-like 2 Gene (CNTNAP2) Spans over 2 Mb of DNA at Chromosome 7q35

Kazuhiko Nakabayashi and Stephen W. Scherer¹

Department of Genetics, The Hospital for Sick Children, and Department of Molecular and Medical Genetics, University of Toronto, Toronto, Ontario M5G 1X8, Canada

Received November 7, 2000; accepted January 23, 2001

Contactin-associated genes are members of the neurexin superfamily that encode a group of transmembrane proteins that mediate cell–cell interactions in the nervous system. To study the human contactin-associated protein-like 2 gene (CNTNAP2), we have determined its complete DNA sequence and its genomic organization to comprise 25 exons spanning greater than 2.0 Mb of DNA at 7q35. Our results indicate that CNTNAP2 encompasses almost 1.5% of chromosome 7 and is one of the largest genes in the human genome. © 2001 Academic Press

The neurexin (NRXN) gene family encodes cell-surface proteins that mediate neuronal cell–cell interactions (4). The gene family includes NRXN1, NRXN2, NRXN3, and two genes encoding contactin-associated proteins, CNTNAP1 (also called CASPR or NRXN4) and CNTNAP2 (also called CASPR2) (4, 6). Numerous neurexin isoforms are generated from the three NRXN genes, and in combination with the CNTNAP1 and CNTNAP2 proteins they facilitate neuronal adhesion and signaling (4). The CNTNAP2 protein was shown to be a component of the juxtaparanodal region in myelinated axons where it forms a complex with Shaker-like K⁺ channels (6). Radiation hybrid mapping data (http://www.ncbi.nlm.nih.gov/genemap98/) assigned ESTs specific for CNTNAP2 to the D7S688–D7S505 interval. In our effort to construct a gene map of human chromosome 7 and to assess possible involvement of CNTNAP2 in disease, we determined its DNA sequence and genomic structure. Through our analysis we identified a full-length CNTNAP2 gene different from one described previously.
Our data revealed that CNTNAP2 was unusually large, spanning the majority of the q35.1–q35.2 Giemsa dark band on chromosome 7. As such, CNTNAP2 may represent a positional candidate gene for the DFNB13 form of nonsyndromic deafness that maps to 7q34–q36 (5).

For our initial analysis, the cDNA sequence from CNTNAP2 (NM_014141) and AK000960 in Unigene Hs.106552 was used for comparison to available genomic DNA sequence. Nucleotides 1–384 of NM_014141 showed 100% identity to the genomic DNA sequence (AC078937), which can be mapped unambiguously to 7q31 (see Fig. 1). The remainder of NM_014141 (nt 385–5132) corresponded to sequences mapped to 7q35 (Figs. 1 and 2). In an attempt to confirm that nucleotides 1–384 of NM_014141 were indeed part of CNTNAP2 we performed 5' RACE using fetal brain RNA. Our results consistently produced products that corresponded to a consensus 363 bp sequence that, as expected, aligned to nucleotides 385–639 of NM_014141 but never to nucleotides 1–384. Moreover, our newly identified cDNA sequence (nucleotides 1–108 of the 5' RACE product) was contiguous to nucleotides 385–513 of NM_014141 at 7q35 (Fig. 1). This suggested that the first 384 bp of NM_014141 are not derived from CNTNAP2 and most likely represent a chimeric cDNA product.

We then assembled all available CNTNAP2 cDNA sequence originating from 7q35 into an 8107-bp consensus (GenBank Accession No. AF319045) that contains the apparent complete ORF of 1331 amino acids and a 5' and 3' UTR. Northern blot results indicated mRNA sizes of approximately 8–9 kb (our data and also Ref. 6), suggesting that this consensus DNA sequence is likely complete. Analysis of ESTs in Unigene Hs.106552 identified a putative isoform in the 3' UTR (data not shown), but did not identify significant alternatively spliced cDNAs in the protein-coding sequence of CNTNAP2.
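The reasoning behind the chimera call and the size of the consensus can be restated as simple bookkeeping. The Python sketch below is illustrative only and is not the procedure used in this work, where the 5' end was tested experimentally by 5' RACE. It takes the NM_014141 coordinates and band assignments quoted in the text, flags any segment that maps away from the band covering most of the cDNA, and then computes how much of the 8107-bp consensus is left for the UTRs. The `Segment` class, the `flag_chimeric_segments` helper, the majority-band rule, and the inclusion of one stop codon in the coding length are all assumptions introduced for this example.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """A stretch of cDNA (1-based, inclusive) and the band its genomic match maps to."""
    start: int
    end: int
    band: str

    @property
    def length(self) -> int:
        return self.end - self.start + 1


def flag_chimeric_segments(segments):
    """Return segments mapping to a band other than the one covering most bases.

    The majority-band rule is a hypothetical heuristic chosen for illustration.
    """
    bases_per_band = Counter()
    for seg in segments:
        bases_per_band[seg.band] += seg.length
    majority_band, _ = bases_per_band.most_common(1)[0]
    return [seg for seg in segments if seg.band != majority_band]


# Coordinates and band assignments for NM_014141 as reported in the text.
nm_014141 = [
    Segment(1, 384, "7q31"),
    Segment(385, 5132, "7q35"),
]
suspect = flag_chimeric_segments(nm_014141)

# Length bookkeeping for the 8107-bp consensus (AF319045).
CONSENSUS_BP = 8107
ORF_AMINO_ACIDS = 1331
coding_bp = (ORF_AMINO_ACIDS + 1) * 3   # assumption: one stop codon is counted
utr_bp = CONSENSUS_BP - coding_bp       # combined 5' and 3' UTR length

print(suspect)
print(coding_bp, utr_bp)
```

Under these assumptions the coding region accounts for roughly half of the consensus, with the remainder in the UTRs. The split between the 5' and 3' UTR is not given in the text and is not computed here.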
Comparison of the 8107-bp consensus cDNA to genomic DNA sequence enabled us to define 21 of the 25 exons of CNTNAP2 (Table 1, Fig. 2). However, two segments of cDNA (nt 2396–2523 and 3151–3615) did not correspond exactly to genomic sequence. BAC clones NH1080A16 and CIT-HSP2025B7 were identi-

Sequence data from this article have been deposited with the GenBank Data Library under Accession No. AF319045.

¹ To whom correspondence should be addressed at the Department of Genetics, Room 9107, The Hospital for Sick Children, 555 University Avenue, Toronto, ON M5G 1X8, Canada. Telephone: (416) 813-7613. Fax: (416) 813-8319. E-mail: steve@genet.sickkids.on.ca.

Genomics 73, 108–112 (2001). doi:10.1006/geno.2001.6517. All articles available online at http://www.idealibrary.com. 0888-7543/01 $35.00. Copyright © 2001 by Academic Press. All rights of reproduction in any form reserved.