630 NATURE BIOTECHNOLOGY VOL 18 JUNE 2000 http://biotech.nature.com
RESEARCH ARTICLES

After the first complete sequence of a human genome is obtained, the next challenge will be to discover and understand the function and variation of genes and, ultimately, to understand how such qualities affect health and disease [1,2]. A key to this undertaking will be the availability of methods for efficient and accurate identification of genetic variation and expression patterns among large sets of genes [2]. Several powerful techniques have been developed for such analyses that depend either on specific hybridization of probes to microarrays [3,4] or on the counting of tags or signatures of DNA fragments [5–8]. Whereas the former provides the advantages of scale and the capability of detecting a wide range of gene expression levels, such measurements are subject to variability relating to probe hybridization differences and cross-reactivity, element-to-element differences within microarrays, and microarray-to-microarray differences [9–11]. On the other hand, the latter methods, which provide digital representations of abundance, are statistically more robust; they do not require repetition or standardization of counting experiments (since counting statistics are well modeled by the Poisson distribution), and the precision and accuracy of relative abundance measurements may be increased by increasing the size of the sample of tags or signatures counted [9]. Unfortunately, however, this property is difficult to realize routinely because of the cost and scale of effort required.

To address some of these problems, we describe a method for sequencing DNA that does not require physical separation of fragments and show how combining it with in vitro cloning of DNA templates on microbeads [12] results in a robust new analytical platform for genomic analysis.
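The link between sample size and precision noted above follows from Poisson counting statistics: a signature counted k times has a standard deviation of about √k, so its relative error is roughly 1/√k. The following is a minimal sketch of that relationship, assuming a hypothetical transcript that makes up a fixed fraction of the library; the function name, the abundance value, and the sample sizes are illustrative and are not taken from this study.

```python
import math


def poisson_relative_error(abundance: float, sample_size: int) -> float:
    """Approximate relative error (std / mean) of a signature count.

    Assumes the count is Poisson distributed with mean
    abundance * sample_size, so the standard deviation is sqrt(mean).
    """
    expected = abundance * sample_size
    if expected <= 0:
        raise ValueError("expected count must be positive")
    return 1.0 / math.sqrt(expected)


# Hypothetical transcript making up 1 in 10,000 of all signatures.
abundance = 1e-4
for n in (10_000, 100_000, 1_000_000):
    err = poisson_relative_error(abundance, n)
    print(f"{n:>9} signatures: expected count {abundance * n:6.0f}, "
          f"relative error ~{err:.0%}")
```

Under this model, counting 100 times more signatures reduces the relative error of a given transcript's count by a factor of 10, which is why deep sampling matters most for transcripts expressed at low levels.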
The power of this approach, which we refer to as massively parallel signature sequencing (MPSS) analysis, resides in the ability to conveniently handle complex mixtures of nucleic acid fragments by in vitro cloning of constituent fragments onto microbeads in sufficient quantities to conduct and monitor biochemical or enzymatic reactions by fluorescent probes. We show that multiple cycles of a ligation-based DNA sequencing method can be simultaneously carried out on a million microbeads, each having copies of a single template attached, to generate millions of signature sequences. Template-containing microbeads are assembled in a flow cell that constrains the microbeads to form a closely packed planar array that remains fixed as sequencing reagents are pumped through the flow cell. Sequencing progress is monitored optically by collecting and imaging fluorescent signals generated by the entire microbead array onto a CCD detector followed by image processing.

We show how MPSS analysis can be used to simultaneously acquire in a single operation hundreds of thousands of signature sequences from a yeast cDNA library, and we validate the accuracy of the signatures by comparison with the known genome sequence of Saccharomyces cerevisiae. We also demonstrate the technique’s potential for gene expression analysis by comparing expression levels of genes of the human acute monocytic leukemia cell line, THP-1, measured by MPSS analysis and by conventional sequencing.

Results

In vitro cloning on microbeads. Before sequencing, templates are “cloned” on microbeads by first generating a complex mixture of conjugates between the templates and oligonucleotide tags, where the number of different oligonucleotide tags is at least a hundred times larger than the number of templates. For example, in the present

Gene expression analysis by massively parallel signature sequencing (MPSS) on microbead arrays

Sydney Brenner*, Maria Johnson, John Bridgham, George Golda, David H. Lloyd, Davida Johnson, Shujun Luo, Sarah McCurdy, Michael Foy, Mark Ewan, Rithy Roth, Dave George, Sam Eletr, Glenn Albrecht, Eric Vermaas, Steven R. Williams, Keith Moon, Timothy Burcham, Michael Pallas, Robert B. DuBridge, James Kirchner, Karen Fearon, Jen-i Mao, and Kevin Corcoran

Lynx Therapeutics, Inc., 25861 Industrial Blvd., Hayward, California 94545
*Corresponding author (e-mail: sbrenner@lynxgen.com).

Received 17 February 2000; accepted 19 April 2000

We describe a novel sequencing approach that combines non-gel-based signature sequencing with in vitro cloning of millions of templates on separate 5 μm diameter microbeads. After constructing a microbead library of DNA templates by in vitro cloning, we assembled a planar array of a million template-containing microbeads in a flow cell at a density greater than 3 × 10⁶ microbeads/cm². Sequences of the free ends of the cloned templates on each microbead were then simultaneously analyzed using a fluorescence-based signature sequencing method that does not require DNA fragment separation. Signature sequences of 16–20 bases were obtained by repeated cycles of enzymatic cleavage with a type IIs restriction endonuclease, adaptor ligation, and sequence interrogation by encoded hybridization probes. The approach was validated by sequencing over 269,000 signatures from two cDNA libraries constructed from a fully sequenced strain of Saccharomyces cerevisiae, and by measuring gene expression levels in the human cell line THP-1. The approach provides an unprecedented depth of analysis permitting application of powerful statistical techniques for discovery of functional relationships among genes, whether known or unknown beforehand, or whether expressed at high or very low levels.

Keywords: DNA sequencing, ligation, gene expression, fluid microarray, yeast

© 2000 Nature America Inc. • http://biotech.nature.com
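The array figures quoted in the abstract can be sanity-checked with simple geometry. The sketch below assumes ideal spheres of exactly 5 μm diameter lying in a single layer and compares two idealized packing models with the reported density; the packing models and function names are illustrative assumptions, not a description of the actual flow cell.

```python
import math

BEAD_DIAMETER_CM = 5e-4   # 5 μm bead diameter, from the abstract
REPORTED_DENSITY = 3e6    # microbeads/cm^2, lower bound from the abstract
N_BEADS = 1_000_000       # microbeads in the array, from the abstract


def square_packing_density(d_cm: float) -> float:
    """Beads per cm^2 when each bead occupies a d x d square (assumed model)."""
    return 1.0 / d_cm**2


def hexagonal_packing_density(d_cm: float) -> float:
    """Beads per cm^2 for ideal hexagonal close packing in one layer."""
    return 2.0 / (math.sqrt(3) * d_cm**2)


square = square_packing_density(BEAD_DIAMETER_CM)
hexagonal = hexagonal_packing_density(BEAD_DIAMETER_CM)
max_area_cm2 = N_BEADS / REPORTED_DENSITY  # upper bound on the array area

print(f"square packing limit:    {square:.2e} beads/cm^2")
print(f"hexagonal packing limit: {hexagonal:.2e} beads/cm^2")
print(f"array area at reported density: at most {max_area_cm2:.2f} cm^2")
```

Under these assumptions the ideal limits are about 4.0 × 10⁶ and 4.6 × 10⁶ beads/cm², so a density above 3 × 10⁶ microbeads/cm² is consistent with a closely packed monolayer, and a million beads occupy at most about a third of a square centimeter.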