Multiplex Approach for Screening Genetic Markers of Microbial Indicators

Robert D. Stedtfeld 1, Sam Baushke 1, Dieter Tourlousse 1, Benli Chai 2, James R. Cole 3, Syed A. Hashsham 4*

ABSTRACT: Genetic markers are expected to provide better specificity in epidemiological studies and potentially serve as better indicators of waterborne pathogens. Methods used to assess genetic markers of emerging microbial indicators include pulsed field gel electrophoresis, polymerase chain reaction (PCR), and microarrays. This paper outlines a high-throughput approach to screen for such genetic markers using a set of theoretical and experimental screening tools. The theoretical screening involves evaluating genes related to the ribosomal RNA and specific functions from emerging indicator groups, followed by experimental validation with appropriate sampling schemes and high-throughput and economical screening methods, such as microarrays, real-time PCR, and on-chip PCR. Analysis of a wide range of samples covering temporal variability in location, host, and waterborne disease outbreaks is essential. The proposed approach is expected to shorten the time and cost associated with searching for new genetic markers of emerging indicators by at least 10-fold. Water Environ. Res., 79, 260 (2007).

KEYWORDS: indicator, pathogens, drinking water, on-chip polymerase chain reaction.

doi:10.2175/106143007X181378

Introduction

Specific markers to indicate the presence of human fecal matter and/or waterborne pathogens are needed to address the concerns associated with the current indicators (Bordalo et al., 2002; Chao et al., 2003; Engelbrecht et al., 1977, 1979). A number of methods focusing on the sequence variability in either the entire genome of a given isolate or in a subset of the genome have been used to explore various bacterial genera.
These methods include pulsed field gel electrophoresis (PFGE), denaturing gradient gel electrophoresis (DGGE), terminal restriction fragment length polymorphisms (T-RFLP), ribotyping, repetitive polymerase chain reaction (rep-PCR), PCR, real-time PCR, and microarrays (Table 1). Fingerprinting methods, such as PFGE, ribotyping, T-RFLP, and DGGE, have classified isolates into their respective hosts with a success rate of only 60 to 85%. PCR and real-time PCR assays targeting genes related to specific functions have proven more successful (e.g., the enterococcal surface protein gene, esp, in Enterococcus faecium, with a 97% success rate [Scott et al., 2005]). Many of these methods are marred by a high false-positive rate of classification (Stoeckel et al., 2004). All methods may benefit from a more comprehensive field evaluation, so that variability associated with spatial, temporal, and epidemiological factors can be fully characterized. Thus, the search for host-specific genetic markers is expected to continue.

Depending on the method used, the search for host-specific genetic markers involves one or more of the following:
(1) Assessing a large number of organisms for host specificity,
(2) Assessing spatial variability,
(3) Assessing the allelic variability within a given marker gene,
(4) Establishing association with the presence of virulence genes of pathogens during outbreaks,
(5) Monitoring the set of suspected target marker genes in a large number of environmental samples and hosts, and
(6) Establishing relationships with epidemiological data.
To establish the epidemiological context, marker and virulence genes associated with pathogens may be analyzed in parallel.
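As a rough illustration of step (1), the following Python sketch tallies how often a candidate marker is detected in isolates from a target host versus isolates from other hosts. It is not the authors' procedure: the `Isolate` record, the `marker_performance` function, and the example data are all hypothetical, and the "success rate" computed here (fraction of target-host isolates carrying the marker) is only one possible reading of the rates quoted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Isolate:
    """Hypothetical record for one screened isolate."""
    host: str              # known source host, e.g. "human" or "cattle"
    marker_detected: bool  # whether the candidate marker was detected


def marker_performance(isolates, target_host):
    """Return (success_rate, false_positive_rate) for one target host.

    success_rate: fraction of target-host isolates with the marker.
    false_positive_rate: fraction of non-target isolates with the marker.
    """
    target = [i for i in isolates if i.host == target_host]
    non_target = [i for i in isolates if i.host != target_host]
    if not target or not non_target:
        raise ValueError("need isolates from both target and non-target hosts")
    success_rate = sum(i.marker_detected for i in target) / len(target)
    false_positive_rate = sum(i.marker_detected for i in non_target) / len(non_target)
    return success_rate, false_positive_rate


# Made-up example data, for illustration only (not measured results).
example_isolates = [
    Isolate("human", True),
    Isolate("human", True),
    Isolate("human", True),
    Isolate("human", False),
    Isolate("cattle", False),
    Isolate("cattle", True),
    Isolate("swine", False),
    Isolate("swine", False),
]

success, false_pos = marker_performance(example_isolates, "human")
print(f"success rate: {success:.2f}, false-positive rate: {false_pos:.2f}")
```

Reporting the two rates separately matters because a marker can be detected in most target-host isolates and still be a poor indicator if it also appears frequently in other hosts.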
The number of samples and targets necessary for such an analysis requires high-throughput tools, such as microarrays (Hashsham, Wick, Rouillard, Gulari, and Tiedje, 2004), real-time PCR (Seurinck et al., 2005), on-chip PCR (Matsubara et al., 2005), and both phylogenetic and function-specific databases for efficient selection of gene sequences. Tools to theoretically analyze gene sequences for the presence of specific markers are evolving, with significantly better tools available for 16S and 23S rRNA genes than for genes related to other functions (Hashsham, Callister, and Tijdens, 2004). Tools to experimentally evaluate such marker genes are numerous and limited only by the extent of sampling effort (Schweitzer and Kingsmore, 2001).

This paper presents an approach for screening genetic markers using high-throughput genomic tools, namely PCR followed by microarrays and on-chip PCR. The overall scheme is illustrated in Figure 1. It consists of a theoretical screening of genetic markers, including design of primers and probes encompassing both 16S rRNA genes and genes related to specific functions, followed by high-throughput experimental validation using an appropriate number of environmental and outbreak samples. It is expected that, once the markers are found, methods for online monitoring will also emerge.

Theoretical Screening of Genetic Markers

Three different categories of genes can be used for high-throughput screening of genetic markers, including the following:
(1) 16S and 23S rRNA genes,
(2) Genes related to specific functions, and
(3) Virulence genes associated with pathogens.

1 Doctoral student, Department of Civil and Environmental Engineering, Michigan State University, East Lansing.
2 Specialist Programmer, Center for Microbial Ecology, Michigan State University.
3 Research Associate Professor, Center for Microbial Ecology, Michigan State University.
4 Associate Professor, Department of Civil and Environmental Engineering and the Center for Microbial Ecology, Michigan State University.
* A126 Research Complex-Engineering, Michigan State University, East Lansing, Michigan, 48824; e-mail: hashsham@egr.msu.edu.

260 Water Environment Research, Volume 79, Number 3