Analysis of aptamer sequence activity relationships†

Mark Platt,‡ab William Rowe,‡ab Joshua Knowles,ad Philip J. Day ac and Douglas B. Kell*ab

Received 26th August 2008, Accepted 15th October 2008
First published as an Advance Article on the web 12th November 2008
DOI: 10.1039/b814892a

DNA sequences that can bind selectively and specifically to target molecules are known as aptamers. Normally such binding analyses are performed using soluble aptamers. However, there is much to be gained by using an on-chip or microarray format, where a large number of aptameric DNA sequences can be interrogated simultaneously. To calibrate the system, known thrombin binding aptamers (TBAs) have been mutated systematically, producing large populations that allow exploration of key structural aspects of the overall binding motif. The ability to discriminate between background noise and low affinity binding aptamers can be problematic on arrays, and we use the mutated sequences to establish appropriate experimental conditions and their limitations for two commonly used fluorescence-based detection methods. Having optimized experimental conditions, high-density oligonucleotide microarrays were used to explore the entire loop–sequence–functionality relationship, creating a detailed model based on over 40 000 analyses, describing key features for quadruplex-forming sequences.

Introduction

The development of the technique known as SELEX [1,2], or systematic evolution of ligands by exponential enrichment, has yielded the ability to raise polynucleotides with high affinity and specificity to target molecules. These nucleotide sequences, known as aptamers, have been developed to bind targets ranging from small molecules to polypeptides and proteins [1–4]. With binding affinities comparable to those of biologically derived antibodies, the anthropogenic nature of aptamers means that they possess much greater flexibility in terms of applications, with uses encompassing diagnostics and therapeutics.
[5,6]

Aptamers with affinity to the coagulation protein thrombin were among the first raised to a protein target and to date perhaps represent the most comprehensively studied [4]. Sequencing of DNA aptamers derived from the SELEX process against thrombin reveals the dependency of binding on the consensus sequence GGtTGGN(2–5)GGTtGG [4,7]. Within this sequence mutual hydrogen bonding between the tetrad of guanine repeats leads to the formation of a unimolecular quadruplex structure in the presence of monovalent cations, as evidenced by NMR [7] and crystallographic structural studies [8] (see Fig. 1). This is not only of interest because it highlights the complex interplay between nucleic acid structure and binding affinity, but such structures offer biological significance, as the formation of G-quadruplexes within genomic DNA has been linked with various processes including transcriptional control [9] and telomeric maintenance [10,11].

The consensus sequence of thrombin-binding aptamers describes the protein–DNA interaction parameters, and thus if G-quadruplex structure is vital to protein binding the quadruplex structural parameters are also inherent to this model. Understanding the relationship between biological sequence and structure can aid in the detection of putative G-quadruplex structures within the genome [9], while correlating sequence to binding affinity can be useful to many aspects of

Insight, innovation, integration

Aptamers are ideal for diagnostic and pharmaceutical studies, but gaining knowledge into mechanism and key structural features is essential for novel and diverse future applications. DNA microarrays allow thousands of sequences to be interrogated simultaneously. We have therefore utilized a high density array format to screen key structural features for G quadruplex forming sequences, using the known protein thrombin.
This format rapidly yields a vast amount of data, allowing a detailed model to be built describing key loop–sequence–functionality relationships. The ability to survey the landscape systematically using aptamers of known sequence makes microarray formats highly suited for studying sequence specific protein binding profiles.

‡ These authors contributed equally to this work.

† Electronic supplementary information (ESI) available: Construction of the statistical model and colour version of Fig. 4. See DOI: 10.1039/b814892a

a Manchester Interdisciplinary Biocentre, The University of Manchester, 131 Princess Street, Manchester, UK M1 7DN. E-mail: dbk@manchester.ac.uk
b School of Chemistry, The University of Manchester, Oxford Road, Manchester, UK M13 9PL
c School of Translational Medicine, The University of Manchester, Oxford Road, Manchester, UK M13 9PT
d School of Computer Science, University of Manchester, Kilburn Building, Oxford Road, Manchester, UK M13 9PL

Integr. Biol., 2009, 1, 116–122 | This journal is © The Royal Society of Chemistry 2009
PAPER | www.rsc.org/ibiology | Integrative Biology
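As an illustration of the thrombin-binding consensus sequence quoted in the Introduction (GG, t, TGG, a loop of N2–5, GGT, t, GG), the following Python sketch encodes the motif as a regular expression and enumerates single-base substitutions of the commonly cited 15-mer thrombin-binding aptamer GGTTGGTGTGGTTGG. It is not the authors' analysis pipeline. It assumes that a lowercase t marks a position that is usually T but may vary, and that N2–5 is a loop of two to five arbitrary bases; the names `CONSENSUS`, `TBA_15MER`, `matches_consensus` and `single_point_mutants` are hypothetical and do not come from the paper.

```python
import re
from typing import List

# Assumed reading of the consensus (an interpretation, not stated in the paper):
#   lowercase t -> usually T, modelled here as any base [ACGT]
#   N2-5        -> central loop of two to five arbitrary bases
CONSENSUS = re.compile(r"GG[ACGT]TGG[ACGT]{2,5}GGT[ACGT]GG")

# Commonly cited 15-mer thrombin-binding aptamer (not quoted in this section).
TBA_15MER = "GGTTGGTGTGGTTGG"


def matches_consensus(seq: str) -> bool:
    """Return True if the whole sequence fits the assumed consensus pattern."""
    return CONSENSUS.fullmatch(seq.upper()) is not None


def single_point_mutants(seq: str) -> List[str]:
    """Return every sequence that differs from seq by one base substitution."""
    seq = seq.upper()
    return [
        seq[:i] + base + seq[i + 1:]
        for i in range(len(seq))
        for base in "ACGT"
        if base != seq[i]
    ]


if __name__ == "__main__":
    mutants = single_point_mutants(TBA_15MER)
    kept = [m for m in mutants if matches_consensus(m)]
    print(len(mutants), "single-point mutants,", len(kept), "still fit the consensus")
```

Because the guanine tracts are fixed in the pattern, any substitution at a guanine position fails the match, whereas substitutions at the assumed variable positions still match. A pattern match of this kind says nothing about measured binding affinity, which is what the array experiments address.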