ORIGINAL PAPER

Genetic distance sampling: a novel sampling method for obtaining core collections using genetic distances with an application to cultivated lettuce

J. Jansen · Th. van Hintum

Received: 3 January 2006 / Accepted: 12 October 2006 / Published online: 16 December 2006
© Springer-Verlag 2006

Abstract This paper introduces a novel sampling method for obtaining core collections, entitled genetic distance sampling. The method incorporates information about distances between individual accessions into a random sampling procedure. A basic feature of the method is that larger samples are obtained automatically if accessions are further apart, and smaller samples if accessions are closer together. Genetic distance sampling can be used in conjunction with predefined stratifications of the accessions. Sample sizes are determined automatically; they depend on the distances between accessions within strata. The method is applied to the collection of cultivated lettuce of the Centre for Genetic Resources, the Netherlands. In this paper, genetic distances between accessions are obtained using AFLP marker data. However, genetic distance sampling can be applied using any measure of genetic distance between accessions. Some properties of genetic distance sampling are discussed.

Introduction

Gene banks have been founded with the aim of conserving the genetic diversity of crop species. This diversity forms the raw material of plant breeding. If possible, genetic diversity, also referred to as germplasm, is conserved in the form of accessions: batches of seed sampled from wild populations, traditional landraces, modern cultivars, genetic stock or other research material. Many gene banks currently face problems caused by the large size of their collections and the resulting costs of maintaining them. This may endanger the long-term conservation of the collections.
In addition, excessive collection sizes may hinder access by users of genetic diversity, such as plant breeders (van Hintum et al. 2000). The concept of core collections was introduced by Frankel (1984). A core collection is a collection of limited size that aims to represent the genetic diversity (or spectrum) of the whole collection (Brown 1995). From this definition it follows that a core collection should exclude not only identical accessions but also similar (or near-identical) accessions.

Communicated by A. Charcosset.

J. Jansen (corresponding author)
Biometris, Wageningen University and Research Centre, P.O. Box 100, 6700 AC Wageningen, The Netherlands
e-mail: johannes.jansen@wur.nl

Th. van Hintum
Centre for Genetic Resources, the Netherlands (CGN), Wageningen University and Research Centre, P.O. Box 16, 6700 AA Wageningen, The Netherlands

Theor Appl Genet (2007) 114:421–428
DOI 10.1007/s00122-006-0433-9

Several methods have been introduced for sampling accessions from a gene bank collection to form a core collection. These methods include simple random sampling and stratified random sampling, but also more sophisticated methods. Schoen and Brown (1993, 1995) describe a method (referred to as the M strategy) by which entries of the core collection are selected by minimizing the overall probability that an allele present in the gene bank collection is not retained in the core collection. The computer program MSTRAT