October 3, 2007 17:37 Proceedings Trim Size: 9.75in x 6.5in apbc057a

GENOME HALVING WITH DOUBLE CUT AND JOIN

ROBERT WARREN AND DAVID SANKOFF
University of Ottawa

The genome halving problem, previously solved by El-Mabrouk for inversions and reciprocal translocations, is here solved in a more general context allowing transpositions and block interchange as well, for genomes including multiple linear and circular chromosomes. We apply this to several data sets and compare the results to the previous algorithm.

1. Introduction

In this paper we discuss a generalization of the genome halving process studied by El-Mabrouk [3]. Before stating and solving the problem formally in the ensuing sections, we first give some motivation for the generalization.

Models of genome rearrangement processes have permitted different repertoires of operations. Certainly, realistic models must account for inversion. Likewise, reciprocal translocations, Robertsonian translocations and other processes of chromosome fusion and fission, all of which involve transferring an entire telomeric (i.e., suffix or prefix) region of at least one chromosome, are widespread across all eukaryotic domains. Other movements of chromosomal fragments, usually not involving telomeres, are widely attested, and grouped together under the label of transpositions. They are produced by a variety of processes, such as gene duplication followed by the loss of the original copy, or retrotransposition, or recombination errors.

Of the three true movement rearrangements,[a] inversion, translocation and transposition, only the first two, separately or in combination, have proved very amenable to mathematical modeling, as exemplified by the Hannenhalli-Pevzner formula for the edit distance between two genomes, i.e., the minimum number of operations required to transform one genome into another, and the efficient algorithm for producing such a series of operations.
No formula or efficient algorithm exists for transposition, either by itself or in combination with the other two operations.

[a] Duplications of genes or of chromosomal segments, as well as deletions and insertions, are often considered as aspects of genome rearrangement, but they are not really of the same biological nature as the movements inherent in inversion, translocation and transposition, and mathematical models of rearrangement are not easily extended to encompass them.

Recently, Yancopoulos et al. [6] introduced the “double cut and join” (DCJ) operation as the basis for generating all the movement rearrangements. This allowed for the inclusion of transposition with inversion and translocation in a single model