Improved Kernelization and Linear Time FPT for Convex Recoloring of Trees

Hans L. Bodlaender (1), Michael R. Fellows (2), Michael A. Langston (3), Mark A. Ragan (4), Frances A. Rosamond (2), and Mark Weyer (5)

(1) Department of Information and Computing Sciences, Utrecht University, Utrecht, the Netherlands, hansb@cs.uu.nl
(2) Office of the DVC (Research), University of Newcastle, Callaghan NSW 2308, Australia, {michael.fellows,frances.rosamond}@newcastle.edu.au
(3) Department of Computer Science, University of Tennessee, Knoxville TN 37996-3450 and Computer Science and Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831-6164 U.S.A., langston@cs.utk.edu
(4) Institute for Molecular Bioscience, University of Queensland, Brisbane, QLD 4072 Australia, m.ragan@imb.uq.edu.au
(5) Institut für Informatik, Humboldt-Universität zu Berlin, Berlin, Germany, mark.weyer@math.uni-freiburg.de

Abstract. The Convex Recoloring (CR) problem measures how far a tree of characters differs from exhibiting a “perfect phylogeny”, and has basic applications in phylogenetics, and some further applications in the analysis of gene expression data. For input consisting of a vertex-colored tree $T$, the problem is to determine whether recoloring at most $k$ vertices can achieve a convex coloring, meaning by this a coloring where each color class induces a connected subtree. The problem was introduced by Moran and Snir, who showed that CR is NP-hard, and described a search-tree based FPT algorithm with a running time of $O(k (k/\log k)^k n^4)$. It has recently been shown that the problem belongs to the subclass Poly(k) FPT, with a kernelization bound of $O(k^6)$. We substantially improve on this, showing that the problem has a kernel of size $O(k^2)$. We also give a short proof that the problem can be solved in linear time for fixed $k$.

Topics: Algorithms and Complexity, Bioinformatics

1 Introduction

There are two motivations for the work reported in this paper.
The first is to devise better algorithms for an elegant combinatorial problem with applications in bioinformatics. We achieve significant improvements on the best previous results (some obtained by the same set of authors).

This research has been partially supported by the Australian Centre for Bioinformatics, by the U.S. National Science Foundation under grant CCR–0311500, by the U.S. National Institutes of Health under grants 1-P01-DA-015027-01, 5-U01-AA-013512-02 and 1-R01-MH-074460-01, and by the European Commission under the Sixth Framework Programme.
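
To make the notion of a convex coloring from the abstract concrete, the following minimal sketch checks whether every color class of a colored tree induces a connected subtree. It is an illustration only and not the paper's algorithm: the function name `is_convex`, the adjacency-dictionary representation, and the assumption that every vertex carries a color are all hypothetical choices made for this example.

```python
from collections import defaultdict, deque


def is_convex(adj, color):
    """Return True if each color class induces a connected subtree.

    adj   -- dict mapping each vertex to a list of its neighbors (a tree)
    color -- dict mapping each vertex to its color (assumed total)
    """
    # Group the vertices by color.
    classes = defaultdict(set)
    for vertex, c in color.items():
        classes[c].add(vertex)

    for members in classes.values():
        # Breadth-first search restricted to vertices of this color.
        start = next(iter(members))
        seen = {start}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w in members and w not in seen:
                    seen.add(w)
                    queue.append(w)
        # The class is connected only if the search reached all of it.
        if seen != members:
            return False
    return True


# Hypothetical example: the path a - b - c - d.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
convex = is_convex(path, {"a": 1, "b": 1, "c": 2, "d": 2})
not_convex = is_convex(path, {"a": 1, "b": 2, "c": 1, "d": 2})
```

In the second coloring, color 1 occupies vertices a and c, which are separated by b, so that color class is disconnected and the coloring is not convex. Recoloring the single vertex b with color 1 would make it convex, which is the kind of modification the parameter $k$ counts.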