2D-dynamic representation of DNA sequences

Dorota Bielińska-Wąż a,*, Timothy Clark b, Piotr Wąż c, Wiesław Nowak a, Ashesh Nandy d

a Instytut Fizyki, Uniwersytet Mikolaja Kopernika, ul. Grudziądzka 5, 87-100 Toruń, Poland
b Computer-Chemie-Centrum and Interdisciplinary Center for Molecular Materials, Friedrich-Alexander-Universität Erlangen-Nürnberg, Nägelsbachstrasse 25, 91052 Erlangen, Germany
c Centrum Astronomii, Uniwersytet Mikolaja Kopernika, ul. Gagarina 11, 87-100 Toruń, Poland
d Environmental Science Programme, Jadavpur University, Calcutta 700 032, India

Received 10 March 2007; in final form 7 May 2007
Available online 18 May 2007

Abstract

A new 'dynamic' 2D-graphical representation of DNA sequences is presented. The model is based on 2D-plots that have been used before and are easy to visualize, but it removes many degeneracies present in the previous approaches. The moments of inertia of the 'dynamic' graphs are proposed as a new kind of descriptor for DNA sequences.

© 2007 Elsevier B.V. All rights reserved.

1. Introduction

In recent years, the rapid growth of sequence data in DNA databases has stimulated the development of methods aimed at the numerical characterization of these data. In particular, the important contributions of Randić et al. should be mentioned; some representative examples are given in Refs. [1–3]. More details may be found in a recent review by Nandy et al. [4]. The first step in this approach is a proper mathematical representation of the DNA sequences. Then, mathematical descriptors that characterize the abstract representation of the sequences are created.

Several approaches to the graphical representation of DNA, from 2D to 6D methods, have been introduced. The original method of plotting DNA sequences as a walk in 2D-space, using four orthogonal directions that represent the four bases, was introduced in Refs. [5–7].
However, such a representation may lead to some parts of the sequence being hidden if the walk is performed back and forth along the same trace. In order to eliminate, or at least minimize, the degeneracy caused by repetitive walks, the angles between the four base vectors were changed to 30° in a more recent 2D-method [8]. However, some accidental degeneracies were also observed in this approach [9]. Later attempts aimed at avoiding degeneracies resulted in other methods of visualizing DNA sequences in 2D-space, for instance horizontal lines [10] or square units called cells [11], introduced instead of the Cartesian coordinate system. Subsequently, multidimensional graphical representations were introduced, for example 3D [12], 4D [13], or 6D [14]. However, all these multidimensional methods are difficult to visualize.

On the basis of these mathematical representations, descriptors have been created that serve as tools for the fast identification and characterization of similarities between sequences. There are two classes of approaches to defining such descriptors: the geometrical [15–17] and the graph-theoretical ones [18–21]. A review of the graph-theoretical representations and descriptors used in medicinal chemistry and in bioinformatics has recently been published by Gonzalez-Diaz et al. [22].

In this Letter, we introduce a new graphical representation of DNA sequences, which we call 2D-dynamic graphs. The sequences are represented as sets of point masses. Points with different masses are distinguishable. The distribution of the points in 2D-space is based on a previous model (Nandy plots) [6]. The new representation

* Corresponding author. E-mail address: dsnake@phys.uni.torun.pl (D. Bielińska-Wąż).
Chemical Physics Letters 442 (2007) 140–144. doi:10.1016/j.cplett.2007.05.050
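The ideas outlined in the Introduction (a 2D walk with four orthogonal directions, the degeneracy of back-and-forth walks, and a point-mass picture with moments of inertia as descriptors) can be illustrated with a minimal Python sketch. It is not the authors' implementation. Three things are assumed here rather than taken from the text above: the assignment of axis directions to the bases in `STEPS`, the rule that the mass of a point equals the number of times the walk visits it, and the evaluation of the moments of inertia about the centre of mass. The function names `walk_points`, `point_masses` and `principal_moments` are hypothetical.

```python
from collections import Counter
from math import sqrt

# ASSUMPTION: one unit step per base along four orthogonal directions.
# The specific base-to-direction assignment is chosen for illustration only.
STEPS = {"A": (-1, 0), "G": (1, 0), "C": (0, 1), "T": (0, -1)}


def walk_points(seq):
    """Return the lattice points reached after each base (start point excluded)."""
    x = y = 0
    points = []
    for base in seq.upper():
        dx, dy = STEPS[base]
        x, y = x + dx, y + dy
        points.append((x, y))
    return points


def point_masses(seq):
    """ASSUMPTION: the mass at a point is the number of times the walk visits it."""
    return Counter(walk_points(seq))


def principal_moments(masses):
    """Principal moments of inertia of 2D point masses about their centre of mass."""
    total = sum(masses.values())
    cx = sum(m * x for (x, _), m in masses.items()) / total
    cy = sum(m * y for (_, y), m in masses.items()) / total
    # Components of the symmetric 2x2 inertia tensor.
    ixx = sum(m * (y - cy) ** 2 for (_, y), m in masses.items())
    iyy = sum(m * (x - cx) ** 2 for (x, _), m in masses.items())
    ixy = -sum(m * (x - cx) * (y - cy) for (x, y), m in masses.items())
    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    mean = (ixx + iyy) / 2
    half_gap = sqrt(((ixx - iyy) / 2) ** 2 + ixy ** 2)
    return mean - half_gap, mean + half_gap


if __name__ == "__main__":
    for seq in ("AG", "AGAG"):
        print(seq, sorted(set(walk_points(seq))), principal_moments(point_masses(seq)))
```

Under these assumptions, the sequences `AG` and `AGAG` trace the same set of points, so a plain plot of the walk cannot tell them apart. Their point masses differ, however, and so do the resulting moments of inertia.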