Computational Biology and Chemistry 41 (2012) 35–40

Journal homepage: www.elsevier.com/locate/compbiolchem

Research Article

On topological indices for small RNA graphs

Alexander Churkin a, Idan Gabdank b, Danny Barash a

a Department of Computer Science, Ben-Gurion University, 84105 Beer-Sheva, Israel
b Department of Pathology, Stanford School of Medicine, Stanford, CA 94305-5324, United States

Article info

Article history: Received 31 March 2012; received in revised form 11 October 2012; accepted 12 October 2012

Keywords: RNA secondary structure; RNA graph representation; Laplacian eigenvalues; Small RNA graphs

Abstract

The secondary structure of RNAs can be represented by graphs at various resolutions. While it has been shown that RNA secondary structures can be represented by coarse grain tree-graphs, and that meaningful topological indices can be used to distinguish between various structures, small RNAs need to be represented by full graphs. No meaningful topological index has yet been suggested for the analysis of this type of RNA graph. Recalling that the second eigenvalue of the Laplacian matrix can be used to track topological changes in the case of coarse grain tree-graphs, it is plausible to assume that a topological index such as the Wiener index, which represents all Laplacian eigenvalues, may provide a similar guide for full graphs. However, the Wiener index was originally defined for acyclic graphs. Nevertheless, similarly to cyclic chemical graphs, small RNA graphs can be analyzed using elementary cuts, which enables the calculation of topological indices for small RNAs in an intuitive way. We show how to calculate a structural descriptor that is suitable for cyclic graphs, the Szeged index, for small RNA graphs by elementary cuts.
We discuss potential uses of such a procedure that considers all eigenvalues of the associated Laplacian matrices to quantify the topology of small RNA graphs.

© 2012 Elsevier Ltd. All rights reserved.

Corresponding author. E-mail address: dbarash@cs.bgu.ac.il (D. Barash).

1. Introduction

One of the significant issues in modeling an RNA molecule is how to represent its secondary structure in a simplified yet useful manner. Several approaches have been devised, among which the three major historical ones are the full graph representation, where each nucleotide is a node (Waterman, 1978); a coarse grain tree-graph representation, where each motif is a node (Shapiro, 1988); and a full tree leading to a homeomorphically irreducible tree (Fontana et al., 1993). All of these types of representation have been implemented in the Vienna RNA package (Hofacker et al., 1994), while the first one has been instrumental in the early development of folding prediction algorithms (Smith and Waterman, 1978; Nussinov et al., 1978; Zuker and Stiegler, 1981). This full graph representation, where each nucleotide is a node, is equivalent to a dot–bracket representation in the Vienna RNA package (Hofacker et al., 1994; Hofacker, 2003) and a ct file in mfold (Zuker, 1989, 2003).

In the context of RNA secondary structure, coarse grain tree-graphs have been used in a variety of ways (Shapiro, 1988; Le et al., 1989; Benedetti and Morosetti, 1996; Barash, 2003; Churkin and Barash, 2006; Shu et al., 2006, 2008). They can also be generalized to abstract shapes (Giegerich et al., 2004). In Shapiro (1988) and Le et al. (1989), the coarse grain representation of an RNA secondary structure was suggested, which was later called Shapiro’s representation in the Vienna RNA package. In Benedetti and Morosetti (1996), topological indices were first suggested for use with coarse grain tree-graphs.
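To make the full graph representation concrete, the following is a minimal sketch and not the procedure of this paper: it builds a graph with one node per nucleotide from a dot–bracket string and evaluates the Wiener and Szeged indices directly from their distance-based definitions, rather than by elementary cuts. The function names are hypothetical, and the choice of edges (backbone links between consecutive nucleotides plus base pairs) is an assumption about how the full graph is built.

```python
from collections import deque


def dot_bracket_to_graph(structure):
    """Return adjacency sets of the full graph, one node per nucleotide.

    Assumption: edges are backbone links (i, i+1) plus base pairs.
    """
    n = len(structure)
    adj = [set() for _ in range(n)]
    for i in range(n - 1):  # backbone edges
        adj[i].add(i + 1)
        adj[i + 1].add(i)
    stack = []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced dot-bracket string")
            j = stack.pop()  # base-pair edge between i and its partner j
            adj[i].add(j)
            adj[j].add(i)
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r}")
    if stack:
        raise ValueError("unbalanced dot-bracket string")
    return adj


def bfs_distances(adj, source):
    """Shortest-path distances (in edges) from source to every node."""
    dist = [-1] * len(adj)
    dist[source] = 0
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if dist[v] < 0:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def wiener_index(adj):
    """Sum of shortest-path distances over all unordered node pairs."""
    n = len(adj)
    dist = [bfs_distances(adj, s) for s in range(n)]
    return sum(dist[i][j] for i in range(n) for j in range(i + 1, n))


def szeged_index(adj):
    """Sum over edges (u, v) of n_u * n_v, where n_u counts the nodes
    strictly closer to u than to v, and n_v the reverse."""
    n = len(adj)
    dist = [bfs_distances(adj, s) for s in range(n)]
    total = 0
    for u in range(n):
        for v in adj[u]:
            if u < v:  # visit each undirected edge once
                n_u = sum(1 for w in range(n) if dist[u][w] < dist[v][w])
                n_v = sum(1 for w in range(n) if dist[v][w] < dist[u][w])
                total += n_u * n_v
    return total
```

For an unpaired chain such as "....", the graph is a path (a tree) and the two indices coincide; for a small hairpin such as "(..)", the base pair closes a cycle and the two indices differ.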
In Barash (2003), it was found that the second eigenvalue of the Laplacian matrix can provide a similarity measure for differentiating between various tree-graph topologies. This can be exploited when filtering candidates in the process of deleterious mutation prediction, an approach used in the corresponding prediction software RNAmute (Churkin and Barash, 2006). In Shu et al. (2006), the RDMAS webserver was developed, suggesting several topological indices for estimating mutational deleteriousness. Subsequently, in Shu et al. (2008), a detailed study of topological indices was carried out on a newly suggested coarse grain representation called element-contact graphs. It should be noted that mathematical theorems by Fiedler (1973) and Merris (1987) were shown to be useful for estimating how the coarse grain tree-graph representing an RNA secondary structure is shaped. However, the coarse grain tree-graphs are not informative enough when dealing with small RNAs. For those, and in general for RNA graphs, it was first suggested by Merris in a personal communication to examine the Wiener topological index (Wiener, 1947), which provides information about the complete spectrum of the Laplacian matrix and not only its second eigenvalue. Interestingly, Merris (1989) has shown that the Wiener topological index can be calculated from the complete spectrum of the Laplacian matrix.

1476-9271/$ see front matter. http://dx.doi.org/10.1016/j.compbiolchem.2012.10.004

For more information on the field of