On 3-D Graphical Representation of Proteomics Maps and Their Numerical Characterization

Milan Randić,* Jure Zupan, and Marjana Novič
National Chemistry Institute of Slovenia, Ljubljana, Hajdrihova 19, Slovenia
Received January 3, 2001

We consider numerical characterization of proteomics maps by representing a map as a three-dimensional graphical object based on the x, y coordinates of the spots and using their relative abundance as the z coordinate. In our representation the protein spots are first ordered by their relative abundance and labeled accordingly. In the next step a 3-D path is constructed connecting spots having adjacent labels. Finally a matrix is constructed by assigning to each pair of labels (i, j) a matrix element whose numerical value is based on the quotient of the Euclidean distance and the distance along the 3-D zigzag between the two points. The approach is illustrated on a fragment of a proteomics map and compared with the 2-D graphical representation of proteomics maps.

INTRODUCTION

In the preceding paper (ref 1), a novel approach to the analysis of proteomics maps was outlined in which a proteomics map is “transformed” into a geometrical pattern of line segments obtained by first ordering spots by their abundance and then connecting spots with adjacent labels. The result is a rather complex 2-D zigzag path that crosses itself several times and that has been referred to as the map “fingerprint”. It appears that the fingerprint pattern is characteristic for a map in the sense that different maps are expected to yield distinct fingerprint patterns. An important advantage of this novel graphical view of proteomics maps is that the zigzag representation is amenable to rigorous mathematical analysis.
Randić, Kleiner, and DeAlba (ref 2) have developed an approach in chemical graph theory (ref 3) that offers numerical characterization of molecular skeletons and of mathematical curves embedded in a space, based on a set of structural invariants derived from suitably constructed matrices associated with the skeletons or curves. In this paper we generalize the initial characterization of proteomics maps based on 2-D fingerprint patterns by considering a representation of proteomics maps in 3-D, where the third coordinate indicates relative abundance. The 2-D representation of proteomics maps considers the relative abundance of protein spots only indirectly, via the ordering of spots and the assignment of labels. Now, instead of a semiqualitative representation of proteomics maps, we consider a fully quantitative representation in which the numerical values of relative abundance are taken into account.

ON 3-D REPRESENTATION OF A MAP

Currently, proteomics maps that are reported as experimental gel photographs are often reproduced by computer software as “bubble” diagrams in which protein spots are represented by circles of different size. In the preceding paper we illustrated one such bubble map and listed the (x, y) coordinates and abundance of the 20 most abundant proteins. In Figure 1 we show a 3-D zigzag path connecting these 20 most intensive spots. The zigzag path descends from the maximal abundance value at about 144.4 70 to the minimal value of 72.2. Projection of the zigzag curve onto the x, y plane gives the map fingerprint that was illustrated in ref 1 and was the basis for the 2-D representation of the proteomics map. The problem to consider is how one can arrive at a quantitative characterization of maps, given either as bubble diagrams or as 3-D zigzag paths, that may facilitate comparison of different maps and even associate some numerical characterization with such maps.
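The construction summarized in the abstract (order the spots by abundance, connect spots with adjacent labels in 3-D, then form quotients of the Euclidean distance and the distance along the zigzag) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names `zigzag_path` and `quotient_matrix` and the three sample spots are hypothetical, the raw abundance is used as the z coordinate without any rescaling, and a zero diagonal is assumed as the convention for i = j.

```python
import math


def zigzag_path(spots):
    """Return 3-D points (x, y, abundance) ordered by decreasing abundance.

    `spots` is a list of (x, y, abundance) tuples. The position in the
    returned list plays the role of the spot label (0 = most abundant).
    """
    ordered = sorted(spots, key=lambda s: s[2], reverse=True)
    return [(float(x), float(y), float(a)) for x, y, a in ordered]


def quotient_matrix(points):
    """Matrix of (Euclidean distance) / (distance along the zigzag path).

    Assumptions: the diagonal is set to zero, and a pair of coincident
    points (zero path length) is also given the value zero.
    """
    n = len(points)
    # Cumulative length of the zigzag from the first point to each point.
    cumulative = [0.0]
    for p, q in zip(points, points[1:]):
        cumulative.append(cumulative[-1] + math.dist(p, q))

    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            along_path = cumulative[j] - cumulative[i]
            if along_path > 0.0:
                value = math.dist(points[i], points[j]) / along_path
            else:
                value = 0.0
            matrix[i][j] = matrix[j][i] = value
    return matrix


# Hypothetical spots (x, y, abundance); these are NOT data from the paper.
sample_spots = [(3, 0, 6), (0, 0, 2), (0, 0, 10)]
path = zigzag_path(sample_spots)
dd = quotient_matrix(path)
```

Entries for spots with adjacent labels equal 1, because the straight segment between them is the path itself; every other off-diagonal entry is at most 1, and it becomes smaller the more the zigzag folds between the two spots.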
We decided to expand on the idea of fingerprint patterns recently proposed for 2-D graphical representations of proteomics maps by considering the abundance of protein spots as the third coordinate in a 3-D space.

NUMERICAL CHARACTERIZATION OF 3-D CURVE

One can arrive at a numerical characterization of a curve, a chemical structure, or any object that has a well-defined periphery and a fixed geometry, or that is embedded in a space (or even a 2-D plane), by constructing the so-called D/D matrix (refs 2, 4-6). The D/D matrix combines information on the distances between the points that characterize the object considered with information on their adjacency. The element (i, j) of the D/D matrix, corresponding to two points, is obtained as the quotient of the Euclidean distance between the points and the distance measured along the path connecting them. In the case of molecular graphs each edge of a graph (or a chain) contributes one unit of length, so the distance along the path is simply the number of edges between the two points. Here, instead of segments of unit length, we have segments of variable length. In Table 1 we show the Euclidean distances, as measured in 3-D space, for the first 10 most abundant protein spots. From this information one can construct the D/D matrix by considering quotients of the corresponding distances through the space

* Corresponding author fax: (515)292-8629; e-mail: milan.randic@ki.si. Current address: 3225 Kingman Rd., Ames, IA 50311.
J. Chem. Inf. Comput. Sci. 2001, 41, 1339-1344. 10.1021/ci0001684 CCC: $20.00 © 2001 American Chemical Society. Published on Web 07/14/2001