International Journal of Soft Computing and Engineering (IJSCE)
ISSN: 2231-2307, Volume-3 Issue-3, July 2013
Published By: Blue Eyes Intelligence Engineering & Sciences Publication
Retrieval Number: C1638073313/2013©BEIESP

Design and Development of Abstractness in Graph Mining Technique using Structural Datum

S. P. Victor, M. Antony Sundar Singh

Abstract- Graphs are everywhere, ranging from social networks and mobile call networks to biological networks and the World Wide Web. Mining big graphs leads to many interesting applications, including cyber security, fraud detection, Web search, recommendation, and many more. In this paper we describe a technique for converting a real-time environment into a graph mining pattern. We analyze very large, real-world graphs with billions of nodes and edges. Our findings include digraph structures in the connected component size distribution. In the future we will extend our research to propose a GraphTemplateConverter for any real-time complex entities.

Keywords- Graph mining, Graph pattern, Graph template, Graph network.

I. INTRODUCTION

A graph is a set of nodes, pairs of which might be connected by edges. In a wide array of disciplines, data can be intuitively cast into this format [1]. For example, a computer network consists of routers/computers (nodes) and the links (edges) between them. Social networks consist of individuals and their interconnections (which could be business relationships, kinship, trust, etc.) [2]. Protein interaction networks link proteins which must work together to perform some particular biological function. Graphs are seemingly ubiquitous.
The problems of detecting abnormalities (outliers) in a given graph and of generating synthetic but realistic graphs have received considerable attention recently [3].

Table I: Graph notations (Table of Symbols)

Symbol    Description
N         Number of nodes in the graph
E         Number of edges in the graph
k         Degree of some node
<k>       Average degree of nodes in the graph
CC        Clustering coefficient of the graph
CC(k)     Clustering coefficient of degree-k nodes

Both problems are tightly coupled to the problem of finding the distinguishing characteristics of real-world graphs, that is, the patterns that show up frequently in such graphs and can thus be considered as marks of realism. A good generator will create graphs which match these patterns. Patterns and generators are important for many applications [4]:

Detection of abnormal subgraphs/edges/nodes. Abnormalities should deviate from the normal patterns, so understanding the patterns of naturally occurring graphs is a prerequisite for detecting such outliers [6].

Simulation studies. Algorithms meant for large real-world graphs can be tested on synthetic graphs which look like the original graphs [5].

Realism of samples. We might want to build a small sample graph that is similar to a given large graph. This smaller graph needs to match the patterns of the large graph to be realistic.

Graph compression. Graph patterns represent regularities in the data, which can be used to better compress the data.

Manuscript Received on July, 2013.
Dr. S. P. Victor, HOD, Department of Computer Science, St. Xavier's College, Tirunelveli, Tamilnadu, India. Mobile: 9486041594.
M. Antony Sundar Singh, Research Scholar, Manonmaniam Sundaranar University, Tirunelveli, Tamilnadu, India.

II. PROPOSED METHODOLOGY

Fig-1: Proposed methodology for Graph conversion

As a mathematical construct, a graph consists of two types of elements: nodes and edges.
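To make the node/edge construct and the notation of Table I concrete, the following is a minimal Python sketch that builds a small undirected graph and computes N, E, k, <k>, CC and CC(k). It is an illustration under stated assumptions, not the authors' implementation: the function names and the toy edge list are hypothetical, and CC is assumed to be the average of the local clustering coefficients, since the paper does not fix a definition.

```python
from collections import defaultdict
from itertools import combinations


def build_undirected(edges):
    """Build an adjacency map {node: set of neighbours} from (u, v) pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)


def edge_count(adj):
    """E: every undirected edge appears in two neighbour sets."""
    return sum(len(nbrs) for nbrs in adj.values()) // 2


def average_degree(adj):
    """<k>: mean of the node degrees k."""
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)


def local_clustering(adj, node):
    """Fraction of neighbour pairs of `node` that are themselves linked."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0  # assumed convention for nodes with fewer than 2 neighbours
    linked = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    return 2.0 * linked / (k * (k - 1))


def clustering_coefficient(adj):
    """CC: assumed here to be the average local clustering coefficient."""
    return sum(local_clustering(adj, n) for n in adj) / len(adj)


def clustering_by_degree(adj):
    """CC(k): average local clustering over the nodes of each degree k."""
    by_k = defaultdict(list)
    for n in adj:
        by_k[len(adj[n])].append(local_clustering(adj, n))
    return {k: sum(vals) / len(vals) for k, vals in by_k.items()}


# Hypothetical toy graph: a triangle a-b-c with a pendant node d.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
adj = build_undirected(edges)
print("N =", len(adj))              # N = 4
print("E =", edge_count(adj))       # E = 4
print("<k> =", average_degree(adj)) # <k> = 2.0
print("CC =", clustering_coefficient(adj))
print("CC(k) =", clustering_by_degree(adj))
```

The adjacency-set representation keeps the neighbour lookups in `local_clustering` cheap. For graphs with billions of nodes and edges, as discussed in the abstract, a distributed or on-disk representation would be needed instead.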
In translating a problem to a graph-based representation, the first step is to decide how problem elements will translate to these distinct graph elements. As a general guideline, nodes are used to represent entities in a problem (genes, people, cities, businesses) and edges are used to represent relationships between the entities ('regulates', 'knows', 'exports to', 'sells to'). If these relationships are directional (e.g., the fact that A sells to B doesn't imply that B sells to A), then the result will be a DIRECTED GRAPH; if not, an UNDIRECTED GRAPH will result. This distinction is important, as some graph-theoretic measurements treat directed and undirected graphs differently. Unless specifically stated otherwise, undirected edges are the assumed norm.

In determining the mapping from problem elements to graph elements, it is sometimes necessary to have multiple types of nodes or edges. While many problems can be adequately represented without multiple node or edge types, some disciplines, such as social network analysis, make significant use of 2-mode graphs (i.e., graphs having two different node types).
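The mapping described above can be sketched in Python as follows. This is a hedged illustration, not the proposed GraphTemplateConverter: the `Graph` class, its method names, the node type labels and the 'works at' relation are hypothetical, while the 'sells to' and 'knows' relations come from the text. The 2-mode check follows the definition given here (exactly two different node types).

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    """Hypothetical container for entities (nodes) and relationships (edges)."""
    directed: bool = False                          # undirected is the assumed norm
    node_type: dict = field(default_factory=dict)   # node -> type label
    adj: dict = field(default_factory=dict)         # node -> {neighbour: relation}

    def add_node(self, node, kind="entity"):
        # Keep the first type assigned to a node; later calls do not overwrite it.
        self.node_type.setdefault(node, kind)
        self.adj.setdefault(node, {})

    def add_edge(self, u, v, relation=None):
        self.add_node(u)
        self.add_node(v)
        self.adj[u][v] = relation
        if not self.directed:
            # An undirected relationship holds in both directions.
            self.adj[v][u] = relation

    def has_edge(self, u, v):
        return v in self.adj.get(u, {})

    def is_two_mode(self):
        # 2-mode in the sense used in the text: exactly two node types.
        return len(set(self.node_type.values())) == 2


# Directed: "A sells to B" does not imply "B sells to A".
trade = Graph(directed=True)
trade.add_edge("A", "B", relation="sells to")
print(trade.has_edge("A", "B"), trade.has_edge("B", "A"))  # True False

# Undirected: "knows" is stored symmetrically.
friends = Graph()
friends.add_edge("Ann", "Bob", relation="knows")
print(friends.has_edge("Bob", "Ann"))  # True

# 2-mode: two node types (people and businesses) in one graph.
membership = Graph()
membership.add_node("Ann", kind="person")
membership.add_node("Acme", kind="business")
membership.add_edge("Ann", "Acme", relation="works at")
print(membership.is_two_mode())  # True
```

Storing the relation label on each edge allows multiple edge types in the same structure, and the per-node type label does the same for multiple node types.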