GraMMy: Graph representation learning based on micro–macro analysis

Sucheta Dawn*, Monidipa Das, Sanghamitra Bandyopadhyay
Machine Intelligence Unit, Indian Statistical Institute, Kolkata, India

Article history: Received 23 May 2021; Revised 25 December 2021; Accepted 12 July 2022; Available online 16 July 2022. Communicated by Zidong Wang.

Keywords: Graph neural network; Micro–macro analysis; Locality sensitive hashing; Context-based learning

Abstract: Graph Neural Networks (GNNs) are robust variants of deep network models, typically designed to learn from graph-structured data. Despite the recent advancement of GNNs, the basic message-passing scheme of learning often holds these models back from effectively capturing the influence of nodes in the higher-order neighbourhood. Further, the state-of-the-art approaches mostly ignore the contextual significance of the paths through which the message/information propagates to a node. To deal with these two issues, we propose GraMMy, a novel framework for hierarchical semantics-driven graph representation learning based on Micro–Macro analysis. The key idea is to study the graph structure at different levels of abstraction, which not only allows flexible flow of information from both local and higher-order neighbours but also more concretely captures how information travels within the various hierarchical structures of the graph. We incorporate the knowledge gained from micro- and macro-level semantics into the embedding of a node and use this to perform graph classification. Experiments on four bioinformatics and two social datasets exhibit the superiority of GraMMy over state-of-the-art GNN-based graph classifiers.

© 2022 Elsevier B.V. All rights reserved.

1. Introduction

A graph is a pervasive structure used to represent complex systems in which both the entities and their interconnections are equally important [8].
Real-life systems, e.g., social networks, biological networks, recommender systems, etc., are better modeled as graph structures, since information about the individual entities alone is not enough to understand the whole system [31,34]; the rich information about their collective activities must also be captured. Thus, learning Euclidean representations of nodes and graphs for solving machine learning tasks on graphs has become a fascinating area of research in recent years. Graph Neural Networks (GNNs) use deep learning techniques to serve this purpose and have proved extremely beneficial in many applications such as recognition, classification, clustering, and prediction [25,19,10].

Related works and limitations: In the GNN literature, most approaches follow a broadly similar "message passing" scheme, in which a GNN layer iteratively computes the Euclidean representation of a node by aggregating the neighbours' features and combining the result with the node's existing embedding (randomly initialized). Hence, the choice of the two functions, Aggregate and Combine, turns out to be crucial for this approach. We discuss some of the existing models from the GNN literature here. The GNN approach proposed by Scarselli et al. [24] is one of the earliest works in this domain. It recursively updates node latent representations by exchanging information with the neighbouring nodes until equilibrium is reached; the recurrent function is chosen to be a contraction mapping to ensure convergence. The Gated Graph Neural Network (GGNN) [2] uses a gated recurrent unit as the recurrent function and back-propagation through time (BPTT) for parameter learning. The approach does not require any condition on the parameters to converge and thus reduces the number of steps. However, these GNN models often find it difficult to work on larger graphs and may suffer from stability issues.
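To make the Aggregate/Combine scheme concrete, the following is a minimal sketch of one message-passing step in numpy. It is only illustrative, not the paper's method: we assume a dense adjacency matrix, choose mean pooling as Aggregate, and a shared linear map plus ReLU as Combine; all variable names are ours.

```python
import numpy as np

def message_passing_layer(adj, h, w_self, w_agg):
    """One illustrative message-passing step.
    Aggregate: mean of neighbours' features.
    Combine: linear maps on self and aggregated features, then ReLU."""
    deg = adj.sum(axis=1, keepdims=True)
    deg[deg == 0] = 1.0                      # guard against isolated nodes
    agg = (adj @ h) / deg                    # Aggregate: neighbour mean
    return np.maximum(0.0, h @ w_self + agg @ w_agg)  # Combine

rng = np.random.default_rng(0)
adj = np.array([[0., 1., 0.],                # toy path graph 0-1-2
                [1., 0., 1.],
                [0., 1., 0.]])
h = rng.standard_normal((3, 4))              # randomly initialized embeddings
w_self = rng.standard_normal((4, 4))
w_agg = rng.standard_normal((4, 4))
h_next = message_passing_layer(adj, h, w_self, w_agg)
print(h_next.shape)                          # (3, 4): one embedding per node
```

Stacking k such layers lets information from the k-hop neighbourhood reach a node, which is exactly why deeper stacks are needed to capture higher-order influence.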
The recently introduced Stochastic Steady-State Embedding (SSE) approach [4] uses a recurrent function that takes a weighted average of the states from previous steps and a new state to ensure the stability of the algorithm. The GraphSage model [11] overcomes the scalability issue by proposing a batch-training algorithm that samples a fixed-size neighbourhood of each node to aggregate information. Among the various GNN models, the Graph Isomorphism Network (GIN) [32] is found to have the maximal representational power among all message-passing-based models. GIN achieves this by imposing a constraint on the functions used in the model: Aggregate and Combine must be injective. As claimed in [32], GIN and the Weisfeiler-Lehman test of graph isomorphism are equally powerful in graph classification tasks.

https://doi.org/10.1016/j.neucom.2022.07.013
* Corresponding author. E-mail addresses: suchetad_r@isical.ac.in (S. Dawn), monidipa_t@isical.ac.in (M. Das), sanghami@isical.ac.in (S. Bandyopadhyay).
Neurocomputing 506 (2022) 84–95
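The injectivity constraint behind GIN can be sketched in a few lines of numpy: sum aggregation (injective over multisets of node features, unlike mean or max) followed by an MLP. This is a simplified illustration under our own assumptions; the `mlp` weights and `eps` value are arbitrary placeholders, not GIN's trained parameters.

```python
import numpy as np

def gin_layer(adj, h, mlp, eps=0.0):
    """Sketch of the GIN update h' = MLP((1 + eps) * h + sum_{u in N(v)} h_u).
    The adjacency product adj @ h computes the neighbour-feature sum."""
    return mlp((1.0 + eps) * h + adj @ h)

# Toy usage with a one-hidden-layer MLP (weights chosen arbitrarily).
rng = np.random.default_rng(1)
w1 = rng.standard_normal((4, 8))
w2 = rng.standard_normal((8, 4))
mlp = lambda x: np.maximum(0.0, x @ w1) @ w2
adj = np.array([[0., 1., 1.],                # star graph: node 0 linked to 1 and 2
                [1., 0., 0.],
                [1., 0., 0.]])
h = rng.standard_normal((3, 4))
h_next = gin_layer(adj, h, mlp, eps=0.1)
print(h_next.shape)                          # (3, 4)
```

Using a sum rather than a mean is the design choice that makes the aggregation distinguish, e.g., two neighbours with feature x from one neighbour with feature x, matching the discriminative power of the Weisfeiler-Lehman test.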