Intelligenza Artificiale 14 (2020) 103–114
DOI 10.3233/IA-190038
IOS Press
ActorNode2Vec: An Actor-based solution for Node Embedding over large networks
Gianfranco Lombardo∗ and Agostino Poggi
Department of Engineering and Architecture, University of Parma, Italy
Abstract. The application of Machine Learning techniques over networks, such as prediction tasks over nodes and edges, is
becoming increasingly crucial in the analysis of complex systems in a wide range of research fields. One of the enabling
technologies in this respect is Node Embedding, which makes it possible to learn features automatically over the network. Among
the different approaches proposed in the literature, the most promising are DeepWalk and Node2Vec, where the embedding is
computed by combining random walks and neural language models. However, these techniques are characteristically limited by
their memory requirements and time complexity. In this paper, we propose a distributed and scalable solution, named
ActorNode2Vec, that retains the main advantages of Node2Vec and overcomes its limitations by adopting the actor
model to distribute the computational load. We demonstrate the efficacy of this approach on a large network by analyzing
the sensitivity to the walk length and number of walks parameters, and we compare it with DeepWalk and an Apache
Spark distributed implementation of Node2Vec. Results show that ActorNode2Vec drastically reduces computational times
without losing embedding quality, while also overcoming memory issues.
Keywords: Network science, embedding, node embedding, Node2vec, actodes, distributed systems, data mining, complex
systems, actor model
∗Corresponding author: Gianfranco Lombardo, Department of Engineering and Architecture, University of Parma, Italy. E-mail: gianfranco.lombardo@unipr.it.
1. Introduction
In a wide range of disciplines, real systems are often
characterized by heterogeneous or similar entities that
interact with each other in complex ways. These so-called
complex systems are pervasive in several research fields,
such as sociology, biology, genetics, physics, computer
science and finance, and their analysis for knowledge
discovery is still challenging. Complex system analysis
often involves the use of graphs (or networks) to model
the behavior of the system, with the basic idea of
representing entities as nodes and their interactions and
dynamics as (un)directed edges. For decades, the study of
graph data was limited to analyzing the network topology
with structural metrics capable of extracting connectivity
patterns among the system components. More recently, with
the progress of Machine Learning techniques, the idea has
emerged of also exploiting this kind of structure to
perform prediction tasks or clustering.
For example, in [1] the authors model the interactions
between proteins as a network, with the aim of
automatically predicting the correct label for each
protein, describing its functionalities. In [2] the authors
use a temporal network to model the US stock market in
order to discover correlations among the dynamics of
clusters of stocks and to predict economic crises. In [3]
the authors analyze a social group of patients to extract
new knowledge about their emotional state and the temporal
pattern of their disease by modeling them as two attributed
networks: an interaction network and a friendship network.
However, applying Machine Learning directly to graph data
is made difficult by the necessary manual feature
1724-8035/20/$35.00 © 2020 – IOS Press and the authors. All rights reserved