Anomalous Node Detection in Networks with Communities of Different Size

Juan Campos and Jorge Finke¹

Abstract: Based on two simple mechanisms for establishing and removing links, this paper defines an event-driven model for the anomalous node detection problem. The model represents (i) the tendency of regular nodes to connect with similar others (i.e., to establish homophilic relationships); and (ii) the tendency of anomalous nodes to connect to random targets (i.e., to establish random connections across the network). Our approach is motivated by the desire to design scalable strategies for detecting signatures of anomalous behavior, using a formal representation that takes into account the evolution of network properties. In particular, we assume that regular nodes are distributed across two communities (of different size), and propose an algorithm that identifies anomalous nodes based on both geometric and spectral measures. Our focus is on defining the anomalous node detection problem in a mathematical framework and on highlighting key challenges that arise when certain topological properties dominate the problem (i.e., the strength of communities and their size).

I. INTRODUCTION

The lofty aim of network models is to serve as analytical frameworks that capture the dynamic relationships across large interconnected systems. It is of interest to understand how interaction processes explain the formation of structure, i.e., how mechanisms for establishing and removing links influence the evolution of topological properties. Mechanism-based models provide the basis for the design of algorithms that take account of regular patterns in networks.
A common approach to the anomalous node detection problem is to study the evolution of local and global properties, including (i) the proportion of close-knit groups (i.e., subgraphs of $k$ nodes, each with at least $k/2$ neighboring nodes) [1], [2]; and (ii) the formation of communities (i.e., groups of nodes with tight connections within and sparse connections across them) [3], [4]. How to detect close-knit groups of anomalous nodes on networks with different-sized community structures remains an open challenge.

The contribution of this paper is twofold. First, we introduce a model based on two mechanisms, which characterizes how regular nodes impact the size and strength of communities. Second, we propose an anomalous node detection algorithm that combines geometric and spectral network measures. As in [5], our approach aims to effectively attribute detection signatures to patterns resulting from nodes that persistently engage in random link attacks (RLAs) [6]. Unlike the work in [5], the design of our algorithm is based on a representation of interactions underlying the behavior of regular nodes. We take a discrete-event modelling approach and use simulations to give insight into scenarios where detecting anomalous nodes is particularly challenging. Our results suggest that the ability to detect anomalous nodes is highly constrained by the degree to which homophilic relationships impact community strength. The formation of strong communities of similar size facilitates the detection of anomalous nodes.

¹ Both authors are with the Department of Electrical Engineering and Computer Science, Pontificia Universidad Javeriana, Cali, Colombia. juan.campos, finke@ieee.org

II. PRELIMINARIES

A. Notation

Let $G = (G(0), G(1), \ldots)$ represent a sequence of unweighted, undirected networks. Each network $G(t) = (N, A(t))$ is composed of a set of nodes $N = \{1, \ldots, n\}$ and a set of edges $A(t)$.
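As a minimal sketch of this notation, a snapshot $G(t) = (N, A(t))$ can be stored as a fixed node set together with a set of unordered pairs. The helper name `make_snapshot`, the use of `frozenset` for edges, and the example edge lists are illustrative assumptions and are not taken from the paper.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

Edge = FrozenSet[int]                    # an unordered pair {i, j}
Snapshot = Tuple[Set[int], Set[Edge]]    # (N, A(t))


def make_snapshot(n: int, pairs: Iterable[Tuple[int, int]]) -> Snapshot:
    """Build one snapshot (N, A(t)) from an iterable of node pairs (hypothetical helper)."""
    nodes = set(range(1, n + 1))         # N = {1, ..., n}
    edges: Set[Edge] = set()
    for i, j in pairs:
        if i not in nodes or j not in nodes:
            raise ValueError(f"unknown node in pair ({i}, {j})")
        if i == j:
            # The notation excludes self-loops {i, i}.
            raise ValueError(f"self-loop at node {i}")
        edges.add(frozenset((i, j)))     # unweighted and undirected: {i, j} == {j, i}
    return nodes, edges


# A short sequence (G(0), G(1), G(2)); the node set is the same in every
# snapshot and only the edge set A(t) changes. Edge lists are made up.
G: List[Snapshot] = [
    make_snapshot(4, [(1, 2), (2, 3)]),
    make_snapshot(4, [(1, 2), (3, 4)]),
    make_snapshot(4, [(2, 1), (3, 4), (1, 2)]),  # duplicates collapse into one edge
]
```

Storing edges as two-element sets means that listing a pair in either order, or more than once, yields the same edge, which matches the unweighted, undirected setting.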
An element $\{i, j\} \in A(t)$ if and only if node $i$ links to node $j$, and $\{i, i\} \notin A(t)$ for all $i \in N$. Note that the set of nodes $N$ remains constant. It is composed of anomalous nodes (referred to as nodes of type 0) and two types of regular nodes (referred to as nodes of type 1 and 2). The function $g : N \to \{0, 1, 2\}$ defines the type of a node. Let $N_\delta = \{i \in N : g(i) = \delta\}$ be the set of nodes of type $\delta$, and $n_\delta = |N_\delta|$ the size of $N_\delta$. Assume that $n_2 \geq n_1$, so that $N_2$ refers to the majority group whenever there exists a difference in group size.

Let $A_i(t) = \{\{j, j'\} \in A(t) : j = i\}$ be the set of edges that link node $i$ to its neighboring nodes, and let $A_i^c(t)$ denote the complement of $A_i(t)$. Furthermore, let $k_i(t) = |A_i(t)|$ denote the number of neighbors of node $i$, and $k_i^\delta(t) = |\{\{i, j\} \in A_i(t) : g(i) = g(j)\}|$ the number of same-type neighbors of node $i$. Moreover, at any time $t$ let $R_i(t) \subseteq A_i(t)$ be a subset of edges that node $i$ is able to redirect. Consider the following assumption.

A1 Suppose that $|R_i(t)| = |R_i(0)| = |R_j(0)| = r$ for some constant $r \in \mathbb{N}$, and $R_i(t) \cap R_j(t) = \emptyset$ for all $t \geq 0$ and $i, j \in N$ with $i \neq j$.

Assumption A1 requires that all nodes redirect the same number of edges. Moreover, at any time each edge can be redirected by at most one node. Based on assumption A1, Section III describes decision-making mechanisms that encourage regular nodes to connect with other nodes of the same type, contributing to the formation of communities. In contrast, the behavior of anomalous nodes is characterized by weak degrees of membership to any particular community, resulting from the following generic behavior.

2017 American Control Conference, Sheraton Seattle Hotel, May 24–26, 2017, Seattle, USA. 978-1-5090-5992-8/$31.00 ©2017 AACC

Definition 1: Random link attacks (RLAs) are a collaborative action by a close-knit group of anomalous nodes, which target randomly selected regular nodes, with no particular preference for any type of node [6]. An anomalous