This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

IEEE TRANSACTIONS ON COMPUTATIONAL SOCIAL SYSTEMS

Finding Emergent Patterns of Behaviors in Dynamic Heterogeneous Social Networks

Benjamin W. K. Hung, Anura P. Jayasumana, Senior Member, IEEE, and Vidarshana W. Bandara

Abstract—The search in graph databases for individuals or entities undertaking latent or emergent behaviors has applicability in the areas of homeland security, consumer analytics, behavioral health, and cybersecurity. In this setting, even partial matches to hypothesized indicators are worthy of further investigation, and analysts in these domains aim to identify and maintain awareness of entities that either fully or partially match the queried attributes over time. We provide a comprehensive version of a graph pattern matching technique called Investigative Search for Graph Trajectories (INSiGHT) to find emergent patterns of behaviors in networks and tailor the application to detecting radicalization in the homeland security domain. To enable analysts’ accounting of recurring behavioral indicators and the recency of behaviors as the imminence of a threat, we provide parameterized methods to score multiple occurrences of indicators and to dampen the significance of indicators over time, respectively. Additionally, we provide an indicator categorization scheme and a match filtering technique to ensure that partial matches to the most salient indicators are identified while reducing the number of false positives. Furthermore, since individuals may be radicalized in small groups or be involved in collective terrorist plots, we introduce a non-combinatorial neighborhood matching technique that enables analysts to use INSiGHT to identify potential query matches from clusters of individuals who may be operating in conspiracies.
We demonstrate the performance of our approach using a synthetic radicalization data set and a large, real-world data set of the BlogCatalog social network.

Index Terms—Graph pattern matching, graph streams, investigative graph search, pattern matching trajectories.

Manuscript received January 15, 2019; revised June 9, 2019 and July 29, 2019; accepted August 11, 2019. This work was supported in part by the U.S. Department of Justice, Office of Justice Programs/National Institute of Justice, under Award 2017-ZA-CX-0002. (Corresponding author: Benjamin W. K. Hung.)
B. W. K. Hung and A. P. Jayasumana are with the Department of Electrical and Computer Engineering, Colorado State University, Fort Collins, CO 80525 USA (e-mail: benjamin.hung@colostate.edu; anura.jayasumana@colostate.edu).
V. W. Bandara is with JDA Software Group, Inc., Irving, TX 75062 USA (e-mail: vida.bandara@jda.com).
Digital Object Identifier 10.1109/TCSS.2019.2938787

I. INTRODUCTION

Search in graph databases for individuals with specific types of connections or attributes is an increasingly rich research area. In particular, pattern matching in graphs has been studied for use in social search and recommender systems in [1], [6], [11], [13], [26], [30], [31], [35], and [40]. However, there are several shortcomings in current approaches when applied to the search for individuals or entities undertaking latent behaviors, which are hidden or emergent activities exhibited by an entity [14]. This kind of search problem is particularly appropriate for law enforcement and intelligence analysis. In this setting, even partial matches to hypothesized indicators are worthy of further investigation. Moreover, the aim is to identify and maintain awareness of entities that either fully or partially match the queried attributes over time. As the underlying entity-level data are dynamically updated with behaviors and attributes, the goal is to find all those with the emergent pattern of behaviors.

Emergent pattern detection is a process by which analysts discover the occurrence of, and logically connect, an entity’s set of activities over time. Analysts identify a hypothesized pattern of behavior and subsequently employ technological means to detect characteristic behaviors that match the pattern in order to provide an early warning. Emergent pattern detection is important in domains such as homeland security, consumer analytics, behavioral health, and cybersecurity because of the compelling interest in detecting the presence of an individual’s latent behavior using time-stamped indicator data. To prevent future terrorist attacks, law enforcement agencies repeatedly assess a large number of individuals for their risk of violence as they progress along a dynamic and phase-based radicalization process and exhibit indicative behaviors or psychological states [3], [25], [27], [28]. Similarly, in consumer analytics, businesses are interested in an individual’s online activities and purchases over time to track his or her place in the customer journey and determine the potential for future purchases [9], [10], [42]. In behavioral health, family members and caregivers are interested in identifying those who may be exhibiting indicators of suicide risk over time [24], [34], [36]. Lastly, in cybersecurity, organizations continually seek to prevent insider threats by detecting risk potential using performance-related and technical indicators recorded over time [4], [8].

In summary, emergent pattern detection using longitudinal data on characteristics and activities is applicable in numerous contexts. However, this challenge is not yet adequately addressed by extant graph pattern matching approaches. Therefore, we formulate a solution approach called investigative graph search [18] that enables the search for and prioritization of entities of interest that over time exhibit part or all of a pattern of attributes or connections.

2329-924X © 2019 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

A. Basic Investigative Graph Search Problem

We start with a graph model consisting of nodes and edges. The node types, constrained by a discrete number of node