Paper SAS3353-2019

Introducing Pattern Matching for Graph Queries in SAS® Viya® 3.4

Matthew Galati, Steve Harenberg, and Rob Pratt, SAS Institute Inc.

ABSTRACT

SAS® Visual Data Mining and Machine Learning 8.3 in SAS® Viya® 3.4 includes a new patternMatch action, which you can use to execute graph queries that search for copies of a query graph within a larger graph, with the option of respecting node or link attributes (or both). This feature is also available via the PATTERNMATCH statement in the NETWORK procedure. This paper presents examples of pattern matching in social network and anti-money laundering applications. It also provides a functional comparison to Neo4j's query language, Cypher, and computational comparisons to both iGraph and Neo4j.

INTRODUCTION

SAS Visual Data Mining and Machine Learning 8.3 in SAS Viya 3.4 includes the network action set and corresponding NETWORK procedure, which contain a number of graph theory and network analysis algorithms that can augment data mining and machine learning approaches. In many practical applications of data mining and machine learning models, pairwise interaction between the entities of interest in the model often plays an important role. For example, when you are modeling churn in a telecommunications network to support a retention campaign, the influence of individual customers on other customers (such as friends and acquaintances that they regularly interact with) might contribute to the propensity of the other customers to churn. You could likewise imagine a customer being able to influence the propensity of his or her acquaintances to acquire new products.

Social networks such as Facebook and Twitter are obvious examples of networks that represent such interactions between individuals. Networks also appear explicitly and implicitly in many other application contexts.
Networks are often constructed from certain relationships that are based on natural co-occurrence; examples are relationships among researchers who coauthor articles, actors who appear in the same movie, words or topics that occur in the same document, items that appear together in a shopping basket, terrorism suspects who travel together or are seen in the same location, and so on. In these types of relationships, the strength or frequency of each interaction is modeled as a weight on the corresponding link of the resulting network.

To support the myriad ways in which networks appear in data mining, the network action set makes no assumptions about the context or application from which the network arises. It provides a number of network analysis algorithms that take an abstract graph or network as input, help explain network structure, and compute important network measures. Depending on the application, this type of network analysis can stand on its own and provide independent value, or it can support machine learning models, for example by providing additional features that are derived from network measures such as node centrality.

This paper uses the NETWORK procedure in the presentation of examples. For more information about the NETWORK procedure, see SAS Visual Data Mining and Machine Learning: The NETWORK Procedure. For more information about the network action set, see SAS Visual Data Mining and Machine Learning: Programming Guide. The general interface for using the network action set is the same for all languages that SAS Viya supports: CASL, Python, Java, Lua, and R. For more information about how SAS Viya supports these languages, see An Introduction to SAS Viya Programming. For the remainder of this paper, the authors refer to the network analytics package as Network, independent of the chosen interface language.
PATTERN MATCHING

Given two graphs, G (main) and Q (query), subgraph isomorphism is the problem of finding all subgraphs Q′ of G that are isomorphic to Q (that is, that have the same topology as graph Q). Pattern matching addresses the analogous problem in the presence of node and link attributes. It is the problem of finding all subgraphs Q′ of G isomorphic to graph Q such that all node and link attributes defined in Q are preserved in Q′ under the isomorphism map.¹
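
To make the definition concrete, the following is a minimal brute-force sketch in plain Python. It is not the algorithm that Network uses, and it does not use the patternMatch action; the function names (`attrs_match`, `pattern_match`), the attribute names (`type`, `kind`), and the toy data are all hypothetical. The sketch assumes undirected graphs without self-loops, and it treats a match as a one-to-one mapping of query nodes to main-graph nodes in which every query link maps to a main-graph link and every attribute defined in the query has the same value in the main graph.

```python
from itertools import permutations


def attrs_match(required, actual):
    """Return True if every attribute defined in the query has the same value in the main graph."""
    return all(actual.get(key) == value for key, value in required.items())


def pattern_match(g_nodes, g_links, q_nodes, q_links):
    """Brute-force search for attribute-preserving mappings of Q into G.

    g_nodes / q_nodes: dict mapping node -> dict of attributes
    g_links / q_links: dict mapping frozenset({u, v}) -> dict of attributes
    (undirected links, no self-loops; an assumption of this sketch)
    """
    q_order = list(q_nodes)
    matches = []
    # Try every assignment of distinct main-graph nodes to the query nodes.
    for candidate in permutations(g_nodes, len(q_order)):
        mapping = dict(zip(q_order, candidate))
        # Node attributes defined in Q must be preserved.
        if not all(attrs_match(q_nodes[q], g_nodes[mapping[q]]) for q in q_order):
            continue
        ok = True
        for link, required in q_links.items():
            u, v = tuple(link)
            image = frozenset({mapping[u], mapping[v]})
            # Each query link must exist in G with matching attributes.
            if image not in g_links or not attrs_match(required, g_links[image]):
                ok = False
                break
        if ok:
            matches.append(mapping)
    return matches


# Hypothetical main graph G: two people and two accounts.
g_nodes = {
    "A": {"type": "person"},
    "B": {"type": "person"},
    "C": {"type": "account"},
    "D": {"type": "account"},
}
g_links = {
    frozenset({"A", "B"}): {"kind": "knows"},
    frozenset({"A", "C"}): {"kind": "owns"},
    frozenset({"B", "C"}): {"kind": "owns"},
    frozenset({"B", "D"}): {"kind": "owns"},
}

# Hypothetical query graph Q: two people who own the same account.
q_nodes = {
    "p1": {"type": "person"},
    "p2": {"type": "person"},
    "a": {"type": "account"},
}
q_links = {
    frozenset({"p1", "a"}): {"kind": "owns"},
    frozenset({"p2", "a"}): {"kind": "owns"},
}

matches = pattern_match(g_nodes, g_links, q_nodes, q_links)
for mapping in matches:
    print(mapping)
```

In this toy example, only account C is owned by two distinct people, so both mappings that the search finds cover the same subgraph and differ only in which person is assigned to `p1` and which to `p2`. The enumeration is exponential in the size of the query, so the sketch is suitable only for illustrating the definition on very small graphs.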