Parallel Sparse Flow-Sensitive Points-to Analysis

Jisheng Zhao, Rice University, Houston, Texas, USA, jisheng.zhao@rice.edu
Michael G. Burke, Rice University, Houston, Texas, USA, mgb2@rice.edu
Vivek Sarkar, Georgia Institute of Technology, Atlanta, Georgia, USA, vsarkar@gatech.edu

Abstract

This paper aims to contribute to further advances in pointer (or points-to) analysis algorithms along the combined dimensions of precision, scalability, and performance. For precision, we aim to support interprocedural flow-sensitive analysis. For scalability, we aim to show that our approach scales to large applications with reasonable memory requirements. For performance, we aim to design a points-to analysis algorithm that is amenable to parallel execution. The algorithm introduced in this paper achieves all these goals. As an example, our experimental results show that our algorithm can analyze the 2.2MLOC Tizen OS framework with < 16GB of memory while delivering an average analysis rate of > 10KLOC/second.

Our points-to analysis algorithm, PSEGPT, is based on the Pointer Sparse Evaluation Graph (PSEG) form, a new analysis representation that combines both points-to and heap def-use information. PSEGPT is a scalable interprocedural flow-sensitive context-insensitive points-to analysis that is amenable to efficient task-parallel implementations, even though points-to analysis is usually viewed as a challenge problem for parallelization. Our experimental results with 6 real-world applications on a 12-core machine show an average parallel speedup of 4.45× and maximum speedup of 7.35×. The evaluation also includes precision results by demonstrating that our algorithm identifies significantly more inlinable indirect calls (IICs) than SUPT [15] and SS [9], two state of the art SSA-based points-to analyses implemented in LLVM.
CCS Concepts • Software and its engineering → Automated static analysis;

Keywords Static Analysis, Pointer Analysis, Parallelism

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
CC’18, February 24–25, 2018, Vienna, Austria
© 2018 Association for Computing Machinery.
ACM ISBN 978-1-4503-5644-2/18/02...$15.00
https://doi.org/10.1145/3178372.3179517

ACM Reference Format:
Jisheng Zhao, Michael G. Burke, and Vivek Sarkar. 2018. Parallel Sparse Flow-Sensitive Points-to Analysis. In Proceedings of 27th International Conference on Compiler Construction (CC’18). ACM, New York, NY, USA, 12 pages. https://doi.org/10.1145/3178372.3179517

1 Introduction

Points-to analysis is a fundamental requirement for many program analyses, optimizations, and debugging/verification tools. It is used to determine if two pointer expressions may refer to the same memory location. Static analysis, and more specifically alias analysis, is in general undecidable [21]. Hence, a large number of approximation algorithms have been published that balance the precision and the efficiency of pointer analysis. These algorithms explore various dimensions to achieve this balance. However, finding an effective balance across precision, scalability, and performance in points-to analysis remains a major challenge. Many flow-sensitive algorithms achieve a desirable level of precision but are impractical for use on large software.
Likewise, many flow-insensitive algorithms scale to large software, but do so with major limitations in precision. Further, in light of the recent multicore hardware trends, more attention needs to be paid to the use of parallelism for improved performance. Our focus in this paper is primarily on flow-sensitive points-to analysis, which has been shown to be important for a growing list of program analyses [7], including those that check for security vulnerabilities [5, 8], and that analyze multi-threaded codes. A further goal of this paper is to leverage sparseness and parallelism to achieve scalability, as discussed below.

The traditional flow-sensitive approach [4, 14, 27] uses a dense iterative dataflow analysis, which does not scale to large programs. A frequently used method for optimizing a flow-sensitive dataflow analysis is to perform a sparse analysis, such as in the flow-sensitive points-to analysis of [2, 12], which uses the Sparse Evaluation Graph (SEG) [3] to directly connect variable definitions (defs) with their uses, allowing data flow facts to be propagated only to those program locations that need the values. In general, sparse points-to analysis can be challenging because of an inherent circularity: pointer information is required to compute the def-use information needed to enable a sparse points-to analysis.

Hardekopf and Lin [9] present a semi-sparse (SS) flow-sensitive points-to analysis which exploits partial SSA form to perform a sparse analysis on "top-level" (scalar) variables