Optimizing Program Performance via Similarity, Using a Feature-agnostic Approach

Rosario Cammarota, Laleh Aghababaie Beni, Alexandru Nicolau, and Alexander V. Veidenbaum

Department of Computer Science, University of California Irvine, Irvine, USA
{rosario.c,laghabab,nicolau,alexv}@ics.uci.edu

Abstract. This work proposes a new performance evaluation technique for predicting the performance of parallel programs across diverse and complex systems. Here the term system encompasses the hardware organization and the development and execution environment. The proposed technique collects completion times for some (program, system) pairs and constructs an empirical model that learns to predict the performance of unknown (program, system) pairs. The approach is feature-agnostic because it requires no prior knowledge of program and/or system characteristics (features) to predict performance. Experiments with a large number of serial and parallel benchmark suites (including SPEC CPU2006 and SPEC OMP2012) and systems show that the proposed technique is equally applicable to several compelling performance evaluation studies, including characterization, comparison, and tuning of hardware configurations, compilers, run-time environments, or any combination thereof.

Keywords: Program Characterization, Feature-agnostic, Cluster Analysis, Empirical Performance Modeling, Program Optimization

1 Introduction

Over the past decade there has been an exponential growth in computer performance [1] that quickly led to more sophisticated and diverse software and computing platforms (e.g., heterogeneous multi-core platforms [2], parallel browsers [3]). The cost of software development and hardware design has increased as well, creating the need to evaluate the performance of proposed software and system changes before actual implementation and deployment begin.
However, given the increasing complexity of modern micro-architectures [2], software [4,5], and development and execution environments [6], the performance of a program on new systems (and, symmetrically, the performance that a system delivers to new programs) is difficult to predict. Constructing a comprehensive model that includes all possible features of the software and computing platform is limited in practice by the cost of feature retrieval relative to the performance goal to be reached. For example, while having negligible run-time overhead, collecting a large number of hardware performance counters is not possible at