Quantifying the invariance and robustness of Permutation-based Indexing schemes

Stéphane Marchand-Maillet 1, Edgar Roman-Rangel 1, Hisham Mohamed 1, and Frank Nielsen 2

1 Department of Computer Science, University of Geneva, Switzerland, stephane.marchand-maillet@unige.ch
2 LIX Polytechnique, Paris, France

Abstract. Providing fast and accurate (exact or approximate) access to large-scale multidimensional data is a ubiquitous problem that dates back to the early days of large-scale Information Systems. Similarity search, which requires resolving nearest neighbor (NN) queries, is a fundamental tool for structuring the information space. Permutation-based Indexing (PBI) is a reference-based indexing scheme that accelerates NN search by combining the use of landmark points with ranking in place of distance calculation. In this paper, we are interested in understanding the approximation made by the PBI scheme. The aim is to understand the robustness of the scheme by modeling it and quantifying its invariance properties. After discussing the geometry of PBI in relation to the study of ranking, we use empirical evidence to make proposals that cater for the inconsistencies of this structure.

Keywords: Permutation Based Indexing, ranking, geometry

1 Introduction

Providing fast and accurate (exact or approximate) access to large-scale multidimensional data is a ubiquitous problem that dates back to the early days of large-scale Information Systems. The approach generally taken is to define a structure on the space based on information similarity and to partition the information space according to this structure for quantized or hierarchical access. The most common basis for structuring the space is to assume the existence of a relevant metric and to base the indexing on the properties of that metric space to resolve the Nearest Neighbor (NN) search problem.
From there, a large variety of indexing techniques have been defined [36,10,37,29].

In this paper, we are interested in a finer understanding of the approximations made by the PBI scheme (and, more generally, by permutation-based distance measurements). In particular, the aim is to understand the robustness of the scheme by quantifying its invariance properties. The main contribution is the definition of a formal space-partitioning model for the PBI scheme, drawing on powerful tools from geometric modeling.
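To make the approximation discussed above concrete, the following minimal Python sketch encodes each point as the permutation of landmark indices sorted by distance and compares two such permutations. It is an illustration under stated assumptions, not the implementation studied in this paper: the Euclidean metric, the Spearman footrule used to compare permutations, the function names `permutation` and `spearman_footrule`, and the toy coordinates are all chosen here for illustration only.

```python
import math

def permutation(point, landmarks):
    """Return landmark indices ordered by increasing distance to `point`.

    Assumption: Euclidean distance; any metric could be substituted.
    """
    return sorted(range(len(landmarks)),
                  key=lambda i: math.dist(point, landmarks[i]))

def spearman_footrule(perm_a, perm_b):
    """Sum over landmarks of the absolute difference in rank position.

    Assumption: one common way to compare permutations, used here
    only as an example of a permutation-based distance.
    """
    pos_a = {lm: rank for rank, lm in enumerate(perm_a)}
    pos_b = {lm: rank for rank, lm in enumerate(perm_b)}
    return sum(abs(pos_a[lm] - pos_b[lm]) for lm in pos_a)

# Hypothetical toy data: three landmarks and three points in the plane.
landmarks = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
x = (1.0, 0.5)
y = (1.2, 0.4)   # close to x
z = (0.5, 3.0)   # far from x, near the third landmark

perm_x = permutation(x, landmarks)
perm_y = permutation(y, landmarks)
perm_z = permutation(z, landmarks)

print(perm_x, perm_y, perm_z)
print(spearman_footrule(perm_x, perm_y), spearman_footrule(perm_x, perm_z))
```

In this toy setting, the distinct points x and y receive the same permutation, so the encoding cannot tell them apart, whereas z falls in a different cell of the induced partition. This is the kind of invariance whose extent the paper sets out to quantify.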