Geometrically robust perceptual fingerprinting: an asymmetric case

Oleksiy Koval, Sviatoslav Voloshynovskiy, Farzad Farhadzadeh, Taras Holotyak and Fokko Beekhof *

ABSTRACT

In this paper, the problem of multimedia object identification in channels with asymmetric desynchronizations is studied. First, we analyze the rates achievable by such protocols within a digital communication framework. Second, we investigate the impact of the fingerprint length on the error performance of these protocols by relaxing the capacity-achieving argument and formulating the identification problem as multiclass classification.

1. INTRODUCTION

Recent advances in networking and multimedia technologies have opened access to an exponentially and permanently increasing volume of multimedia data via various public services and social networks. These circumstances create an urgent demand for systems that efficiently manage and secure large volumes of multimedia data. This demand in turn raises the question of whether the existing design principles of such systems scale to large-scale applications.

In this paper we analyze multimedia identification based on robust fingerprinting, which can be considered a unique solution to the identification problem when no modification of the content is admissible. Such a constraint is imposed when identifying artworks, biometrics and medical data, which makes the application of digital data hiding principles for this purpose unacceptable. A digital fingerprint (a.k.a. a robust perceptual hash) provides a compact and robust representation of content, designed for its distinctive, computationally efficient and privacy-protected management.

Recently, the domain of robust fingerprinting has evolved significantly. On the side of practical algorithm development, the main progress concerns robust feature extraction techniques as well as efficient matching strategies in large databases [1, 2]. The analysis of achievable rates of an identification system was first carried out by Willems et al. [3] in a classical information-theoretic formulation of infinitely long codeword transmission over the discrete memoryless channel (DMC). Several groups of authors have analyzed the identification problem within the information-detection framework under various assumptions on the channel and on the length of the communicated codeword [4–6].

In most of the mentioned cases (besides a conjecture formulated in [4, 5]), it is explicitly assumed that the query is ideally synchronized with the database content. Such an assumption, while valid for the analysis of communications over the DMC, could be too restrictive for the target application and might lead to inaccurate estimates of performance limits. Moreover, the omnipresence of desynchronizations at both the multimedia database enrollment and the query identification stages makes the extension of existing analysis results for the DMC to channels with desynchronization distortions an important research issue.

The first attempt to characterize the loss in achievable rate due to geometrical desynchronization over such a channel was recently made in [7], where it is assumed that the database is composed of original/ideal multimedia data. Such a system design principle was first proposed by Willems et al. [8] for biometric identification and will be referred to as symmetric in our paper. The presented analysis results are obtained assuming that geometrical desynchronization can be modeled as a parametric mapping defined over a set of finite cardinality.

* O. Koval, S. Voloshynovskiy, F. Farhadzadeh, T. Holotyak and F. Beekhof are with CUI-University of Geneva, Stochastic Information Processing Group, Battelle Batiment A, 7 route de Drize, 1227 Carouge, Switzerland. The contact author is O. Koval (email: Oleksiy.Koval@unige.ch). http://sip.unige.ch