Independent Performance Evaluation of Biometric Systems

Davrondzhon Gafurov, Bian Yang, Patrick Bours and Christoph Busch
Gjøvik University College, {Firstname.Lastname}@hig.no

I. INTRODUCTION

Often in the development of a biometric product the evaluator of the system is the same institution that developed the algorithm. Furthermore, the test data set is usually collected by the same developer/evaluator, and in most cases such a database is not public. Consequently, test results cannot be verified by independent institutions. Although this can be justifiable (e.g. in the optimization phase of an algorithm), from the perspective of a potential customer it reduces the trustworthiness of the developed system and of the reported performance. Therefore, the availability of independent databases and, desirably, of independent evaluators is very important for performance evaluation. It is also essential that algorithm developers do not have access to the testing database, so that the risk of tuned algorithms is minimized.

Pseudonymous identifiers (PI) are complementary to image- or minutiae-based references and provide a level of protection that is both more privacy-preserving and more efficient than symmetric or asymmetric encryption of a biometric reference image or minutiae template record [1]. With PI an individual retains complete control of his or her biometric data, as multiple PIs can be generated from a single biometric characteristic without any risk that these can be linked together. At the same time, any of these identifiers can be revoked and replaced by a new one if needed. Research on such PIs is a core objective of the TURBINE project [2].

This work presents the biometric performance test report on a fingerprint performance evaluation that has been generated in the context of the TURBINE project [2]. It is worth emphasizing that this report presents the results of the first testing round, which are not the final results of the project.
For the second and final testing round, project partners will submit their improved algorithms, and the results will be available in 2011. Furthermore, here we report only the "biometric performance" per se of the algorithms; the "security performance" of the PI algorithms is evaluated by others in TURBINE.

II. PERFORMANCE METRICS AND DATA SETS

The main error types associated with any biometric performance evaluation are FMR versus FNMR, and FAR versus FRR. In this work we refer to the former and the latter pairs as algorithm and system performances (or errors), respectively. They are related to each other according to the formulas below:

FAR = FMR * (1 - FTA)
FRR = FTA + FNMR * (1 - FTA)

In our tests, we define FTA using the formula below:

FTA = FTC + FTX * (1 - FTC)

where FTC (Failure To Capture) and FTX (Failure To eXtract) are estimated as follows:

FTC = (# terminated capture attempts + # insufficient-quality images) / (total # capture attempts)
FTX = (# not encoded templates) / (total # images submitted to the template encoder)

("#" stands for "number of"). It is worth noting that the FTA computation incorporates both hardware-related (in FTC) and software-related (in FTX) failures.

The (binary) fingerprint verification algorithms were provided by project partners, in particular Sagem Securite, Precise Biometrics, Philips Research Europe and the University of Twente. An external fingerprint verification package by Neurotechnology (VeriFinger [3]) was purchased and also included in the testing. The PI software submitted for the first benchmark simulates the effect of a physical protection layer obtained when implementing encoding and comparison within a smartcard (on-card-comparison techniques). As the test database we use the GUC100 multi-scanner fingerprint database, which consists of fingerprint images of all 10 fingers from 100 subjects (almost 72000 fingerprint images in total) [4].
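The error-rate relations defined above can be sketched in a few lines of Python. This is a minimal illustration, not part of the TURBINE test software; all function names and the numeric counts in the usage example are invented for illustration and are not taken from the GUC100 tests.

```python
def fta_rate(terminated, low_quality, total_captures, not_encoded, total_images):
    """Failure To Acquire, combining capture- and extraction-stage failures.

    FTC = (terminated attempts + insufficient-quality images) / total capture attempts
    FTX = (templates that could not be encoded) / images submitted to the encoder
    FTA = FTC + FTX * (1 - FTC)
    """
    ftc = (terminated + low_quality) / total_captures   # Failure To Capture (hardware)
    ftx = not_encoded / total_images                    # Failure To eXtract (software)
    return ftc + ftx * (1 - ftc)


def system_errors(fmr, fnmr, fta):
    """Map algorithm-level errors (FMR, FNMR) to system-level errors (FAR, FRR)."""
    far = fmr * (1 - fta)
    frr = fta + fnmr * (1 - fta)
    return far, frr


# Illustrative counts and rates (hypothetical, not measured values):
fta = fta_rate(terminated=3, low_quality=7, total_captures=1000,
               not_encoded=5, total_images=990)
far, frr = system_errors(fmr=0.001, fnmr=0.02, fta=fta)
```

Note that a transaction lost to FTA can never produce a false match, which is why FAR scales FMR down by (1 - FTA), while every acquisition failure counts directly as a rejection in FRR.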
Neither project partners nor external parties had access to the GUC100 database or were involved in the testing activity. Performance evaluations were carried out solely by the GUC research team as an independent and neutral academic party in the project.

III. PERFORMANCE RESULTS

The focus of this work is not on comparing the individual performances of algorithms or scanners, but rather on emphasizing characteristics of biometric performance evaluation and observing the potential performance degradation in the transition to the PI level. Therefore, the names of the scanners (denoted S1, ..., S6) and of the algorithm suppliers (except Neurotechnology) are anonymized. Test results are given in terms of DET curves. For the minutiae-level curves, the x-axis is plotted on a logarithmic scale.