Poster Abstract: A Novel and Efficient Approach to Evaluate Biometric Features for User Identification

Namrata Kayastha and Kewei Sha
Department of Computing Sciences
University of Houston - Clear Lake
Houston, TX, USA
{kayasthan3411, sha}@uhcl.edu

Abstract— Classifications based on biometric features are widely used in modern healthcare applications, including user identification, authentication, and tracking. The complexity and accuracy of classification algorithms largely depend on the size and the quality of the feature set used to build classifiers. Feature evaluation and selection are critical steps in choosing a small set of high-quality features with which to build accurate and efficient classifiers. This paper proposes a novel and efficient approach to evaluate and select biometric features for user identification applications based on activity sensor data collected from the users' wrists. For each feature, we first generate an NRMSD matrix, each entry of which represents the similarity level of a pair of users. Then, we define a heuristic, the Farness value, to evaluate the quality of the feature based on its NRMSD matrix. Finally, we select the set of high-quality features whose Farness value is higher than 0.10.

Keywords— Biometrics, Identification, Classification, Feature Evaluation, Feature Selection

I. PROBLEM STATEMENT AND MOTIVATION

Biometrics-based user identification has attracted significant research attention in recent years. It identifies individuals based on their unique physical or behavioral characteristics. The selection of features plays an important role in both the accuracy and the efficiency of classification. Ideally, in the design of a feature-based classification algorithm, we should choose only a small set of high-quality features to train an efficient classifier with high accuracy, which can distinctly identify individuals.
This is because a large set of features not only significantly increases the complexity of classification algorithms, but also reduces the accuracy of the classification, as it may contain some low-quality features. How to select a set of high-quality features remains a research challenge.

Researchers have proposed various feature evaluation and selection algorithms to improve the quality of the feature set [1,2,3]. In [1], Rajesh et al. use two feature evaluation methods, namely, information gain based feature ranking (IGFR) and correlation based feature subset selection (CFSS). The IGFR method ranks the features based on their distinguishability, whereas the CFSS method selects the best subset of features for classification. They also use the CFSS method to show the correlation among the selected subset of acceleration- and rotation-based features. However, the CFSS method fails to provide any comparison between the features with respect to the users. In contrast, our feature evaluation design evaluates each feature with respect to the users. Damasevicius et al. propose a method for human activity recognition and user identification that uses random projections to reduce the dimension of the features, thereby improving the efficiency of classification by lowering the dimensionality of the feature space [2, 3]. In their proposed model, the best random projection with the smallest overlapping area is selected. They then select the top three [2] or top ten [3] features as ranked by the MATLAB rankfeatures function using the entropy criterion. However, their feature evaluation approach is not very efficient, and it also lacks a mechanism to decide the number of features to keep, which results in missing many high-quality features.

From the above analysis, existing feature selection algorithms are mostly general algorithms that aim to select a set of high-quality features for general-purpose classification.
Considering that the goal of biometrics-based user identification is to efficiently differentiate users, we believe that feature selection algorithms should be designed and optimized to align with the application goal, i.e., to select a set of features that distinctly differentiate one user from the others. Therefore, this paper proposes a novel and efficient approach to evaluate biometric features, focusing on better user identification. The novelty of the paper lies in using, for each feature, the Normalized Root Mean Square Difference (NRMSD) [4] to measure the similarity level of any two users based on the users' activity data. Then, we define the Farness value to evaluate the quality of the feature. Our fundamental goal is to find a minimum set of high-quality features by eliminating low-quality features based on the feature evaluation result.

II. DESIGN OF FEATURE EVALUATION AND SELECTION

A. Description of Dataset

Using the MetaWear C sensing platform, we collected both accelerometer and gyroscope data from 14 users. During data collection, the MetaWear device is attached to the wrist of the user, capturing the user's hand movement while the user walks around normally. Sensor readings consist of both accelerometer and gyroscope readings along the x, y, and z axes. The data are collected in two sessions at two different times. Each session collects 15 seconds of data sampled at a frequency of 100 Hz. Before the data are analyzed, they are cleaned, interpolated, and resampled to guarantee the quality of the dataset. Then, based on the features presented in previous research [1,2,3], a set of statistical and physical features is extracted from the preprocessed data to enrich the dataset for analysis.

B. Rationale of Feature Evaluation

The main goal of our feature evaluation is to find high-quality features that can distinctly differentiate any two users. Thus, we need to find a measure to achieve this goal. For a high-quality