Comparison of Data Preprocessing Methods and the Impact on Auto-encoder's Performance in Activity Recognition Domain

Adam Harasimowicz
Gdansk University of Technology
Faculty of Electronics, Telecommunications and Informatics
Department of Computer Architecture
11/12 Gabriela Narutowicza Street, 80-233 Gdansk, Poland
Email: haras.adam@gmail.com

Abstract: Raw data preprocessing and feature extraction have a strong influence on the performance of machine learning classifiers, and usually these steps are performed by experts who can decide which information is important. However, there are algorithms, commonly used in deep learning, which extract meaningful descriptions of samples automatically. This motivated us to investigate the impact of different data preprocessing methods on the auto-encoder (AE). We also evaluated one of the most recent variants of the algorithm, the Sparse Auto-Encoder, which outperforms the standard AE. The experiments are based on accelerometer data gathered during previous research and contain measurements of eight activities. The presented comparison of methods allows choosing a subset of the most promising algorithms for Human Activity Recognition problems.

Keywords: activity recognition; accelerometer; autoencoder; deep learning; pattern recognition; data preprocessing; ubiquitous computing; context awareness; sensors

I. INTRODUCTION

The activity recognition domain is connected with research areas such as pattern recognition, ubiquitous computing and context awareness. Work in this problem domain started in the late '90s and, according to [3, 4], some of the first experiments were [1] and [2]. Since then researchers have improved results, identified the most appropriate sensors [5, 6] and sample sizes [7], and developed Human Activity Recognition (HAR) systems such as DiaTrace [8], iLearn [9], UbiFit [5, 10] and others [11-15].
All of those systems are based on the machine learning approach, and so far the most widely applied method for feature extraction in HAR problems is feature engineering, which requires expert knowledge to choose the most accurate feature set. Mostly, features from two groups are used, the time domain and the frequency domain [4], computed independently for every sample containing sensor measurements. Such features include the mean, standard deviation, variance, interquartile range, mean absolute deviation, minimum and maximum, energy, entropy, kurtosis and correlation coefficients. However, there are methods commonly used in Deep Learning (DL) which select a meaningful representation of the data automatically [16, 17]. This is achieved during the feature learning process [18, 19]. Such a process can be performed in an unsupervised or semi-supervised way, which significantly increases the number of possible application domains. It has also been shown that this approach, combined with Deep Learning methods, matches or outperforms the state of the art in domains such as Natural Language Processing, Computer Vision and Speech Recognition [18]. However, the performance of feature learning algorithms (like the auto-encoder, Restricted Boltzmann Machine, Sparse Coding and Deep Belief Network) depends on the initial preprocessing and normalization of the raw data.

II. MOTIVATIONS

Deep machine learning is an area dynamically developed by researchers and, as has been shown, it outperforms previously used methods in many problem domains [18, 20]. There are also works on DL in HAR, but they are based on video processing [21]. To the best of our knowledge, this new approach has not been adopted in sensor-based HAR, except in [22]. Moreover, because data preprocessing has a significant impact on classification performance, we wanted to provide guidance for future research. Also, Deep Learning algorithms have been improved since [22], so our work considers this aspect as well.
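To make the contrast with automatic feature learning concrete, the hand-crafted, per-window feature engineering described in the introduction can be sketched as below. This is a minimal illustrative sketch, not the exact pipeline of any cited system; the function name, the 3-axis window shape, and the particular subset of features are our assumptions.

```python
import numpy as np


def time_domain_features(window):
    """Compute hand-crafted time-domain features for one window of
    tri-axial accelerometer samples (shape: n_samples x 3 axes)."""
    window = np.asarray(window, dtype=float)
    feats = {}
    for axis, name in enumerate("xyz"):
        a = window[:, axis]
        feats[f"mean_{name}"] = a.mean()
        feats[f"std_{name}"] = a.std()
        # Interquartile range: spread between the 75th and 25th percentiles
        feats[f"iqr_{name}"] = np.percentile(a, 75) - np.percentile(a, 25)
        # Mean absolute deviation from the window mean
        feats[f"mad_{name}"] = np.mean(np.abs(a - a.mean()))
        feats[f"min_{name}"] = a.min()
        feats[f"max_{name}"] = a.max()
        # Signal energy as the mean squared amplitude (by Parseval's
        # theorem this equals the frequency-domain energy up to scaling)
        feats[f"energy_{name}"] = np.mean(a ** 2)
    # Pairwise correlation coefficients between the three axes
    corr = np.corrcoef(window.T)
    feats["corr_xy"] = corr[0, 1]
    feats["corr_xz"] = corr[0, 2]
    feats["corr_yz"] = corr[1, 2]
    return feats
```

Each raw window of sensor readings is thus reduced to a fixed-length feature vector (here 24 values) that a conventional classifier consumes; an auto-encoder instead learns such a compact representation from the raw windows directly.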
Additionally, we can observe rapid development in the areas of wearable technology and the Internet of Things, which will produce huge amounts of data, and some of this information could be used for activity recognition. Also, due to this high data availability, unsupervised or semi-supervised learning methods will potentially have an advantage over purely supervised algorithms.

III. AUTOMATIC FEATURE LEARNING

The feature learning process and automatic extraction can be achieved using different methods, but most of them are based on the same concept: finding a lower-dimensional representation of the data without loss of important information. For the experiments we decided to choose the Sparse Auto-Encoder, which belongs to the wider family of similar methods called auto-