RESEARCH ARTICLE

High accuracy at low frequency: detailed behavioural classification from accelerometer data

Jack Tatler 1,*, Phillip Cassey 1 and Thomas A. A. Prowse 2

ABSTRACT

Accelerometers are a valuable tool for studying animal behaviour and physiology where direct observation is unfeasible. However, giving biological meaning to multivariate acceleration data is challenging. Here, we describe a method that reliably classifies a large number of behaviours using tri-axial accelerometer data collected at the low sampling frequency of 1 Hz, using the dingo (Canis dingo) as an example. We used out-of-sample validation to compare the predictive performance of four commonly used classification models (random forest, k-nearest neighbour, support vector machine, and naïve Bayes). We tested the importance of predictor variable selection and moving window size for the classification of each behaviour and overall model performance. Random forests produced the highest out-of-sample classification accuracy, with our best-performing model predicting 14 behaviours with a mean accuracy of 87%. We also investigated the relationship between overall dynamic body acceleration (ODBA) and the activity level of each behaviour, given the increasing use of ODBA in ecophysiology as a proxy for energy expenditure. ODBA values for our four high-activity behaviours were significantly greater than all other behaviours, with an overall positive trend between ODBA and intensity of movement. We show that a random forest model of relatively low complexity can mitigate some major challenges associated with establishing meaningful ecological conclusions from acceleration data. Our approach has broad applicability to free-ranging terrestrial quadrupeds of comparable size. Our use of a low sampling frequency shows potential for deploying accelerometers over extended time periods, enabling the capture of invaluable behavioural and physiological data across different ontogenies.
KEY WORDS: Accelerometer, Animal behaviour, Classification model, ODBA, Random forest

INTRODUCTION

The foundation of animal ecology is understanding how individuals interact with their abiotic and biotic environment. These interactions are increasingly being measured with bio-logging techniques, where biological data are recorded remotely from devices attached to animals. This approach has allowed researchers to answer questions on everything from hunting tactics of puma (Williams et al., 2014) to energy expenditure in cormorants (Gómez Laich et al., 2011) and diving behaviour in whales (Ishii et al., 2017). Consequently, the ability to continuously observe free-ranging animals has facilitated the development and exploration of entirely new theories (Wilmers et al., 2015).

Accelerometers are a valuable tool in bio-logging research as they provide quantitative measurements of animal behaviour and physiology where direct observation is not possible or logistically feasible. The use of accelerometers mitigates some of the major challenges associated with studying the behaviour of wild animals, such as extensive time investment, animal disturbance and observer bias. Accelerometers measure acceleration (gravitational and inertial) caused by animal movement in different planes, allowing the development of classification models calibrated to predict behavioural states such as resting, walking, swimming and eating (e.g. Pagano et al., 2017). Further, there is a strong linear relationship between body acceleration and energy expenditure in many taxa, which is of particular interest to ecophysiologists (Halsey and White, 2010; Wilson et al., 2006; Halsey et al., 2009). Although accelerometry has been used to study animal movement and behaviour for almost two decades (Yoda et al., 1999), recent methodological advancements have increased its accessibility and appeal to a broader scientific community.
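To illustrate how the gravitational (static) and inertial (dynamic) components of a tri-axial signal might be separated, the following is a minimal Python sketch. It assumes the commonly used running-mean approach, in which the smoothed signal on each axis approximates the static component and ODBA is the sum of the absolute dynamic components across the three axes. The function names, the 3-sample smoothing window and the example values are hypothetical choices for illustration and are not taken from this study.

```python
def running_mean(values, window):
    """Centred running mean; the window shrinks at the edges of the series."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def odba(x, y, z, window=3):
    """ODBA per sample: sum over axes of |raw - static| acceleration.

    The static (gravitational) component of each axis is approximated
    by a running mean; the window length is an assumed, tunable value.
    """
    axes = (x, y, z)
    static = [running_mean(axis, window) for axis in axes]
    return [
        sum(abs(axis[i] - base[i]) for axis, base in zip(axes, static))
        for i in range(len(x))
    ]


# Hypothetical 1 Hz samples in units of g: a brief movement on the x axis
# while the z axis carries the constant gravitational component.
x = [0.0, 1.0, 0.0]
y = [0.0, 0.0, 0.0]
z = [1.0, 1.0, 1.0]
print(odba(x, y, z))
```

In this sketch a motionless logger yields an ODBA of zero regardless of its orientation, because the constant gravitational signal is removed with the running mean.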
Classifying animal behaviours from high-frequency acceleration data presents a suite of new and complex challenges. One approach is unsupervised machine learning, in which pattern-recognition algorithms identify different states directly from the accelerometer signatures. Unsupervised learning is intrinsically challenging, so supervised algorithms are frequently used instead to learn the relationship between acceleration data and behaviour from a model-training dataset acquired by direct observation. The ability of the algorithm to interpret this relationship depends largely on the variables used to characterise the raw acceleration data. Several attempts to simplify or streamline this approach have been made, with varying success. Ladds et al. (2017) introduced a super-machine-learning method that identified six behaviours in four species of pinniped with approximately 73% accuracy. They used a high sampling frequency (25 Hz), a large training dataset (90,000 individual data points) and a very large set of input variables (n=147). In contrast, using fewer input variables and a relatively simple approach (k-nearest neighbour), McClune et al. (2014) classified four behaviours in Eurasian badgers (Meles meles) with an overall classification accuracy of 89%. In general, the classification accuracy of a model is expected to increase when using: (a) higher sampling frequencies; (b) more training data; and (c) broader behaviour categories (i.e. fewer behaviours to be classified). The consequence of following these criteria is not only increased computational time and difficulty, but also a loss of behavioural diversity and decreased deployment time on free-ranging animals because of memory constraints, i.e. the exact opposite of what researchers are aiming for. Reducing the sampling frequency would greatly increase deployment time (e.g. from days to months) whilst also decreasing computational effort.
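The trade-off between sampling frequency and deployment time can be made concrete with a rough calculation. The sketch below assumes a memory-limited logger that records continuously; the 64 MiB capacity and 6 bytes per tri-axial sample (three axes at 16 bits each) are hypothetical values chosen for illustration, and battery life is ignored.

```python
SECONDS_PER_DAY = 86_400


def deployment_days(memory_bytes, sampling_hz, bytes_per_sample=6):
    """Days until memory is full under continuous tri-axial logging.

    bytes_per_sample=6 assumes three axes stored at 16 bits each,
    which is an illustrative assumption rather than a device spec.
    """
    bytes_per_day = sampling_hz * bytes_per_sample * SECONDS_PER_DAY
    return memory_bytes / bytes_per_day


memory = 64 * 1024**2  # hypothetical 64 MiB of on-board storage

for hz in (25, 1):
    days = deployment_days(memory, hz)
    print(f"{hz:>2} Hz: about {days:.0f} days of continuous recording")
```

Under these assumptions, storage lasts roughly 5 days at 25 Hz and roughly 129 days at 1 Hz. Deployment time scales inversely with sampling frequency, so the ratio of 25 holds for any memory size.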
Received 3 May 2018; Accepted 10 October 2018

1 School of Biological Sciences and Centre for Applied Conservation Science, University of Adelaide, Adelaide, SA 5005, Australia. 2 School of Mathematical Sciences, University of Adelaide, Adelaide, SA 5005, Australia.
*Author for correspondence (jack.tatler@adelaide.edu.au)
J.T., 0000-0002-8380-3612; P.C., 0000-0002-2626-0172
© 2018. Published by The Company of Biologists Ltd | Journal of Experimental Biology (2018) 221, jeb184085. doi:10.1242/jeb.184085

However, it is challenging to accurately