Detecting behavioral trajectories in continued education online courses

Bruno Elias Penteado, Seiji Isotani
Institute of Mathematical and Computer Sciences, University of São Paulo, 13566-900, São Carlos, SP, Brazil
brunopenteado@usp.br, sisotani@icmc.usp.br

Paula Maria Pereira Paiva, Marina Morettin-Zupelari, Deborah Viviane Ferrari
Speech Language Pathology and Audiology Department - Bauru Dental School, University of São Paulo
Bauru, SP, Brazil

Abstract: Continuing education online has been steadily increasing worldwide, and its students' profiles tend to differ from those in other formal modalities. Hence, it is relevant to understand the behavior of these students along their course trajectories. In this work, we investigate an online specialization course involving hearing healthcare professionals (n=96) to discover different navigation strategies and the factors that may influence them. To that end, we applied sequence analysis techniques to interpret different navigation patterns and selected demographic covariates that may influence these behavioral trajectories. Three clusters were found, with subtle differences among them, and three variables showed greater predictive power: age, professional experience and region.

Keywords: learning trajectories, sequence analysis, learning analytics

I. INTRODUCTION

In different countries, e-learning has made continuing education more accessible, particularly to working adults. Understanding how this population engages with online courses is crucial for administrators and instructors to provide better design and support for their courses. The following research questions are posed:

RQ1. How can different behavioral trajectories be identified in the learning process of students during the course?
RQ2. What are the main factors that influence the different trajectories?

II. RELATED WORKS

Many different approaches have been used to model and understand behavioral trajectories.
Stochastic models are commonly used for time series sequences [1], along with other techniques such as process conformance [2], sequential data mining [3], clustering [4], process mining [5], grammar modeling [6], and machine learning [7]. However, these methods are complex both to model and to interpret, particularly for professionals outside the field of computer science. This work proposes a method that analyzes sequences of actions reflecting different study strategies and explores their causal factors, enabling course administrators to interpret the resulting data and adapt the design of their courseware accordingly.

III. METHODOLOGY

A. Material.

We gathered data from the online specialization course "Auditory Rehabilitation in Children - ARC", developed by the Speech-Language Pathology and Audiology (SLPA) Department (University of São Paulo, Bauru Campus), the Samaritano Association and the Brazilian Ministry of Health. The ARC aimed to qualify hearing healthcare practitioners with up-to-date research and practices. The course was student-centered, employing reflective, active and problem-solving methods, contextualized in the participants' clinical practices and everyday challenges.

ARC was organized in modules, released as the course progressed. Navigation within a module was open, although an instructional sequence was suggested. Each module had its own structure, which makes comparison across modules difficult. Therefore, in this work, we selected data from a single module, on pediatric auditory assessment.

Ninety-six students were enrolled in this module (93 SLPA professionals and 3 physicians; 91 female), with a mean age of 36.1 years (SD: 7.6, range: 22-54). Data regarding demographic characteristics and professional background (including self-reported experience in different Audiology-related areas) were extracted from students' admission forms.
The region where participants worked was used here as a proxy for their socioeconomic context.

B. Codification

Table I describes the actions considered and their codes.

TABLE I. LMS ACTIONS ENCODED FOR THIS STUDY.

Code        Action carried out by the student
Post FOR    Posting a message on the Forum.
Post SS     Posting a question on the Standby Support, a dedicated tool for timely content-related or technical questions.
View CONT   Viewing instructional materials (videos or written).
View CTXT   Viewing the contextualization section, which poses the issue to be discussed in the module.
View FAQ    Viewing the FAQ section, which lists questions about the course content, based on a previous edition of the course.
View FOR    Viewing a discussion thread in the Forum. The main thread was created by the instructor and involved the clinical case to be discussed.
View MAP    Viewing supporting material links or bibliographical references given by the professor or colleagues.
View PD     Viewing a message posted on the Standby Support.

C. Data analysis

As a pre-processing step, we removed 6 outliers (number of actions > 130) because their behavior caused high skewness, leaving a total of 90 participants. To answer RQ1, we used clustering to group similar behaviors. We applied agglomerative hierarchical clustering based on Ward's method, with the dissimilarity between sequences measured by the optimal matching distance. For RQ2, we applied discrepancy analysis of the sequences using