© Springer Nature Switzerland AG 2020
S. A. Costa et al. (eds.), Mathematics (Education) in the Information Age, Mathematics in Mind, https://doi.org/10.1007/978-3-030-59177-9_2

Chapter 2
An Unsupervised Approach to User Characterization in Online Learning and Social Platforms

Dan Vilenchik

D. Vilenchik (*)
School of Electrical and Computer Engineering, Ben-Gurion University, Beersheba, Israel
e-mail: vilenchi@bgu.ac.il

A Short History of User Characterization

Making sense of data that is automatically collected from online platforms, such as online social media or e-learning platforms, is a challenging task: the data is massive, multidimensional, noisy, and heterogeneous (composed of differently behaving individuals). In this chapter we focus on a central task common to all online social platforms, namely user characterization: for example, automatically identifying a spammer or a bot on Twitter, or a disengaged student on an e-learning platform.

Understanding the nature and patterns of interaction between members of a social network is a long-standing research topic. Back in the 1950s, Katz and Felix Lazarsfeld (1957) studied the problem of identifying influential people in social networks. Two decades later, Freeman's seminal work (Freeman 1978) coined three key indices of centrality: degree (the number of friends), closeness (the average number of hops from a user to all other users in the network), and betweenness (the fraction of shortest paths that have to go through this user), fueling a torrent of theoretical and experimental work in the area. The subject has become even more attractive to researchers and industry as the role of online social networks (OSNs) has grown dramatically in recent years, creating new business opportunities for marketeers.

The task of characterizing users of OSNs is typically approached as a supervised learning classification problem. A target variable is defined, e.g. the ethnicity and political affiliation of a user (Pennacchiotti and Popescu 2011), gender, age, regional origin (Rao et al. 2010), occupational class (Preotiuc-Pietro et al. 2015), etc. Next, data is collected from the network (typically using some sort of crawling procedure), and relevant features are extracted from each user account. Finally, one of a host of machine learning algorithms is trained to perform the task.
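
To make the three centrality indices described above concrete, here is a minimal Python sketch that computes them on a small undirected, connected toy graph. The graph G and the function names (bfs, degree, avg_hops, betweenness) are hypothetical and chosen only for illustration; they are not taken from Freeman (1978) or from this chapter. Closeness is computed as the text phrases it (the average number of hops), whereas it is commonly reported as the reciprocal of that quantity, and betweenness is averaged over all pairs of other users.

```python
from collections import deque
from itertools import combinations

# Hypothetical toy friendship graph (undirected, connected), as adjacency lists.
G = {
    "a": ["b"],
    "b": ["a", "c", "d"],
    "c": ["b", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}

def bfs(graph, source):
    """Return (dist, count): hop distance from source to every node, and the
    number of distinct shortest paths from source to every node."""
    dist = {source: 0}
    count = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:              # first time v is reached
                dist[v] = dist[u] + 1
                count[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:     # u precedes v on a shortest path
                count[v] += count[u]
    return dist, count

def degree(graph, u):
    """Degree: the number of friends (neighbours) of u."""
    return len(graph[u])

def avg_hops(graph, u):
    """Closeness as phrased in the text: average number of hops from u to
    all other users. Assumes the graph is connected."""
    dist, _ = bfs(graph, u)
    others = [d for v, d in dist.items() if v != u]
    return sum(others) / len(others)

def betweenness(graph, u):
    """Betweenness: fraction of shortest paths between pairs of other users
    that pass through u, averaged over all such pairs.
    Assumes a connected graph with at least three nodes."""
    info = {v: bfs(graph, v) for v in graph}
    dist_u, count_u = info[u]
    others = [v for v in graph if v != u]
    total, pairs = 0.0, 0
    for s, t in combinations(others, 2):
        dist_s, count_s = info[s]
        pairs += 1
        # u lies on a shortest s-t path iff the distances add up exactly.
        if dist_s[u] + dist_u[t] == dist_s[t]:
            total += count_s[u] * count_u[t] / count_s[t]
    return total / pairs

if __name__ == "__main__":
    for user in G:
        print(user, degree(G, user), avg_hops(G, user), betweenness(G, user))
```

On this toy graph, user b has degree 3, an average hop distance of 1.25 to the other users, and lies on the shortest paths of half of the pairs of other users (betweenness 0.5). The sketch recomputes a breadth-first search from every node, which is fine for small examples but would be too slow for a real OSN.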