Deep Learning based Personality Recognition from Facebook Status Updates

Jianguo Yu
Human Interface Lab, The University of Aizu, Fukushima, Japan
d8182103@u-aizu.ac.jp

Konstantin Markov
Human Interface Lab, The University of Aizu, Fukushima, Japan
markov@u-aizu.ac.jp

Abstract: Many approaches have been proposed to automatically infer users’ personality from their social network activities. However, the performance of these approaches depends heavily on the data representation. In this work, we apply deep learning methods to automatically learn a suitable data representation for the personality recognition task. In our experiments, we used Facebook status update data. We investigated several neural network architectures, namely fully-connected (FC) networks, convolutional networks (CNN), and recurrent networks (RNN), on the myPersonality shared task and compared them with some shallow learning algorithms. Our experiments showed that a CNN with average pooling is better than both the RNN and the FC network. The convolutional architecture with average pooling achieved the best result, 60.0±6.5%.

Index Terms: Big Five model, Automatic Personality Recognition, Convolutional Neural Networks, Social Media.

I. INTRODUCTION

Social networks such as Facebook, Twitter, and Weibo have become essential components of everyday life and are rich sources of information reflecting an individual’s personality. Our personality affects our life choices, well-being, and many other behaviors. In social interaction, people often have to deal with individuals they do not know. To cooperate effectively, it is important to predict the preferences and behaviors of the people we deal with. Such predictions are made everywhere in daily life and are often based on the personality of the person in question. For example, interviewers consider whether an interviewee’s personality is suitable for their company, and a person may decide about marriage based on a partner’s personality.
Automatic recognition of a person’s personality from their social network activities makes it possible to predict preferences across contexts and environments [1] and has many important practical applications, such as product, job, or service recommendation [2] [3], word polarity disambiguation, and mental health diagnosis.

Many approaches have been proposed to automatically infer users’ personality from the content they generate in social networks. However, the performance of these approaches depends heavily on the data representation, which is often based on hard-coded prior knowledge.

Recently, deep learning approaches have obtained very high performance across many different natural language processing (NLP) tasks. Unlike traditional methods, deep learning approaches can learn a suitable representation automatically.

In this work, we implemented several deep learning algorithms, including fully-connected neural networks (FC), convolutional neural networks (CNN), and recurrent neural networks (RNN), in our personality recognition system and evaluated it on the task from the “Workshop on Computational Personality Recognition (Shared Task)” [4]. To compare classification performance, we used results on the same task obtained with traditional shallow machine learning methods and published elsewhere.

II. RELATED WORK

In the context of this study, personality is formally described by five dimensions known as the Big-Five personality traits [5]:
• EXTraversion vs. Introversion (sociable, assertive, playful vs. aloof, reserved, shy).
• NEUroticism vs. Emotional stability (insecure, anxious vs. calm, unemotional).
• AGReeableness vs. Disagreeableness (friendly, cooperative vs. antagonistic, faultfinding).
• CONscientiousness vs. Unconscientiousness (self-disciplined, organised vs. inefficient, careless).
• OPEnness to experience (intellectual, insightful vs. shallow, unimaginative).
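To make the label structure concrete, the five traits above can be treated as five separate yes/no targets per user, each evaluated as its own binary task. The following is only a minimal sketch under that assumption: the container `UserLabels` and the helpers `to_binary_targets` and `per_trait_accuracy` are hypothetical names introduced for illustration and are not taken from the shared task or from the system described in this paper.

```python
from dataclasses import dataclass

# Trait abbreviations follow the capitalised prefixes in the list above.
TRAITS = ("EXT", "NEU", "AGR", "CON", "OPE")


@dataclass(frozen=True)
class UserLabels:
    """Hypothetical container: one yes/no label per Big-Five trait."""
    user_id: str
    labels: dict  # trait abbreviation -> bool


def to_binary_targets(user):
    """Return the five labels as a 0/1 vector in TRAITS order."""
    return [int(user.labels[trait]) for trait in TRAITS]


def per_trait_accuracy(y_true, y_pred):
    """Compute accuracy separately per trait (one binary task per trait).

    y_true and y_pred are lists of 0/1 vectors in TRAITS order,
    one vector per user.
    """
    n_users = len(y_true)
    return {
        trait: sum(t[i] == p[i] for t, p in zip(y_true, y_pred)) / n_users
        for i, trait in enumerate(TRAITS)
    }
```

Keeping the traits as independent binary targets means any classifier can be trained and scored once per trait, and the per-trait scores can then be averaged if a single summary figure is wanted.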
Automatic recognition of personality typically involves a binary classification for each trait, deciding which trait type a user belongs to given the content generated by that user. The true labels are usually obtained from a self-assessment questionnaire [6]. A variety of approaches have been proposed for this task, utilizing different classifiers and feature spaces.

Until recently, most of the models were based on shallow learning approaches such as the Support Vector Machine (SVM) [7] [8], the Naive Bayes classifier (NB) [9], K-Nearest Neighbors (kNN) [10], and Logistic Regression (LR) [11]. In the early studies, text features were typically extracted with tools like Linguistic Inquiry and Word Count (LIWC) [12], and good results were usually achieved by selecting features from a very large feature space, as in [13], which achieved very high classification performance on the myPersonality task using ranking algorithms for feature selection and SVMs and boosting as learning algorithms.

978-1-5386-2965-9/17/$31.00 © 2017 IEEE
2017 IEEE 8th International Conference on Awareness Science and Technology (iCAST 2017)

Deep