XII DCA/FEEC/University of Campinas (UNICAMP) Workshop (EADCA)
Campinas, Brazil, October 17-18, 2019

Classification of Facial Action Units in Brazilian Sign Language

Emely Pujólli da Silva, Paula D. Paro Costa, Kate M. Oliveira Kumada, José Mario De Martino
{emelypujolli@gmail.com, paulad@unicamp.br, kate.kumada@ufabc.edu.br, martino@fee.unicamp.br}

Abstract – Facial expression (FE) is fundamental to the understanding of Brazilian Sign Language (Libras). FE not only conveys affective states but is also an essential part of the language's grammar, i.e., FE is one of the parameters used in the construction of a sign. An automatic recognition model capable of classifying facial expressions can be applied to assistive technologies and can further deepen linguistic studies of Libras. Here we present a method for the automatic recognition of facial expressions in Libras using convolutional neural networks combined with a region-based feature extraction approach. The experimental recognition rates obtained are close to the state of the art.

Keywords – Libras, affective facial expressions, facial expression recognition, FACS, AU combinations.

Introduction: Sign language (SL) is a visuospatial form of language built from a combination of manual and non-manual means of expression. Grammatically rich and structured as sentence-forming linguistic systems, sign languages are widely adopted by deaf communities to communicate among themselves. Facial expressions (FE) are part of the non-manual markers. They convey both affective states and grammatical meaning, which makes FE indispensable to the understanding of the language. Like spoken languages, sign languages evolve naturally and vary across geographic barriers, giving rise to a variety of sign languages, for instance, American Sign Language (ASL), British Sign Language (BSL), Swiss-German Sign Language (DSGS), French Sign Language (FSL), Russian Sign Language (RSL), Brazilian Sign Language (Libras), and others.

Automatic Sign Language Recognition (ASLR) is an important step toward enabling communication between deaf and hearing people, and it can make the deaf more independent [8]. Such systems are built by detecting and recognizing signs observed in a video input; the output of the system is the translated text or sentence. Unfortunately, research has been limited to small-scale systems capable of recognizing a subset of some sign languages [12].

The construction of such a model poses a few challenges. First, developing a computational model of any language requires a careful study and description of that language to sufficiently inform and constrain the model. Another critical issue is the vocabulary size of the experimental corpora used to verify the robustness and generalization capabilities of the proposed systems [2]. Furthermore, occlusions can occur during the performance of a sign, making it harder to detect and track the signer's movements and positions. A model for the recognition of facial expressions in sign language faces the same development challenges as an ASLR system, yet it can help increase the accuracy of systems that recognize manual signs and may additionally serve as an annotation tool.

In this work, we propose a model for the recognition of facial expressions in Libras based on Convolutional Neural Networks (CNN).
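Although the architecture is not detailed at this point in the paper, the sketch below illustrates the general shape of a CNN-based, multi-label AU classifier of the kind described above (PyTorch). The input resolution, channel widths, and number of AU classes are assumptions for illustration, not the authors' exact model.

    # Minimal sketch of a CNN for multi-label AU classification, assuming
    # 96x96 grayscale face crops and a hypothetical set of 12 target AUs.
    import torch
    import torch.nn as nn

    NUM_AUS = 12  # hypothetical number of AU classes

    class AUClassifier(nn.Module):
        def __init__(self, num_aus: int = NUM_AUS):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
                nn.MaxPool2d(2),                                # 96 -> 48
                nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
                nn.MaxPool2d(2),                                # 48 -> 24
                nn.Conv2d(64, 128, kernel_size=3, padding=1), nn.ReLU(),
                nn.MaxPool2d(2),                                # 24 -> 12
            )
            self.classifier = nn.Sequential(
                nn.Flatten(),
                nn.Linear(128 * 12 * 12, 256), nn.ReLU(),
                nn.Linear(256, num_aus),                        # one logit per AU
            )

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return self.classifier(self.features(x))

    # AUs can co-occur, so the task is multi-label: an independent sigmoid
    # per AU with binary cross-entropy, rather than a single softmax.
    model = AUClassifier()
    logits = model(torch.randn(8, 1, 96, 96))                   # batch of 8 face crops
    targets = torch.randint(0, 2, (8, NUM_AUS)).float()
    loss = nn.BCEWithLogitsLoss()(logits, targets)

The multi-label formulation matters because a single facial expression typically activates several AUs at once; treating AUs as mutually exclusive classes would misrepresent the coding system.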
Aiming to standardize Libras' facial expressions and to situate our approach within the study of facial expressions in general, we use the Facial Action Coding System (FACS). Ekman and Friesen created FACS in the seventies to code changes in facial appearance, called Action Units (AUs), which are associated with the movements of facial muscles [7]. The Libras facial expression recognition system presented here constitutes the early results of our ongoing research. We summarize our contributions as follows: (1) a proposal of a taxonomy of facial expressions in Libras associated with FACS; (2) the construction of a Libras corpus; (3) the proposal and evaluation of a model for the recognition of facial expressions in Libras.

Libras Facial Expression and FACS: Facial expression parameters are essential in sign language since they carry emotional, grammatical, lexical, and prosodic information. In [16], a taxonomy of Libras facial expressions is introduced in which facial expressions and head poses are described and listed according to their function in the discourse. Generally, the head pose supports the semantic functions of Libras: questions, affirmations, denials, relative and conditional clauses, topics, and focus are communicated, e.g., with the help of the signer's head pose. Facial expression not only reflects a person's affect and emotions but also constitutes a large part of the grammar of Libras. For example, a change of mouth configuration combined with the lifting of the eyebrows corresponds to an intensification of the sign. Some signs are defined by facial expression alone, while others remain ambiguous unless additional facial expression information is available. For instance, the signs "what" and "who" are completely identical with respect to gesturing and can only be differentiated by eyebrow movement and lip patterns (see Figure 1(A)-(E)). As we can observe in Figure 1(F), the Libras facial expression taxonomy describes the exact expression. However, different terminologies have been used to document facial expressions in Libras, and this large and unstandardized set of labels makes it hard to generalize into a system. We therefore adopted a more common way of describing facial expressions, one that originated in the field of psychology and became popular
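To make the idea of AU-based description concrete, the sketch below shows one way such a taxonomy could be encoded as AU combinations. The AU numbers and names are standard FACS; the specific Libras mappings are illustrative assumptions, not the taxonomy proposed in [16].

    # Illustrative encoding of a facial-expression taxonomy as FACS
    # Action Unit (AU) combinations. AU names are standard FACS; the
    # Libras mappings below are hypothetical examples only.
    AU_NAMES = {
        1: "Inner Brow Raiser",
        2: "Outer Brow Raiser",
        4: "Brow Lowerer",
        5: "Upper Lid Raiser",
        12: "Lip Corner Puller",
        26: "Jaw Drop",
    }

    # Hypothetical grammatical expressions mapped to AU combinations.
    LIBRAS_TAXONOMY = {
        "yes_no_question": {1, 2, 5},   # e.g., raised brows, widened eyes
        "wh_question": {4},             # e.g., furrowed brows
        "affirmation": {12},
    }

    def describe(expression: str) -> str:
        """Render an expression label as its AU codes and names."""
        aus = sorted(LIBRAS_TAXONOMY[expression])
        return ", ".join(f"AU{n} ({AU_NAMES[n]})" for n in aus)

    print(describe("yes_no_question"))
    # -> AU1 (Inner Brow Raiser), AU2 (Outer Brow Raiser), AU5 (Upper Lid Raiser)

Encoding expressions as AU sets in this way gives a single, standardized label space, which is what allows disparate Libras terminologies to be compared and fed to a recognition model.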