General-Purpose Communicative Function Recognition using a Hierarchical Network with Cascading Outputs and Maximum a Posteriori Path Estimation

Eugénio Ribeiro 1,2, Ricardo Ribeiro 1,3 and David Martins de Matos 1,2
1 INESC-ID Lisboa, Portugal
2 Instituto Superior Técnico, Universidade de Lisboa, Portugal
3 Instituto Universitário de Lisboa (ISCTE-IUL), Portugal
eugenio.ribeiro@inesc-id.pt

Abstract
ISO 24617-2, the standard for dialog act annotation, defines a hierarchically organized set of general-purpose communicative functions. The automatic recognition of these functions, although practically unexplored, is relevant for a dialog system, since they provide cues regarding the intention behind the segments and how they should be interpreted. In this paper, we explore the recognition of general-purpose communicative functions in the DialogBank, which is a reference set of dialogs annotated according to the standard. To do so, we adapt a state-of-the-art approach to flat dialog act recognition to deal with the hierarchical classification problem. More specifically, we propose the use of a hierarchical network with cascading outputs and maximum a posteriori path estimation to predict the communicative function at each level of the hierarchy, preserve the dependencies between the functions in the path, and decide at which level to stop. Furthermore, since the number of dialogs in the DialogBank is small, we rely both on additional dialogs annotated using mapping processes and on transfer learning to improve performance. The results of our experiments show that the hierarchical approach outperforms a flat one and that maximum a posteriori estimation outperforms an iterative prediction approach based on masking.
1 Introduction
From the perspective of a dialog system, it is important to identify the intention behind the segments in a dialog, since it provides an important cue regarding the information that is present in the segments and how they should be interpreted. According to Searle [1969], that intention is revealed by dialog acts, which are the minimal units of linguistic communication. Consequently, automatic dialog act recognition is an important task in the context of Natural Language Processing (NLP), which has been widely explored over the years. In an attempt to set the ground for more comparable research in the area, a standard for dialog act annotation, ISO 24617-2, was developed [Bunt et al., 2012; Bunt et al., 2017]. However, annotating dialogs according to this standard is a laborious process, especially since the annotation does not consist of a single dialog act label, which in the standard nomenclature is called a communicative function, but rather of a complex structure which includes information regarding the semantic dimension of the dialog act and relations with other segments, among others. Consequently, the amount of data annotated according to the standard is still small and the automatic recognition of its communicative functions is practically unexplored.

We explore the automatic recognition of communicative functions in the English dialogs available in the DialogBank [Bunt et al., 2016; Bunt et al., 2019], which is the only publicly available source of dialogs fully annotated according to the standard. We focus on general-purpose communicative functions, since, unlike the dialog act labels of the corpora most widely explored in dialog act recognition research, they pose a hierarchical classification problem, with paths that may not end on a leaf communicative function. Furthermore, we focus on the Task dimension, since it is the one in which general-purpose communicative functions are predominant.
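To make the hierarchical problem concrete, the following minimal sketch enumerates label paths that may stop at any level and selects the most probable one as the product of per-level probabilities. It is an illustration under stated assumptions, not the implementation used in this paper: the hierarchy is only a small illustrative fragment of the general-purpose functions, and the names CHILDREN, STOP, DEPTH, valid_paths, and map_path, as well as the probabilities, are hypothetical.

```python
from math import prod

# Illustrative fragment of the hierarchy (assumption: not the full taxonomy).
CHILDREN = {
    "ROOT": ["Question", "Inform"],
    "Question": ["Propositional Question", "Set Question", "Choice Question"],
    "Propositional Question": ["Check Question"],
    "Inform": ["Answer", "Agreement", "Disagreement"],
    "Answer": ["Confirm", "Disconfirm"],
    "Disagreement": ["Correction"],
}

STOP = "<none>"  # hypothetical label meaning "the path ended at a higher level"
DEPTH = 3        # number of levels in this fragment


def valid_paths(node="ROOT", prefix=()):
    """Yield every path from the root that stops at any node, leaf or not."""
    for child in CHILDREN.get(node, []):
        path = prefix + (child,)
        yield path
        yield from valid_paths(child, path)


def map_path(level_probs):
    """Return the valid path with the highest product of per-level probabilities.

    level_probs[i] maps each label of level i (plus STOP) to its probability.
    Paths shorter than DEPTH are padded with STOP before scoring.
    """
    def score(path):
        padded = path + (STOP,) * (DEPTH - len(path))
        return prod(level_probs[i].get(label, 0.0) for i, label in enumerate(padded))

    return max(valid_paths(), key=score)


# Made-up per-level distributions for a single segment.
level_probs = [
    {"Question": 0.7, "Inform": 0.3},
    {"Propositional Question": 0.4, "Set Question": 0.35, "Answer": 0.15, STOP: 0.1},
    {"Check Question": 0.3, "Confirm": 0.1, STOP: 0.6},
]

best = map_path(level_probs)
print(" > ".join(best))
```

With these made-up distributions, the selected path ends at Propositional Question, a non-leaf function, because the third level favors stopping over the more specific Check Question. Scoring only paths that exist in the hierarchy also rules out inconsistent combinations such as Question followed by Answer.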
To approach the problem, we propose adaptations of a state-of-the-art approach to dialog act recognition that allow it to deal with the hierarchical problem posed by the general-purpose communicative functions of the standard. The adaptations focus on the ability to predict communicative functions at the multiple levels of the hierarchy, identify when the available information is not enough to predict more specific functions, and preserve the dependencies between the functions in the path. Furthermore, given the small number of dialogs in the DialogBank, we explore the use of additional dialogs annotated using mapping processes, as well as of transfer learning processes to improve performance.

In the remainder of the paper, we start by providing an overview of the standard and of dialog act recognition approaches in Section 2. Then, in Section 3, we describe our approach for predicting the general-purpose communicative functions of the standard. Section 4 describes our experimental setup, including the datasets, evaluation methodology, and implementation details. Finally, Section 5 presents and discusses the results of our experiments and Section 6 summarizes the contributions and provides pointers for future work.

arXiv:2003.03556v1 [cs.CL] 7 Mar 2020