How Useful are Semantic Links for the Detection of Implicit References in CSCL Chats?

Traian Rebedea, Costin-Gabriel Chiru, Gabriel-Marius Gutu
Faculty of Automatic Control and Computers
University Politehnica of Bucharest
Bucharest, Romania
Email: {traian.rebedea, costin.chiru}@cs.pub.ro, gabi.gutu@gmail.com

Abstract: Chat conversations are used for a wide range of Computer-Supported Collaborative Learning (CSCL) tasks, especially because they allow multiple conversation threads to run in parallel. Thus, several different topics can be debated at the same time, fostering the exploitation of different ideas and facilitating collaborative knowledge creation. To detect these threads, our method first detects the links that arise between the utterances of a conversation. From a computational linguistics perspective, there is a wide variety of types of links between utterances, and there is no mechanism to compute all of them. This paper examines to what degree semantic similarity measures from Natural Language Processing (NLP) can be used to detect the links that arise between utterances in CSCL chat conversations, and how effective this technique is when applied on its own to identify implicit links.

Keywords: Chat Conversations, CSCL, Natural Language Processing, Semantic Relatedness, Latent Semantic Analysis, Implicit Links

I. INTRODUCTION

Chat conversations are used in a wide range of Computer-Supported Collaborative Learning (CSCL) activities, especially for debating and solving difficult problems [1]. One of the supposed reasons for the successful integration of chats in many CSCL tasks is that they allow the existence of parallel discussion threads that inter-animate throughout the discussion [2]. Discourse analysis does not provide a theory suitable for processing multi-party conversation chats.
However, some new theories propose the use of conversation or coherence graphs for chat analysis [1, 2]. These theories rest on the existence of a multitude of links, explicit or implicit, between utterances that might explain the evolution of the discussion threads. These links have been primarily connected with the notion of outer voices or echoes introduced by Bakhtin’s dialogic theory [3]. This model also defined the notions of heteroglossia, inter-animation and polyphony in discourse [3, 4], and it has been proposed by some researchers as a new theory of learning to be used for any CSCL task [5]. This learning theory is applied mainly to text-based collaborative learning situations where utterances can be associated with voices. Thus, the study of the “participants' voices (and the voices within their voices)” [5] acknowledges that each utterance carries the inner, specific voice of the participant who uttered it, but also complex echoes from previous voices. Determining and analyzing this linkage between voices would provide a powerful method for analyzing learning and knowledge building both at the individual level and at the group level (e.g., social influence or collaborative knowledge construction). Furthermore, studies have shown a connection between dialogism used for learning and thinking skills: the quality of individual thinking can be improved by improving the quality of dialogue (online and offline), and “individual thinking skills originate in conversations, where we learn to reason, to evaluate, to join in creative play and to provide relevant information” [6]. However, the main difficulty is to determine the quality of a conversation, especially in online multi-party discussions.
We have proposed that the degree of inter-animation in a conversation can be used to assess its quality [2], especially because inter-animation assumes that meaning arises not from a single utterance, but rather from the interaction between utterances. This interaction between utterances is an important aspect of collaborative learning. Thus, the focus is not on the individual participant or utterance, but on the interplay that appears between different utterances and between different participants. Inter-animation and polyphony have previously been proposed for assessing the quality of problem-solving tasks using chat conversations [2, 7] or for detecting pivotal moments in online discussions by identifying the changes in the degree of inter-animation throughout a discussion [8]. Moreover, inter-animation has also been linked to meaning making [9] and knowledge building [10] activities.

The inter-animation of voices in a conversation may be represented through the links between the utterances, either explicit or implicit. In many conversation environments, such as online discussion forums and special chat systems developed for CSCL, the participants are able to explicitly highlight one or more previous utterances to which the current one is responding. However, many links to antecedent replies remain implicit, either because the participants do not always use the explicit referencing feature, or because the current utterance is linked to many previous ones and it is