Histogram Based Method for Unsupervised
Meeting Speech Summarization
Nouha Dammak 1,2(✉) and Yassine BenAyed 1
1 Multimedia InfoRmation Systems and Advanced Computing Laboratory (MIRACL),
3021 Sfax, Tunisia
nouha.damak@gmail.com, yassine.benayed@isims.usf.tn
2 Higher Institute of Computer Sciences and Communication Techniques, University of Sousse,
Sousse, Tunisia
Abstract. The emergence of platforms such as YouTube, Dailymotion and
Google Video has played a major role in increasing the number of videos
available on the Internet. For example, more than 15,000 video sequences are
viewed every day on Dailymotion. Consequently, the huge amount of gathered data
poses a major scientific challenge for managing the underlying knowledge. In
particular, data summarization aims to extract concise abstracts from different
types of documents. In this paper, we are interested in summarizing meeting data.
As the quality of video analysis output depends heavily on the type of data, we
propose our own framework for this purpose. The main goal of our study is to use
textual data extracted from Automatic Speech Recognition (ASR) transcriptions
of the AMI corpus to produce a fully unsupervised summary of meeting
sequences. Our contribution, called Weighted Histogram for ASR Transcriptions
(WHASRT), adopts an extractive, annotation-free and dictionary-based approach.
An exhaustive comparative study demonstrates that our method achieves results
competitive with ranking-based methods. The experimental results also show
improved performance over existing clustering-based methods.
Keywords: Summarization · Unsupervised · Transcription · Automatic Speech
Recognition · Meetings · Natural Language Processing
1 Introduction
Most people spend a lot of their time in meetings. Once a meeting ends, it becomes
very important to produce reports citing the main issues discussed, such as the
problems encountered and the decisions made. It is now possible to record and store
a meeting in either audio or video format. Several existing tools, grouped under the
term «Speech-to-text», generate text transcriptions of what was said during the
meeting. An important issue is then to automatically extract, from these often very
noisy textual transcriptions, topics and summaries that lead to the creation of
meeting reports. In this paper, we present a fully unsupervised extractive text
summarization system and we check its effects on
© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2021
A. Abraham et al. (Eds.): ISDA 2019, AISC 1181, pp. 396–405, 2021.
https://doi.org/10.1007/978-3-030-49342-4_38