CLUSTER CRITERION FUNCTIONS IN SPECTRAL SUBSPACE AND THEIR APPLICATION IN SPEAKER CLUSTERING

Trung Hieu Nguyen, Haizhou Li
Institute for Infocomm Research, Department of Human Language Technology, 1 Fusionopolis Way, #21-01 Connexis, South Tower, Singapore 138632

Eng Siong Chng
Nanyang Technological University, School of Computer Engineering, Block N4, Nanyang Avenue, Singapore 639798

ABSTRACT

In this paper, we propose two cluster criterion functions which aim to maximize the separation between intra-cluster distances and inter-cluster distances. These criteria can automatically deduce the desired number of clusters based on their extremized values. We then propose an algorithm to apply our criterion functions in conjunction with spectral clustering. By exploiting the characteristics of the spectral subspace, we show that the speakers are more separable in this subspace, which further enhances the effectiveness of our proposed criteria. The algorithm is used in our agglomerative hierarchical speaker diarization system, tested on the Rich Transcription 2007 conference data set, and obtains very good results.

Index Terms: speaker diarization, criterion function, spectral clustering

1. INTRODUCTION

Clustering is the procedure of grouping data points into clusters such that the data points in the same cluster possess strong internal similarities. Generally, there are two major issues in clustering: determining the number of clusters (cluster validity) and finding the optimal partitioning (cluster criteria). Thus far, these two issues have been handled separately with different criteria; e.g., the sum-of-squared-error criterion cannot be used for cluster validity because it decreases monotonically as the number of clusters increases. In this paper, we propose two cluster criterion functions which, when extremized, could solve both issues concurrently. These functions have a simple interpretation: they aim to maximize the separation between intra-cluster distances and inter-cluster distances.
Recently, spectral clustering methods have received much attention because of their ability to handle many difficult clustering problems. However, little has been investigated for speaker clustering within this framework. In this paper, we introduce an algorithm using our proposed criterion functions in the spectral subspace and provide a mathematical analysis of this algorithm in the ideal case. Furthermore, we also show experimentally that the speakers are more separable in the spectral subspace, which is a desirable property for clustering. We then demonstrate the use of this algorithm in our agglomerative hierarchical speaker diarization system to estimate the number of speakers. This approach has an advantage over those using thresholds derived from a development set to determine the number of speakers [1, 2, 3] because it does not suffer from mismatch between development data and test data. Ajmera [4] proposed a system using a modified version of BIC. This system performs well in terms of having a low diarization error rate (DER) and not requiring development data; however, it usually generates many small clusters (which do not have much impact on DER) and thus provides the wrong number of speakers.

The paper is organized as follows: first, we introduce two criterion functions in section 2; then, in section 3, we apply these functions in the spectral subspace and provide a detailed analysis. Finally, we report some experimental results on speaker clustering using the proposed algorithm in section 4.

2. CLUSTERING CRITERION FUNCTIONS

Given a set of points $S = \{s_1, s_2, \ldots, s_n\}$ of $n$ samples that we want to partition into $c$ disjoint subsets $S_1, \ldots, S_c$, let $d(s_i, s_j)$ be the similarity function between two points $s_i$ and $s_j$. Define:

$D_{intra} = \{ d(s_i, s_j) \mid \forall i, j, k : s_i \in S_k, s_j \in S_k \}$

$D_{inter} = \{ d(s_i, s_j) \mid \forall i, j, k \neq l : s_i \in S_k, s_j \in S_l \}$

We propose two criterion functions to measure the quality of partitioning.

2.1. $T_s$ criterion

Let $m_1, \sigma_1, n_1$ and $m_2, \sigma_2, n_2$ be respectively the mean, standard deviation, and size of $D_{intra}$ and $D_{inter}$.

4085 978-1-4244-2354-5/09/$25.00 ©2009 IEEE ICASSP 2009
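As an illustration of the quantities defined in section 2, the following minimal sketch builds the intra-cluster and inter-cluster sets from a labeled partition and computes their mean, standard deviation, and size. Several choices here are assumptions rather than part of the text: the function names `split_pairwise_distances` and `summarize` are hypothetical, d is taken to be the Euclidean distance, each unordered pair is counted once with self-pairs excluded, and the population standard deviation is used.

```python
from itertools import combinations
from math import dist
from statistics import fmean, pstdev


def split_pairwise_distances(points, labels, d=dist):
    """Return (D_intra, D_inter) as lists of pairwise values of d.

    Assumption: each unordered pair (i < j) is counted once and
    self-pairs are excluded; d defaults to Euclidean distance.
    """
    d_intra, d_inter = [], []
    for (p, lp), (q, lq) in combinations(zip(points, labels), 2):
        # Same label: the pair lies inside one cluster S_k.
        # Different labels: the pair spans clusters S_k and S_l, k != l.
        (d_intra if lp == lq else d_inter).append(d(p, q))
    return d_intra, d_inter


def summarize(values):
    """Return (mean, population standard deviation, size) of values."""
    return fmean(values), pstdev(values), len(values)


# Toy data: two well-separated clusters of two points each.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0)]
labels = [0, 0, 1, 1]

d_intra, d_inter = split_pairwise_distances(points, labels)
m1, sigma1, n1 = summarize(d_intra)  # statistics of D_intra
m2, sigma2, n2 = summarize(d_inter)  # statistics of D_inter
```

For a partition that matches the true grouping, as in this toy data, the inter-cluster mean m2 is expected to be well above the intra-cluster mean m1, which is the separation the proposed criteria aim to maximize.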