CLUSTER CRITERION FUNCTIONS IN SPECTRAL SUBSPACE AND
THEIR APPLICATION IN SPEAKER CLUSTERING
Trung Hieu Nguyen, Haizhou Li
Institute for Infocomm Research,
Department of Human Language Technology,
1 Fusionopolis Way, #21-01 Connexis,
South Tower, Singapore 138632
Eng Siong Chng
Nanyang Technological University,
School of Computer Engineering,
Block N4, Nanyang Avenue,
Singapore 639798
ABSTRACT
In this paper, we propose two cluster criterion functions
that aim to maximize the separation between intra-cluster
distances and inter-cluster distances. The desired number of
clusters can be deduced automatically from the extremized
values of these criteria. We then propose an algorithm that
applies our criterion functions in conjunction with spectral
clustering. By exploiting the characteristics of the spectral
subspace, we show that the speakers are more separable in this
subspace, which further enhances the effectiveness of the
proposed criteria. The algorithm is used in our agglomerative
hierarchical speaker diarization system, tested on the Rich
Transcription 2007 conference data set, and obtains very good
results.
Index Terms— speaker diarization, criterion function,
spectral clustering
1. INTRODUCTION
Clustering is the procedure of grouping data points into clusters
such that the data points in the same cluster possess strong
internal similarities. Generally, there are two major issues
in clustering: determining the number of clusters (cluster
validity) and finding the optimal partitioning (cluster criteria).
Thus far, these two issues have been handled separately with
different criteria; e.g., the sum-of-squared-error criterion
cannot be used for cluster validity because it decreases
monotonically as the number of clusters increases. In this paper,
we propose two cluster criterion functions which, when extremized,
can solve both issues concurrently. These functions have a simple
interpretation: they aim to maximize the separation between
intra-cluster distances and inter-cluster distances.
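The monotonic behaviour of the sum-of-squared-error criterion can be seen on a toy example. The sketch below uses hypothetical 1-D data and hand-picked nested partitions (each obtained by splitting one cluster of the previous partition); the function name `sse` and the data are illustrative assumptions, not taken from the paper.

```python
def sse(clusters):
    """Sum of squared errors of 1-D clusters about their own means."""
    total = 0.0
    for cluster in clusters:
        mean = sum(cluster) / len(cluster)
        total += sum((x - mean) ** 2 for x in cluster)
    return total

# Hypothetical data with two obvious groups (around 1 and around 5).
# Each partition splits one cluster of the previous one.
partitions = [
    [[1.0, 1.2, 0.8, 5.0, 5.1, 4.9]],        # c = 1
    [[1.0, 1.2, 0.8], [5.0, 5.1, 4.9]],      # c = 2 (the natural grouping)
    [[1.0, 1.2], [0.8], [5.0, 5.1, 4.9]],    # c = 3
    [[1.0, 1.2], [0.8], [5.0, 5.1], [4.9]],  # c = 4
]

sse_values = [sse(p) for p in partitions]
for c, value in enumerate(sse_values, start=1):
    print(f"c = {c}: SSE = {value:.4f}")
```

The error keeps shrinking past the natural grouping at c = 2, so its minimum alone does not indicate how many clusters are present.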
Recently, spectral clustering methods have received much attention
because of their ability to handle many difficult clustering
problems. However, little has been investigated for speaker
clustering within this framework. In this paper, we introduce an
algorithm that uses our proposed criterion functions in the
spectral subspace and provide a mathematical analysis of this
algorithm in the ideal case. Furthermore, we show experimentally
that the speakers are more separable in the spectral subspace,
which is a desirable property for clustering. We then demonstrate
the use of this algorithm in our agglomerative hierarchical
speaker diarization system to estimate the number of speakers.
This approach has an advantage over those that use thresholds
derived from a development set to determine the number of
speakers [1, 2, 3], because it does not suffer from mismatch
between development data and test data. Ajmera [4] proposed a
system using a modified version of BIC. This system performs well
in terms of achieving a low diarization error rate (DER) and not
requiring development data; however, it usually generates many
small clusters (which do not have much impact on DER) and thus
gives the wrong number of speakers.
The paper is organized as follows: we first introduce the two
criterion functions in Section 2; in Section 3, we apply these
functions in the spectral subspace and provide a detailed
analysis. Finally, we report experimental results on speaker
clustering using the proposed algorithm in Section 4.
2. CLUSTERING CRITERION FUNCTIONS
Suppose we are given a set of points $S = \{s_1, s_2, \ldots, s_n\}$
of $n$ samples that we want to partition into $c$ disjoint subsets
$S_1, \ldots, S_c$. Let $d(s_i, s_j)$ be the similarity function
between two points $s_i$ and $s_j$. Define:

$$D_{intra} = \{\, d(s_i, s_j) \mid \forall i, j \;\; \exists k : s_i \in S_k,\; s_j \in S_k \,\}$$

$$D_{inter} = \{\, d(s_i, s_j) \mid \forall i, j \;\; \exists k \neq l : s_i \in S_k,\; s_j \in S_l \,\}$$
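As an illustration of these two sets, the sketch below splits the pairwise values of d into a within-cluster list and a between-cluster list. It rests on several assumptions not stated in the text: the partition is given as one integer label per point, d is taken to be the Euclidean distance, and each unordered pair with i < j is counted once. The names `euclidean` and `split_pairwise_values` are hypothetical.

```python
import math
from itertools import combinations

def euclidean(a, b):
    """Illustrative choice of d; the text does not fix a particular d."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def split_pairwise_values(points, labels, d=euclidean):
    """Return (d_intra, d_inter) for a partition given by per-point labels.

    Assumption: each unordered pair (i < j) is counted once.
    """
    d_intra, d_inter = [], []
    for i, j in combinations(range(len(points)), 2):
        value = d(points[i], points[j])
        if labels[i] == labels[j]:
            d_intra.append(value)   # both points lie in the same subset S_k
        else:
            d_inter.append(value)   # points lie in different subsets S_k, S_l
    return d_intra, d_inter

# Hypothetical data: two tight groups far apart.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
labels = [0, 0, 1, 1]
d_intra, d_inter = split_pairwise_values(points, labels)
print(len(d_intra), len(d_inter))
```

For a well-separated partition such as this one, every value in `d_intra` is smaller than every value in `d_inter`, which is the kind of separation the criterion functions are meant to reward.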
We propose two criterion functions to measure the quality of
partitioning.
2.1. $T_s$ criterion
Let $m_1, \sigma_1, n_1$ and $m_2, \sigma_2, n_2$ be the mean, standard
deviation, and size of $D_{intra}$ and $D_{inter}$, respectively.
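These six quantities are plain summary statistics of the two sets. A minimal sketch follows, assuming the sets are available as Python lists and that the population standard deviation is intended (the text does not say whether the population or sample form is used); the helper `summarize` and the numbers are hypothetical.

```python
from statistics import mean, pstdev

def summarize(values):
    """Return (mean, standard deviation, size) of a list of values.

    Assumption: population standard deviation (pstdev); swap in
    statistics.stdev if the sample form is wanted.
    """
    return mean(values), pstdev(values), len(values)

# Hypothetical within-cluster and between-cluster values.
d_intra = [1.0, 1.0, 1.5]
d_inter = [6.0, 7.0, 6.5, 7.5]

m1, sigma1, n1 = summarize(d_intra)
m2, sigma2, n2 = summarize(d_inter)
print(m1, sigma1, n1)
print(m2, sigma2, n2)
```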
4085 978-1-4244-2354-5/09/$25.00 ©2009 IEEE ICASSP 2009