Detection of Peculiar Word Sense by Distance Metric Learning with Labeled Examples

Minoru Sasaki, Hiroyuki Shinnou
Department of Computer and Information Sciences, Faculty of Engineering, Ibaraki University
4-12-1, Nakanarusawa, Hitachi, Ibaraki, 316-8511, Japan
msasaki@mx.ibaraki.ac.jp, shinnou@mx.ibaraki.ac.jp

Abstract
For natural language processing on machines, resolving peculiar usages of words would be particularly useful in constructing a dictionary and dataset for word sense disambiguation. Hence, it is necessary to develop a method to detect such peculiar examples of a target word from a corpus. Note that, hereinafter, we define a peculiar example as an instance in which the target word or phrase has a new meaning. In this paper, we propose a new peculiar example detection method using distance metric learning from labeled example pairs. In this method, first, distance metric learning is performed by large margin nearest neighbor classification for the training data, and new training data points are generated using the distance metric in the original space. Then, peculiar examples are extracted using the local outlier factor, which is a density-based outlier detection method, from the updated training and test data. The effectiveness of the proposed method was evaluated on an artificial dataset and the Semeval-2010 Japanese WSD task dataset. The results showed that the proposed method achieved the highest number of properly detected instances and the highest F-measure value. This shows that the label information of training data is effective for density-based peculiar example detection. Moreover, an experiment on outlier detection using a classification method such as SVM showed that it is difficult to apply the classification method to outlier detection.

Keywords: Peculiar Word Sense Detection, Semi-Supervised Outlier Detection, Distance Metric Learning

1.
Introduction

In everyday life, we often encounter examples of words used in an unknown sense, including those that may not even be listed in the dictionary. For natural language processing on machines, resolving such peculiar usages would be particularly useful in constructing a dictionary and dataset for word sense disambiguation (WSD). Hence, it is necessary to develop a method to detect such peculiar examples of a target word from a corpus. Note that, hereinafter, we define a peculiar example as an instance in which the target word or phrase has a new meaning.

As one approach, we consider the outlier detection methods used in data mining. However, although outlier detection is used to detect anomalous observations from data, it is generally unsupervised and so is unable to incorporate sense information for the detection of peculiar examples.

To solve this problem, here we propose a new approach using distance metric learning from labeled example pairs. First, distance metric learning is performed by large margin nearest neighbor (LMNN) classification (Weinberger and Saul, 2009) for the training data, and new training data points are generated using the distance metric in the original space. Then, peculiar examples are extracted using the local outlier factor (LOF) (Breunig et al., 2000), which is a density-based outlier detection method, from the updated training and test data.

In this paper, we present the results of experimental evaluations of the proposed method using an artificial dataset and the Semeval-2010 Japanese WSD task dataset (Okumura et al., 2010). The proposed method proved to be effective on both datasets in comparison with the LOF and the one-class support vector machine (SVM) (Schölkopf et al., 2001). Moreover, we present the results of an experiment on outlier detection using a classification method such as an SVM. The results show that it is difficult to apply the classification method for outlier detection.

2.
Outlier Detection

Many methods have been proposed for detecting outlier instances, such as distance-based methods (Orair et al., 2010), probabilistic methods (Kriegel et al., 2009), and density-based methods. Here we briefly explain the LOF algorithm, a local density-based method for outlier detection, and the one-class SVM (Cortes and Vapnik, 1995), an unsupervised segmentation method based on machine learning.

2.1. LOF

LOF is a well-known outlier detection method for unlabeled data sets. This method specifies the degree of outlierness, determined from the difference in density between a data object and its neighborhood. Outliers are objects that have high LOF values; conversely, objects that have low LOF values are likely to be normal with respect to their neighborhood.

The first step in computing the LOF value of data object $x$ is to compute its k-distance($x$), where $k$ is an arbitrary positive integer. The k-distance($x$) of object $x$ in a dataset $D$ is defined as the distance $d(x, y)$ between $x$ and an object $y \in D$ such that:

1. $d(x, y') \le d(x, y)$ for at least $k$ objects $y' \in D \setminus \{x\}$,
2. $d(x, y') < d(x, y)$ for at most $k - 1$ objects $y' \in D \setminus \{x\}$.

In other words, the k-distance($x$) represents the distance between the object $x$ and the $k$-th nearest object from $x$.
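
The k-distance definition can be illustrated with a minimal sketch. It assumes Euclidean distance and NumPy, neither of which is specified in the text; the function name `k_distance` and the sample points are hypothetical and chosen only for illustration.

```python
import numpy as np


def k_distance(data, index, k):
    """Distance from data[index] to its k-th nearest object in data.

    Assumption: d(x, y) is the Euclidean distance (illustrative choice).
    """
    x = data[index]
    # Distances from x to every object in D \ {x}.
    others = np.delete(data, index, axis=0)
    dists = np.linalg.norm(others - x, axis=1)
    if not 1 <= k <= len(dists):
        raise ValueError("k must be between 1 and |D| - 1")
    # The k-th smallest distance satisfies both conditions:
    # at least k objects are no farther, at most k - 1 are strictly closer.
    return float(np.sort(dists)[k - 1])


# Hypothetical toy dataset: three nearby points and one distant point.
points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [5.0, 5.0]])
print(k_distance(points, 0, 2))  # distance to the 2nd nearest object of points[0]
```

Sorting all distances is the simplest way to satisfy both conditions of the definition at once; for large datasets a nearest-neighbor index would be the more usual choice.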